Quick reference

A tree organizes values hierarchically.

A tree of animals and their classes. The root node is the Animal kingdom, and its children are the classes Reptile and Mammal. These classes are further split up into specific types such as Lizards, Snakes, Birds, etc.

Each entry in the tree is called a node, and every node links to zero or more child nodes.

If you flip the picture upside down, it kind of looks like a tree. That's where the name comes from!

Example uses:

  • Filesystems—files inside folders inside folders
  • Comments—comments, replies to comments, replies to replies
  • Family trees—parents, grandparents, children, and grandchildren

Leaves, Depth, and Height

Leaf nodes are nodes that're on the bottom of the tree (more formally: nodes that have no children).

Each node in a tree has a depth: the number of links from the root to the node.

A tree's height is the number of links from its root to the furthest leaf. (That's the same as the maximum node depth.)

A tree with a height of 4. Each depth is labeled, starting with the root at depth 0, then the next level at depth 1, and so on until the last level at depth 4.

Tree Traversals

Breadth First Search (BFS)

In a BFS, you first explore all the nodes one step away, then all the nodes two steps away, etc..

Breadth-first search is like throwing a stone in the center of a pond. The nodes you explore "ripple out" from the starting point.

Here's a sample tree, with the nodes labeled in the order they'd be visited in a BFS.

A tree with a root node labeled 1. The root's left child is labeled 2 and its right is labeled 3. The left child of 2 is labeled 4 and the right is labeled 5. The left child of 3 is labeled 6 and the right 7. The left child of 5 is labeled 8 and the right is labeled 9.

Depth First Search (DFS)

In a DFS, you go as deep as possible down one path before backing up and trying a different one.

Depth-first search is like walking through a corn maze. You explore one path, hit a dead end, and go back and try a different one.

Here's a how a DFS would traverse the same example tree:

A tree with a root node labeled 1. The root's left child is labeled 2 and its right is labeled 7. The left child of 2 is labeled 3 and the right is labeled 4. The left child of 4 is labeled 5 and the right is labeled 6. The left child of 7 is labeled 8 and the right is labeled 9.

Comparing BFS and DFS

  • A BFS will find the shortest path between the starting point and any other reachable node. A depth-first search will not necessarily find the shortest path.
  • Depth-first search on a binary tree generally requires less memory than breadth-first.
  • Depth-first search can be easily implemented with recursion.

You can also use BFS and DFS on graphs.

Pre Order Traversal

Visit the current node, then walk the left subtree, and finally walk the right subtree.

A pre-order traversal usually visits nodes in the same order as a DFS.

A tree with the same nodes as the depth first search tree. The root node is labeled 1, its left child labeled 2 and its right labeled 7. The left child of 2 is labeled 3 and the right is labeled 4. The left child of 4 is labeled 5 and the right is labeled 6. The left child of 7 is labeled 8 and the right is labeled 9.

In Order Traversal

Walk the left subtree first, then visit the current node, and finally walk the right subtree.

Of all three traversal methods, this one is probably the most common. When walking a binary search tree, an in order traversal visits the nodes in sorted, ascending order.

A tree with a root node labeled 6. The root's left child is labeled 2 and its right is labeled 8. The left child of 2 is labeled 1 and the right is labeled 4. The left child of 4 is labeled 3 and the right is labeled 5. The left child of 8 is labeled 7 and the right is labeled 9.

Post Order Traversal

Walk the left subtree, then the right subtree, and finally visit the current node.

This one's kind of rare ... but it shows up in some parsing algorithms, like Reverse Polish Notation.

A tree with a root node labeled 9. The root's left child is labeled 5 and its right is labeled 8. The left child of 5 is labeled 1 and the right is labeled 4. The left of 4 is labeled 2 and the right is labeled 3. The left child of 8 is labeled 6 and the right is labeled 7.

Binary Trees

A binary tree is a tree where every node has at most two children.

The letters of the alphabet (up to Q) represented in a non-binary tree on the left and a binary tree on the right. The binary tree has at most two children per node whereas the non-binary tree has up to 8.

Full binary trees

A full binary tree is a binary tree where every node has exactly 0 or 2 children.

A binary tree with 6 nodes, each with either 0 or 2 children.

Perfect binary trees

A perfect binary tree doesn't have room for any more nodes—unless we increase the tree's height.

A binary tree where every node except the leaves has two children, and all the leaf nodes are at the same depth.

Complete binary trees

A complete binary tree is like a perfect binary tree missing a few nodes in the last level. Nodes are filled in from left to right.

Complete trees are the basis for heaps and priority queues.

A binary tree where every level except the last is completely filled and each node has either 0 or 2 children.

Balanced binary trees

A balanced binary tree is a tree whose height is small relative to the number of nodes it has. By small, we usually mean the height is , where n is the number of nodes.

Conceptually, a balanced tree "looks full," without any missing chunks or branches that end much earlier than other branches.

There are few different definitions of balanced depending on the context. One of the most common definition is that a tree is balanced if: (a) the heights of its left and right subtrees differ by at most 1, and (b) both subtrees are also balanced.

A balanced binary tree on the left, and an unbalanced binary tree on the right. The balanced tree looks full, with the subtrees to the left and right of the root node both ending at the same depth. The unbalanced tree has only one node to the left of the root and 3 to the right, ending at depth 3.

Similar definitions can be used for trees that have more than two children. For instance, a full ternary tree (with up to three children per node) is a tree where every node has zero or three children.

Relationship between height and number of nodes

In perfect binary trees there's a cool mathematical relationship between the number of nodes and the height of the tree.

First, there's a pattern to how many nodes are on each level:

  1. Level 0: 2^0=1 nodes,
  2. Level 1: 2^1=2 nodes,
  3. Level 2: 2^2=4 nodes,
  4. Level 3: 2^3=8 nodes,
  5. etc

Let's call the total number of nodes in the tree n, and the height of the tree h.

We could solve for n by adding up the number of nodes on each level in the tree:

n = 2^0 + 2^1 + 2^2 + 2^3 + ... + 2^{h-1} = 2^{h} - 1

Solving for h in terms of n, we get:

n = 2^{h} - 1 n + 1 = 2^{h} \log_{2}{((n+1))} = \log_{2}{(2^{h})} \log_{2}{(n+1)} = h

That's the relationship between a perfect binary tree's height and the number of nodes it has.

This is the intuition behind our definition of balanced that we used above. A perfect tree is balanced, and in a perfect tree the height grows logarithmically with the number of nodes.

. . .