Tree Data Structures: Binary Trees, BSTs, and Traversal Algorithms
Trees are everywhere in software: file systems, databases, compilers, and HTML itself. Understanding tree terminology, the binary search tree invariant, and the three fundamental traversal orders gives you a solid handle on a huge class of algorithms.
Published June 30, 2026Core Terminology
A tree is a connected, acyclic data structure made of nodes connected by edges. Every tree has exactly one root node at the top. Each node may have zero or more children; a node with no children is called a leaf. A node's parent is the node directly above it. The depth of a node is its distance from the root; the height of a tree is the longest path from root to any leaf. A subtree is any node together with all of its descendants.
Binary Trees and the BST Invariant
A binary tree restricts each node to at most two children, conventionally called left and right. A binary search tree (BST) adds one rule on top of that: for any node N, every value in N's left subtree is strictly less than N's value, and every value in N's right subtree is strictly greater. This invariant is what makes search efficient.
Here is a minimal BST node and two core operations in Python:
class BSTNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def bst_insert(root, value):
if root is None:
return BSTNode(value)
if value < root.value:
root.left = bst_insert(root.left, value)
elif value > root.value:
root.right = bst_insert(root.right, value)
return root # duplicate values are ignored here
def bst_search(root, target):
if root is None:
return False
if target == root.value:
return True
if target < root.value:
return bst_search(root.left, target)
return bst_search(root.right, target)
Both insert and search run in O(h) time where h is the height of the tree. In a balanced tree h = O(log n), giving you O(log n) operations. In a degenerate (completely unbalanced) tree h = O(n), degrading to linear search. This distinction is why balanced tree variants matter so much in practice. For a refresher on why O(log n) is so desirable, see our big-O notation guide.
The Three Traversal Orders
Traversal means visiting every node exactly once. The three standard depth-first orderings differ only in when you visit the root relative to its children.
1. Inorder (Left → Root → Right)
- Recursively traverse the left subtree.
- Visit the current node (e.g., print its value).
- Recursively traverse the right subtree.
def inorder(node):
if node is None:
return
inorder(node.left)
print(node.value)
inorder(node.right)
# For a BST, inorder produces values in sorted ascending order.
2. Preorder (Root → Left → Right)
- Visit the current node.
- Recursively traverse the left subtree.
- Recursively traverse the right subtree.
def preorder(node):
if node is None:
return
print(node.value)
preorder(node.left)
preorder(node.right)
# Useful for copying a tree or serialising it, since the root comes first.
3. Postorder (Left → Right → Root)
- Recursively traverse the left subtree.
- Recursively traverse the right subtree.
- Visit the current node.
def postorder(node):
if node is None:
return
postorder(node.left)
postorder(node.right)
print(node.value)
# Useful for deletion (children before parent) and evaluating expression trees.
All three traversals use recursion, which relies on the call stack to track the path back up the tree. For very deep trees you can replace the implicit call stack with an explicit one to avoid stack-overflow risks. Our article on recursion explained covers the mechanics in detail.
Balanced Trees: AVL and Red-Black
A plain BST offers no guarantee of balance. If you insert values in sorted order (1, 2, 3, 4, 5...) you get a tree that looks like a linked list, with O(n) search. Self-balancing trees fix this automatically.
An AVL tree maintains the invariant that for every node, the heights of the left and right subtrees differ by at most one. After each insertion or deletion the tree performs rotations to restore balance. This guarantees O(log n) height at all times, but the frequent rotations add overhead.
A Red-Black tree uses a looser colouring rule (each node is red or black, with specific constraints on adjacent colours) that allows slightly more imbalance but requires fewer rotations on average. Red-Black trees are the choice in most production implementations: Java's TreeMap, C++ STL's std::map, and Linux's process scheduler all use them.
Heaps: Trees Stored in Arrays
A heap is a complete binary tree (every level fully filled except possibly the last, filled left to right) that satisfies the heap property. In a max-heap, every parent is greater than or equal to its children, so the maximum value is always at the root. A min-heap inverts this.
The neat trick with heaps is that a complete binary tree maps perfectly onto an array. If a parent lives at index i, its children are at 2i+1 and 2i+2. No pointers needed. This makes heaps very cache-friendly compared to pointer-based trees and is exactly why heapsort runs in-place. Heaps power priority queues, which in turn power Dijkstra's shortest-path algorithm, event simulation systems, and operating-system schedulers. We compare heaps to other sorting tools in our common data structures overview.