Array Implementation of Binary Trees
To avoid the cost of all the shifts in memory that we get from using Arrays, it is useful to implement Binary Trees with pointers from one element to the next, especially when the Binary Tree is modified often.
However, if a Binary Tree is read much more frequently than it is modified, an Array implementation can make sense. It requires less memory, is easier to implement, and can be faster for certain operations due to cache locality.
Cache Locality
Cache locality refers to how modern CPUs optimize memory access. When a memory location is accessed, nearby memory locations are often loaded into the CPU cache as well.
Because array elements are stored contiguously in memory, reading from arrays is often faster. When one element is accessed, the next elements are likely already cached and ready for use in the next CPU cycle.
How Binary Trees Are Stored in Arrays
Consider the following Binary Tree:
R
/ \
A B
/ \ / \
C D E F
\
G
This Binary Tree can be stored in an array by placing the root node R
at index 0. For any node stored at index i:
- Left child index =
2 * i + 1 - Right child index =
2 * i + 2
Array Representation
binary_tree_array = [
'R', 'A', 'B', 'C', 'D', 'E', 'F',
None, None, None, None, None, None,
'G'
]
Index Helper Functions
def left_child_index(index):
return 2 * index + 1
def right_child_index(index):
return 2 * index + 2
def get_data(index):
if 0 <= index < len(binary_tree_array):
return binary_tree_array[index]
return None
Example Access
right_child = right_child_index(0)
left_child_of_right_child = left_child_index(right_child)
data = get_data(left_child_of_right_child)
print("root.right.left.data:", data)
This example shows how node relationships are determined purely through index calculations instead of pointers.
Why Empty Array Slots Are Needed
When a node does not have a child, its position in the array must still exist
as None. This ensures that index calculations remain valid.
Because of this, array-based Binary Trees work best when the tree is perfect or nearly perfect.
Perfect Binary Tree
A perfect Binary Tree has:
- Every internal node with exactly two children
- All leaf nodes on the same level
R
/ \
A B
/ \ / \
C D E F
binary_tree_array = ['R', 'A', 'B', 'C', 'D', 'E', 'F']
This representation avoids wasting space on empty array elements.
Depth-First Traversals Using Arrays
Even though the tree is stored in an array, tree traversals work the same way as pointer-based implementations — using recursion.
Traversal Code
binary_tree_array = [
'R', 'A', 'B', 'C', 'D', 'E', 'F',
None, None, None, None, None, None,
'G'
]
def left_child_index(index):
return 2 * index + 1
def right_child_index(index):
return 2 * index + 2
def pre_order(index):
if index >= len(binary_tree_array) or binary_tree_array[index] is None:
return []
return (
[binary_tree_array[index]] +
pre_order(left_child_index(index)) +
pre_order(right_child_index(index))
)
def in_order(index):
if index >= len(binary_tree_array) or binary_tree_array[index] is None:
return []
return (
in_order(left_child_index(index)) +
[binary_tree_array[index]] +
in_order(right_child_index(index))
)
def post_order(index):
if index >= len(binary_tree_array) or binary_tree_array[index] is None:
return []
return (
post_order(left_child_index(index)) +
post_order(right_child_index(index)) +
[binary_tree_array[index]]
)
print("Pre-order Traversal:", pre_order(0))
print("In-order Traversal:", in_order(0))
print("Post-order Traversal:", post_order(0))
The recursive logic is identical to pointer-based trees — the only difference is how child nodes are accessed.
Summary
- Array-based Binary Trees eliminate pointers and use index math instead
- They benefit from cache locality and reduced memory overhead
- They work best for perfect or nearly perfect trees
- Binary heaps are the most common real-world use case
- Pointer-based trees are better for sparse or frequently modified trees