Array Implementation of Binary Trees

To avoid the cost of all the shifts in memory that we get from using Arrays, it is useful to implement Binary Trees with pointers from one element to the next, especially when the Binary Tree is modified often.

However, if a Binary Tree is read much more frequently than it is modified, an Array implementation can make sense. It requires less memory, is easier to implement, and can be faster for certain operations due to cache locality.

Cache Locality

Cache locality refers to how modern CPUs optimize memory access. When a memory location is accessed, nearby memory locations are often loaded into the CPU cache as well.

Because array elements are stored contiguously in memory, reading from arrays is often faster. When one element is accessed, the next elements are likely already cached and ready for use in the next CPU cycle.

How Binary Trees Are Stored in Arrays

Consider the following Binary Tree:


        R
      /   \
     A     B
    / \   / \
   C   D E   F
                \
                 G

This Binary Tree can be stored in an array by placing the root node R at index 0. For any node stored at index i:

Left child index = 2 * i + 1
Right child index = 2 * i + 2

Array Representation


binary_tree_array = [
  'R', 'A', 'B', 'C', 'D', 'E', 'F',
  None, None, None, None, None, None,
  'G'
]

Index Helper Functions


def left_child_index(index):
    return 2 * index + 1

def right_child_index(index):
    return 2 * index + 2

def get_data(index):
    if 0 <= index < len(binary_tree_array):
        return binary_tree_array[index]
    return None

Example Access


right_child = right_child_index(0)
left_child_of_right_child = left_child_index(right_child)
data = get_data(left_child_of_right_child)

print("root.right.left.data:", data)

This example shows how node relationships are determined purely through index calculations instead of pointers.

Why Empty Array Slots Are Needed

When a node does not have a child, its position in the array must still exist as None. This ensures that index calculations remain valid.

Because of this, array-based Binary Trees work best when the tree is perfect or nearly perfect.

Perfect Binary Tree

A perfect Binary Tree has:

Every internal node with exactly two children
All leaf nodes on the same level


        R
      /   \
     A     B
    / \   / \
   C   D E   F


binary_tree_array = ['R', 'A', 'B', 'C', 'D', 'E', 'F']

This representation avoids wasting space on empty array elements.

Depth-First Traversals Using Arrays

Even though the tree is stored in an array, tree traversals work the same way as pointer-based implementations — using recursion.

Traversal Code


binary_tree_array = [
    'R', 'A', 'B', 'C', 'D', 'E', 'F',
    None, None, None, None, None, None,
    'G'
]

def left_child_index(index):
    return 2 * index + 1

def right_child_index(index):
    return 2 * index + 2

def pre_order(index):
    if index >= len(binary_tree_array) or binary_tree_array[index] is None:
        return []
    return (
        [binary_tree_array[index]] +
        pre_order(left_child_index(index)) +
        pre_order(right_child_index(index))
    )

def in_order(index):
    if index >= len(binary_tree_array) or binary_tree_array[index] is None:
        return []
    return (
        in_order(left_child_index(index)) +
        [binary_tree_array[index]] +
        in_order(right_child_index(index))
    )

def post_order(index):
    if index >= len(binary_tree_array) or binary_tree_array[index] is None:
        return []
    return (
        post_order(left_child_index(index)) +
        post_order(right_child_index(index)) +
        [binary_tree_array[index]]
    )

print("Pre-order Traversal:", pre_order(0))
print("In-order Traversal:", in_order(0))
print("Post-order Traversal:", post_order(0))

The recursive logic is identical to pointer-based trees — the only difference is how child nodes are accessed.

Summary

Array-based Binary Trees eliminate pointers and use index math instead
They benefit from cache locality and reduced memory overhead
They work best for perfect or nearly perfect trees
Binary heaps are the most common real-world use case
Pointer-based trees are better for sparse or frequently modified trees

Search This Blog