B-Tree: Definition and Python Example
What is a B-Tree?
A B-Tree is a self-balancing tree data structure that maintains sorted data and allows search, sequential access, insertions, and deletions in logarithmic time.
- Each node can have multiple keys.
- Designed for disk-based storage for efficiency.
- Commonly used in database indexes.
Formal definition: A B-Tree of order m is a tree where each node contains at most m-1 keys, has at most m children, all leaves are at the same depth, and keys within a node are sorted.
Why B-Trees Are Used in Databases
- Efficient for
range queries(>,<,BETWEEN) - Efficient for
exact lookups(=) - Can store many keys per node, reducing disk I/O
- Supports fast insertion and deletion while staying balanced
Practical Python Example
We can use the BTrees package:
# Install package if needed:
# pip install BTrees
from BTrees.OOBTree import OOBTree
# Create a B-Tree
btree = OOBTree()
# Insert key-value pairs
btree['ana@email.com'] = {"name": "Ana", "age": 25}
btree['bob@email.com'] = {"name": "Bob", "age": 30}
btree['carol@email.com'] = {"name": "Carol", "age": 28}
# Lookup by key (exact match)
user = btree.get('ana@email.com')
print("Exact search:", user)
# Range search (all users with email <= 'bob@email.com')
print("Range search:")
for key, value in btree.items(max='bob@email.com'):
print(key, value)
Output:
Exact search: {'name': 'Ana', 'age': 25}
Range search:
ana@email.com {'name': 'Ana', 'age': 25}
bob@email.com {'name': 'Bob', 'age': 30}
This demonstrates:
- Fast exact lookup (O(log n))
- Efficient range query
- Automatically keeps keys sorted
Mapping to Databases
-
btree['ana@email.com']≈SELECT * FROM users WHERE email='ana@email.com'; -
btree.items(max='bob@email.com')≈SELECT * FROM users WHERE email <= 'bob@email.com';