Hash Table
A Hash Table is a data structure designed to be fast to work with.
Hash Tables are often preferred over arrays or linked lists because searching, adding, and deleting data can be done very quickly, even for large amounts of data.
Why Not Arrays or Linked Lists?
In a Linked List, finding a person like "Bob" means checking
each node one by one until Bob is found.
In an Array, finding an element is fast only if we know its index. If we only know the value (like a name), we must compare each element.
A Hash Table avoids this by letting us go directly to the correct location using a hash function.
Building a Hash Table from Scratch
We will build a simple Hash Set to store unique names.
Step 1: Start with an Array
my_array = ['Pete', 'Jones', 'Lisa', 'Bob', 'Siri']
Searching this array for "Bob" requires checking elements one by one.
Instead, we create a fixed-size array of buckets:
my_hash_set = [None, None, None, None, None,
None, None, None, None, None]
Step 2: Storing Names Using a Hash Function
A hash function converts a value into an index number.
def hash_function(value):
sum_of_chars = 0
for char in value:
sum_of_chars += ord(char)
return sum_of_chars % 10
Example for "Bob":
- B → 66
- o → 111
- b → 98
Total = 275 → 275 % 10 = 5
So "Bob" is stored at index 5.
my_hash_set = [None, None, None, None, None,
'Bob', None, None, None, None]
Step 3: Looking Up a Name
To check if "Pete" exists:
- Run the hash function on "Pete"
- Get index 8
- Check bucket 8 directly
def contains(name):
index = hash_function(name)
return my_hash_set[index] == name
Step 4: Handling Collisions
A collision happens when two values get the same hash code.
Example:
- "Lisa" → index 3
- "Stuart" → index 3
Solution: Chaining (store multiple values in the same bucket).
my_hash_set = [
[None],
['Jones'],
[None],
['Lisa', 'Stuart'],
[None],
['Bob'],
[None],
['Siri'],
['Pete'],
[None]
]
Step 5: Complete Hash Set Example
def add(value):
index = hash_function(value)
bucket = my_hash_set[index]
if value not in bucket:
bucket.append(value)
def contains(value):
index = hash_function(value)
return value in my_hash_set[index]
Uses of Hash Tables
- Checking if an item exists in a collection
- Storing unique values
- Mapping keys to values (e.g., name → phone number)
Hash Tables are fast:
- Arrays / Linked Lists → O(n)
- Hash Tables (average) → O(1)
Hash Set vs Hash Map
| Hash Set | Hash Map |
|---|---|
| Stores only unique keys | Stores key-value pairs |
| Checks if something exists | Finds data using a key |
Summary
- Data is stored in buckets
- A hash function decides the bucket
- Collisions are normal and manageable
- Hash Tables are extremely fast
Conclusion: Hash Tables allow fast storage, lookup, and deletion by using a hash function to jump directly to data.