Skip to main content

Binary Search Tree in DSA


In Data Structures and Algorithms (DSA), Trees

Trees are a fundamental data structure that help in efficiently organizing and managing hierarchical data. They are used in many applications like databases, file systems, search engines, and network routing, among others.

How Trees Help in DSA:

  • Efficient Searching: Binary search trees (BST) allow fast searching, insertion, and deletion of data in O(log n) time (on average), making it much faster than linear search on an array.
  • Hierarchical Data Representation: Trees help in representing hierarchical data, like file systems or organizational charts, where each node represents a unit, and the edges represent the relationship between them.
  • Balanced Trees: Variants like AVL trees or Red-Black trees maintain balance, ensuring efficient performance in searching and updates.
  • Priority Queue: Heaps (a type of binary tree) are used in priority queues, where elements are inserted or removed based on priority rather than order of arrival.
  • Tree Traversal: Traversal algorithms like in-order, pre-order, and post-order are widely used in many scenarios like expression evaluation, syntax parsing, etc.

Example: Binary Search Tree (BST) Implementation

Let’s see an example of a Binary Search Tree (BST) in Python, and how it helps in organizing and searching data efficiently.


class Node:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.value = key

class BinarySearchTree:
    def __init__(self):
        self.root = None
    
    def insert(self, root, key):
        if root is None:
            return Node(key)
        if key < root.value:
            root.left = self.insert(root.left, key)
        else:
            root.right = self.insert(root.right, key)
        return root
    
    def search(self, root, key):
        if root is None or root.value == key:
            return root
        if root.value < key:
            return self.search(root.right, key)
        return self.search(root.left, key)

    def inorder(self, root):
        return self.inorder(root.left) + [root.value] + self.inorder(root.right) if root else []

# Example Usage
bst = BinarySearchTree()
root = None

# Inserting nodes into the BST
keys = [20, 8, 22, 4, 12, 10, 14]
for key in keys:
    root = bst.insert(root, key)

# Searching for a key
search_key = 10
found_node = bst.search(root, search_key)

if found_node:
    print(f"Node {search_key} found in the tree.")
else:
    print(f"Node {search_key} not found in the tree.")

# In-order Traversal (Sorted order for BST)
print("In-order Traversal:", bst.inorder(root))

        

Explanation:

  • Node Class: Each node contains a value, a reference to the left child, and a reference to the right child.
  • insert() Method: This method inserts a new node into the tree by comparing the value to be inserted with the root. If the value is less than the current node, it goes to the left; if it's greater, it goes to the right.
  • search() Method: This method searches for a given key in the BST. It compares the key with the root and recursively searches in the left or right subtree depending on whether the key is smaller or larger than the current node's value.
  • inorder() Method: This method returns the in-order traversal of the tree (left-root-right). For a BST, this traversal gives the elements in sorted order.

Output:


Node 10 found in the tree.
In-order Traversal: [4, 8, 10, 12, 14, 20, 22]

        

Real-World Applications of Trees:

  • File System: The hierarchy of directories and files is often represented using trees, where each folder is a node and its subfolders or files are child nodes.
  • Search Engines: Binary search trees, B-trees, or tries are used to efficiently store and search a large number of documents or URLs.
  • Network Routing: Routing tables are often represented as trees or graphs, with nodes representing routers and edges representing paths between them.
  • Expression Parsing: In compilers, abstract syntax trees (ASTs) are used to represent the structure of expressions and statements in source code.

Variants of Trees:

  • Binary Search Tree (BST): Allows efficient search and insertion, but it can become unbalanced.
  • AVL Tree: A balanced BST, ensures O(log n) time for search, insertion, and deletion.
  • Red-Black Tree: Another self-balancing binary search tree.
  • Heap: A specialized tree-based structure that satisfies the heap property and is used in priority queues.

Keyword Search in Paragraphs Using Binary Search Tree (BST)

Certainly! This example adapts a Binary Search Tree (BST) to search for keywords within paragraphs and indicate their positions in a text file. Characters are not converted into numbers or tokens—instead, each paragraph and keyword is treated as a plain string.

Approach

  1. Create a Binary Search Tree (BST)
    • Each node stores a keyword
    • The BST allows fast keyword lookup
  2. Search for Keywords in Paragraphs
    • Each keyword is searched within each paragraph
    • All matching positions are recorded
  3. Indicate Keyword Positions
    • If a keyword is found, its index positions are displayed
  4. Write Results to a File
    • Results are saved in a text file for later review

Example Python Code


class Node:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.value = key

class BinarySearchTree:
    def __init__(self):
        self.root = None
    
    def insert(self, root, key):
        if root is None:
            return Node(key)
        if key < root.value:
            root.left = self.insert(root.left, key)
        else:
            root.right = self.insert(root.right, key)
        return root
    
    def search(self, root, key):
        if root is None or root.value == key:
            return root
        if root.value < key:
            return self.search(root.right, key)
        return self.search(root.left, key)

def search_keyword_in_paragraph(paragraph, keyword):
    positions = []
    start = 0
    while start < len(paragraph):
        start = paragraph.find(keyword, start)
        if start == -1:
            break
        positions.append(start)
        start += len(keyword)
    return positions

def write_to_file(file_name, results):
    with open(file_name, 'w') as file:
        for result in results:
            file.write(result + '\n')

def main():
    paragraphs = [
        "This is the first paragraph. It contains some words.",
        "The second paragraph has some different words.",
        "Here is the third one. Keywords are fun to search.",
        "The fourth paragraph also has the word search."
    ]
    
    keywords = ["search", "words", "fun"]
    
    bst = BinarySearchTree()
    root = None
    for keyword in keywords:
        root = bst.insert(root, keyword)
    
    results = []
    for i, paragraph in enumerate(paragraphs):
        result = f"Paragraph {i+1}: {paragraph}\n"
        for keyword in keywords:
            if bst.search(root, keyword):
                positions = search_keyword_in_paragraph(paragraph, keyword)
                if positions:
                    result += f"Keyword '{keyword}' found at positions: {positions}\n"
                else:
                    result += f"Keyword '{keyword}' not found in this paragraph.\n"
        results.append(result)
    
    write_to_file("keyword_search_results.txt", results)

if __name__ == "__main__":
    main()

            

Explanation of the Code

  • Node Class: Stores a keyword and references to left and right child nodes
  • BinarySearchTree Class: Handles insertion and searching of keywords
  • search_keyword_in_paragraph(): Finds all positions of a keyword in a paragraph
  • write_to_file(): Saves results to a text file
  • main(): Controls the program flow and ties everything together

Example Input


Paragraph 1: This is the first paragraph. It contains some words.
Paragraph 2: The second paragraph has some different words.
Paragraph 3: Here is the third one. Keywords are fun to search.
Paragraph 4: The fourth paragraph also has the word search.

            

Example Keywords

  • search
  • words
  • fun

Example Output (keyword_search_results.txt)


Paragraph 1: This is the first paragraph. It contains some words.
Keyword 'search' not found in this paragraph.
Keyword 'words' found at positions: [41]
Keyword 'fun' not found in this paragraph.

Paragraph 2: The second paragraph has some different words.
Keyword 'search' not found in this paragraph.
Keyword 'words' found at positions: [47]
Keyword 'fun' not found in this paragraph.

Paragraph 3: Here is the third one. Keywords are fun to search.
Keyword 'search' found at positions: [56]
Keyword 'words' not found in this paragraph.
Keyword 'fun' found at positions: [35]

Paragraph 4: The fourth paragraph also has the word search.
Keyword 'search' found at positions: [50]
Keyword 'words' not found in this paragraph.
Keyword 'fun' not found in this paragraph.

            

Summary

  • BST enables fast keyword lookup
  • String-based searching preserves original text structure
  • Keyword positions are clearly tracked and stored
  • Useful for text analysis, document scanning, and indexing

 

Further Reading

  1.  

People are good at skipping over material they already know!

View Related Topics to







Contact Us

Name

Email *

Message *

Popular Posts

Rayleigh vs Rician Fading (with MATLAB + Simulator)

  In Rayleigh fading , the channel coefficients tend to have a Rayleigh distribution, which is characterized by a random phase and magnitude with an exponential distribution. This means the magnitude of the channel coefficient follows an exponential distribution with a mean of 1. In Rician fading , there is a dominant line-of-sight component in addition to the scattered components. The channel coefficients in Rician fading can indeed tend towards 1, especially when the line-of-sight component is strong. When the line-of-sight component dominates, the Rician fading channel behaves more deterministically, and the channel coefficients may tend towards the value of the line-of-sight component, which could be close to 1.   MATLAB Script clc; clear all; close all; % Define parameters numSamples = 1000; % Number of samples K_factor = 5; % K-factor for Rician fading SNR_dB = 20; % Signal-to-noise ratio (in dB) % Generate complex Gaussian random variable for Rayleigh fading channel h_r...

BER vs SNR for M-ary QAM, M-ary PSK, QPSK, BPSK, ...(MATLAB Code + Simulator)

Bit Error Rate (BER) & SNR Guide Analyze communication system performance with our interactive simulators and MATLAB tools. 📘 Theory 🧮 Simulators 💻 MATLAB Code 📚 Resources BER Definition SNR Formula BER Calculator MATLAB Comparison 📂 Explore M-ary QAM, PSK, and QPSK Topics ▼ 🧮 Constellation Simulator: M-ary QAM 🧮 Constellation Simulator: M-ary PSK 🧮 BER calculation for ASK, FSK, and PSK 🧮 Approaches to BER vs SNR What is Bit Error Rate (BER)? The BER indicates how many corrupted bits are received compared to the total number of bits sent. It is the primary figure of merit for a...

UGC-NET Electronic Science Question Paper With Answer Key and Full Explanation [Dec 2023]

    UGC-NET Electronic Science Question Paper With Answer Key Download Pdf [Dec 2023] Download Question Paper               See Answers   2025 | 2024 | 2023 | 2022 | 2021 | 2020 UGC-NET Electronic Science  2023 Answers with Explanations 51. (A): The stacking fault is the most common area defect found in silicon. These faults typically occur along the 111 plane. In the crystalline structure of silicon, atoms are arranged in a specific pattern known as a diamond lattice. A stacking fault refers to a disruption in the normal order of atomic layers within this lattice, which usually occurs in the 111 plane due to the geometric arrangement of the atoms. This type of defect can affect the electrical and mechanical properties of the material, such as the mobility of charge carriers and mechanical strength. 52. (C): The important figure of merit for the microwave application of a Schot...

Theoretical vs. simulated BER vs. SNR for ASK, FSK, and PSK (MATLAB Code + Simulator)

📘 Overview 🧮 Simulator 💻 Theoretical Code 📊 Simulated Code 📚 Resources Overview BER vs. SNR denotes how many bits in error are received for a given signal-to-noise ratio, typically measured in dB. Common noise types in wireless systems: 🚀 1. Additive White Gaussian Noise (AWGN) 🌊 2. Rayleigh Fading AWGN adds random noise; Rayleigh fading attenuates the signal variably. A good SNR helps reduce these effects. Bit Error Rate (BER) Equations BER formulas for ASK, FSK, and PSK modulation schemes. ASK BER = 0.5 × erfc(0.5 × √SNR) FSK BER = 0.5 × erfc(√(SNR / 2)) PSK BER = 0.5 × erfc(√SNR) erfc / Q-function (Click here) Live BER S...

How to use MATLAB Simulink

Introduction to MATLAB Simulink MATLAB Simulink is a popular add-on of MATLAB. Here, you can use different blocks like modulator, demodulator, AWGN channel, etc. And you can do experiments on your own. Steps to Get Started 1. Go to the 'Simulink' tab at the top navbar of MATLAB. If not found, click on the add-on tab, search 'Simulink,' and then click on it to add. 2. Once you installed the simulation, click the 'new' tap at the top left corner. 3. Then, search the required blocks in the 'Simulink library.' Then, drag it to the editor space. 4. You can double-click on the blocks to see the input parameters. 5. Then, connect the blocks by dragging a line from one block's output terminal to another block's input. 6. If the connection is complete, click the 'run' tab in the middle of the top navbar. 7. After clicking on the run ...

UGC NET Electronic Science Previous Year Question Papers

Home / Engineering & Other Exams / UGC NET 2022 PYQ 📥 Download UGC NET Electronics PDFs Complete collection of previous year question papers, answer keys and explanations for Subject Code 88. Start Downloading UGC-NET (Electronics Science, Subject code: 88) Subject_Code : 88; Department : Electronic Science; 📂 View All Question Papers Q. UGC Net Electronic Science Question Paper [June 2025] A. UGC Net Electronic Science Question Paper With Answer Key Download Pdf [June 2025] with full explanation Q. UGC Net Electronic Science Question Paper [December 2024] A. UGC Net Electronic Science Question Paper With Answer Key Download Pdf [December 2024] Q. UGC Net Electronic Science Question Paper [Aug 2024] A. UGC Net Electronic Scien...

OFDM Waveform with MATLAB Code (with Simulator)

  In OFDM (Orthogonal Frequency Division Multiplexing) , we transmit multiple orthogonal subcarriers simultaneously. Since the subcarriers are orthogonal , they do not interfere with each other, which is one of the main advantages of OFDM. Practically, OFDM converts a wideband signal into multiple narrowband orthogonal subcarriers. For typical wireless communication, if the signal bandwidth (or symbol duration) exceeds the coherence bandwidth of the channel, the signal experiences frequency-selective fading . Fading distorts the signal, making it difficult to recover the original information. By using OFDM, we transmit the same wideband signal across multiple orthogonal narrowband subcarriers, reducing the effect of fading. For example, if we want to transmit a signal of bandwidth 1024 kHz , we can divide it into N = 8 subcarriers . Each subcarrier is then spaced by: Δf = Total Bandwidth N = 1024 8 kHz...

Constellation Diagrams of ASK, PSK, and FSK (with MATLAB Code + Simulator)

Constellation Diagrams: ASK, FSK, and PSK Comprehensive guide to signal space representation, including interactive simulators and MATLAB implementations. 📘 Overview 🧮 Simulator ⚖️ Theory 📚 Resources Definitions Constellation Tool Key Points MATLAB Code 📂 Other Topics: M-ary PSK & QAM Diagrams ▼ 🧮 Simulator for M-ary PSK Constellation 🧮 Simulator for M-ary QAM Constellation BASK (Binary ASK) Modulation Transmits one of two signals: 0 or -√Eb, where Eb​ is the energy per bit. These signals represent binary 0 and 1. BFSK (Binary FSK) Modulation Transmits on...