How a Load Balancer Decides Which Server Handles Traffic
A load balancer decides which server should handle a request based on a traffic distribution algorithm. Its goal is to distribute traffic efficiently, prevent overload, and improve availability.
1. Basic Process
- A client sends a request to a website/app.
- The request reaches the load balancer first, not the backend servers directly.
- The load balancer selects one backend server using a predefined method.
- The request is forwarded to that server.
- The server's response typically returns to the client through the load balancer.
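The steps above can be sketched in a few lines of Python. This is a minimal in-process illustration, not a real network proxy: `make_backend`, `load_balance`, and the `select` callback are hypothetical names, and the backends are plain functions standing in for servers.

```python
from typing import Callable, Dict, List

Backend = Callable[[str], str]

def make_backend(name: str) -> Backend:
    """Build a stand-in backend that reports which server handled the request."""
    def handle(request: str) -> str:
        return f"{name} handled {request}"
    return handle

def load_balance(
    request: str,
    backends: Dict[str, Backend],
    select: Callable[[List[str]], str],
) -> str:
    name = select(list(backends))        # pick one backend with a predefined method
    response = backends[name](request)   # forward the request to that backend
    return response                      # relay the response to the client

backends = {n: make_backend(n) for n in ("Server A", "Server B", "Server C")}

# Assumed selection method for the demo: always pick the first backend.
response = load_balance("GET /", backends, select=lambda names: names[0])
print(response)  # Server A handled GET /
```

The `select` callback is where a distribution algorithm would plug in.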
2. Common Load Balancing Algorithms
Round Robin
Requests are distributed sequentially.
- Request 1 → Server A
- Request 2 → Server B
- Request 3 → Server C
- Request 4 → Server A (repeat)
Pros: Simple
Cons: Ignores each server's current load and capacity
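A minimal sketch of the rotation, assuming a fixed list of three servers; `next_server` is a hypothetical helper name.

```python
from itertools import cycle

servers = ["Server A", "Server B", "Server C"]
rotation = cycle(servers)  # endlessly repeats A, B, C, A, ...

def next_server() -> str:
    """Return the next server in the rotation."""
    return next(rotation)

picks = [next_server() for _ in range(4)]
print(picks)  # ['Server A', 'Server B', 'Server C', 'Server A']
```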
Weighted Round Robin
Each server is assigned a weight based on its capacity. Higher-capacity servers receive proportionally more traffic.
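One simple way to illustrate weighting is to repeat each server in the rotation as many times as its weight. The weights below are assumed values for the example, and this naive expansion sends consecutive requests to the same server; production load balancers often interleave the picks more evenly.

```python
from itertools import cycle

# Assumed weights: Server A has three times the capacity of Server B.
weights = {"Server A": 3, "Server B": 1}

# Expand each server into the schedule once per unit of weight.
schedule = [name for name, weight in weights.items() for _ in range(weight)]
rotation = cycle(schedule)

picks = [next(rotation) for _ in range(8)]
print(picks.count("Server A"), picks.count("Server B"))  # 6 2
```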
Least Connections
Sends each request to the server with the fewest active connections. Well suited to long-lived connections such as WebSockets.
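A small sketch of the bookkeeping involved, assuming the balancer tracks a connection count per server. The `acquire` and `release` names are hypothetical; ties are broken by dictionary order.

```python
# Active connection count per server, as tracked by the balancer.
active = {"Server A": 0, "Server B": 0, "Server C": 0}

def acquire() -> str:
    """Pick the server with the fewest active connections and count the new one."""
    server = min(active, key=active.get)
    active[server] += 1
    return server

def release(server: str) -> None:
    """Record that a connection to the given server has closed."""
    active[server] -= 1

first = acquire()    # all idle, so the first server wins the tie
second = acquire()   # Server A is now busier, so Server B is chosen
release(first)       # Server A's connection closes
third = acquire()    # Server A is idle again and is chosen
print(first, second, third)  # Server A Server B Server A
```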
Weighted Least Connections
Combines server weight with active connection count, typically choosing the server with the lowest ratio of active connections to weight.
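As a rough sketch, one common way to combine the two signals is to compare active connections divided by weight. The weights and connection counts below are assumed values, and `pick` is a hypothetical helper.

```python
# Assumed capacity weights and current connection counts.
weights = {"Server A": 4, "Server B": 1}
active = {"Server A": 2, "Server B": 1}

def pick() -> str:
    """Choose the server with the lowest connections-to-weight ratio."""
    return min(active, key=lambda s: active[s] / weights[s])

choice_light = pick()   # A: 2/4 = 0.5, B: 1/1 = 1.0, so Server A
active["Server A"] = 5
choice_heavy = pick()   # A: 5/4 = 1.25, B: 1/1 = 1.0, so Server B
print(choice_light, choice_heavy)  # Server A Server B
```

Server A is chosen first even though it has more connections, because relative to its capacity it is less loaded.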
IP Hash
The client IP is hashed to select a server, so the same client IP goes to the same server as long as the server pool does not change. Useful for session persistence (sticky sessions).
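A minimal sketch of the idea, assuming a SHA-256 hash reduced modulo the pool size. The choice of hash, the `pick_server` name, and the example IP address are illustrative assumptions. Python's built-in `hash()` is avoided because it is randomized per process for strings.

```python
import hashlib
from typing import List

servers = ["Server A", "Server B", "Server C"]

def pick_server(client_ip: str, pool: List[str]) -> str:
    """Map a client IP to a server deterministically."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]

first = pick_server("203.0.113.7", servers)
again = pick_server("203.0.113.7", servers)
print(first == again)  # True
```

With plain modulo hashing, adding or removing a server changes `len(pool)` and can remap many clients, which is why the mapping holds only while the pool is stable.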
Least Response Time
Chooses the server that is responding fastest, taking both latency and active connections into account.
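The exact formula varies between products. The sketch below assumes a simple score of average latency multiplied by active connections plus one, with made-up measurements; lower is better.

```python
# Assumed measurements: recent average latency and current connection count.
stats = {
    "Server A": {"latency_ms": 120.0, "active": 1},
    "Server B": {"latency_ms": 40.0, "active": 3},
    "Server C": {"latency_ms": 50.0, "active": 1},
}

def score(name: str) -> float:
    """Hypothetical score: latency scaled by the load the next request would add."""
    s = stats[name]
    return s["latency_ms"] * (s["active"] + 1)

choice = min(stats, key=score)
print(choice)  # Server C (scores: A=240, B=160, C=100)
```

Server B has the lowest raw latency, but its three active connections push its score above Server C's.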
3. Health Checks (Very Important)
Load balancers continuously check:
- Is the server alive?
- Is it responding correctly?
- Is it overloaded?
If a server fails its health checks, it is removed from traffic rotation until it passes them again.
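The removal and re-admission logic can be sketched as follows. The `HealthTracker` class, the failure threshold of two, and the injected `probe` function are all assumptions for illustration; a real probe would make a TCP or HTTP request rather than read a dictionary.

```python
from typing import Callable, Dict, List

FAILURE_THRESHOLD = 2  # assumed: consecutive failures before removal

class HealthTracker:
    def __init__(self, servers: List[str], probe: Callable[[str], bool]) -> None:
        self.probe = probe
        self.failures: Dict[str, int] = {s: 0 for s in servers}

    def run_checks(self) -> None:
        """Probe every server once; a success resets its failure count."""
        for server in self.failures:
            if self.probe(server):
                self.failures[server] = 0
            else:
                self.failures[server] += 1

    def in_rotation(self) -> List[str]:
        """Servers still eligible to receive traffic."""
        return [s for s, n in self.failures.items() if n < FAILURE_THRESHOLD]

# Simulated server state standing in for real probes.
status = {"Server A": True, "Server B": False}
tracker = HealthTracker(list(status), probe=lambda s: status[s])

tracker.run_checks()
tracker.run_checks()
after_failure = tracker.in_rotation()   # Server B has failed twice

status["Server B"] = True
tracker.run_checks()
after_recovery = tracker.in_rotation()  # Server B passes and returns
print(after_failure, after_recovery)
```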
4. Types of Load Balancers
Layer 4 (Transport Layer)
- Works at TCP/UDP level
- Faster, because it forwards traffic without inspecting request content
Layer 7 (Application Layer)
- Works at HTTP/HTTPS level
- Can route based on URL path (/api → Server A)
- Can use headers, cookies, and hostname
- More flexible, but adds overhead because each request must be parsed
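A sketch of content-based routing at Layer 7. Only the /api → Server A rule comes from the example above; the `static.example.com` hostname, the other server assignments, and the `route` function are hypothetical.

```python
def route(host: str, path: str) -> str:
    """Pick a backend from the request's hostname and URL path."""
    if path == "/api" or path.startswith("/api/"):
        return "Server A"   # API traffic, matched on URL path
    if host == "static.example.com":
        return "Server B"   # assumed rule: static assets, matched on hostname
    return "Server C"       # assumed default backend

print(route("www.example.com", "/api/users"))    # Server A
print(route("static.example.com", "/logo.png"))  # Server B
print(route("www.example.com", "/about"))        # Server C
```

A Layer 4 balancer could not apply these rules, since it sees only addresses and ports, not the HTTP request.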
5. Real-World Examples
- AWS Elastic Load Balancing (managed AWS service)
- NGINX (Web server and reverse proxy)
- HAProxy (Load balancer software)
Simple Analogy
Think of a load balancer like a restaurant host:
- Customers arrive.
- The host checks which waiter (server) has free tables.
- The host seats customers so that no waiter is overwhelmed.
- If a waiter (server) is unavailable, the host skips them.