Load Balancing

What is Load Balancing?

Load balancing distributes network traffic across multiple servers so that no single server becomes overwhelmed. Spreading incoming requests this way improves application availability, reliability, and scalability.

Why Load Balancing?

Problems It Solves:

Single Point of Failure: Eliminates dependency on one server
Performance Bottlenecks: Prevents server overload
Scalability Issues: Enables horizontal scaling
Maintenance Downtime: Allows rolling updates without service interruption
Geographic Latency: Routes users to nearest servers

Benefits:

High availability (HA)
Improved performance
Scalability and flexibility
Redundancy and fault tolerance
Efficient resource utilization

Types of Load Balancers

1. Hardware Load Balancers

Physical devices (F5 BIG-IP, Citrix ADC)
High performance but expensive
Dedicated processing power
Often used in enterprise data centers

2. Software Load Balancers

Open Source: HAProxy, NGINX, Apache (mod_proxy_balancer)
Commercial: NGINX Plus, Avi Networks
More flexible and cost-effective
Easy to deploy and update

3. Cloud Load Balancers

AWS: ELB (Classic, Application, Network, Gateway)
Google Cloud: Cloud Load Balancing
Azure: Azure Load Balancer, Application Gateway
Managed services with auto-scaling
Pay-per-use pricing model

Load Balancing Algorithms

1. Round Robin

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle repeats)

Simple and predictable
Works well with servers of equal capacity
Doesn't consider server load
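
A minimal Python sketch of the rotation above, assuming a fixed list of three servers. The function name `round_robin` is illustrative and not taken from any particular load balancer.

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that yields servers in a fixed, repeating order."""
    return cycle(servers)


picker = round_robin(["Server A", "Server B", "Server C"])
first_four = [next(picker) for _ in range(4)]
# The fourth pick wraps around to "Server A", matching the cycle above.
```

Because the picker keeps no record of how busy each server is, a slow server still receives every third request.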

2. Weighted Round Robin

Server A (weight: 3) gets 3 requests
Server B (weight: 2) gets 2 requests
Server C (weight: 1) gets 1 request

Accounts for different server capacities
Manual weight configuration required
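
One possible way to honor the weights above is a "smooth" weighted round robin, which interleaves picks instead of sending three requests in a row to Server A. The sketch below is an assumption about how this could be done, not the method of any specific product; the function name is hypothetical.

```python
from collections import Counter


def smooth_weighted_round_robin(weights, n):
    """Pick n servers so that each appears in proportion to its weight."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every server earns credit equal to its weight each round.
        for server, weight in weights.items():
            current[server] += weight
        # The server with the most credit is chosen and pays back the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


picks = smooth_weighted_round_robin({"A": 3, "B": 2, "C": 1}, 6)
counts = Counter(picks)
```

Over any full cycle of six requests, the counts match the 3:2:1 weights.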

3. Least Connections

Routes to server with fewest active connections
Better for long-lived connections
Good for varied request processing times

4. Weighted Least Connections

Combines least connections with server weights
Considers both capacity and current load
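
A short sketch of the selection rule, assuming the balancer tracks active connections per server and that "weighted" means dividing the connection count by the weight. That ratio is one common interpretation, stated here as an assumption. With all weights equal to 1 it reduces to plain least connections.

```python
def pick_weighted_least_connections(active, weights):
    """Return the server with the lowest connections-to-weight ratio."""
    return min(active, key=lambda server: active[server] / weights[server])


active = {"A": 12, "B": 9, "C": 5}

# Weighted: A has the most connections but also the most capacity (12/4 = 3).
weighted_choice = pick_weighted_least_connections(active, {"A": 4, "B": 2, "C": 1})

# Unweighted: every weight is 1, so the fewest connections wins.
plain_choice = pick_weighted_least_connections(active, {"A": 1, "B": 1, "C": 1})
```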

5. IP Hash (Source IP Affinity)

server_index = hash(client_ip) % num_servers

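Same client IP maps to the same server as long as the number of servers does not change
Useful for session persistence
Can cause uneven distribution, for example when many clients share one NAT address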
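
The formula above can be sketched in Python as follows. The sketch uses `hashlib` rather than the built-in `hash()`, because Python randomizes string hashes per process and the mapping would not survive a restart. The function name and sample addresses are illustrative.

```python
import hashlib


def pick_by_ip(client_ip, servers):
    """Map a client IP to a server with a stable hash and a modulo."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["10.0.1.1", "10.0.1.2", "10.0.1.3"]
first = pick_by_ip("203.0.113.7", servers)
second = pick_by_ip("203.0.113.7", servers)
# Both calls return the same server while the pool is unchanged.
```

Adding or removing a server changes `len(servers)`, so most clients are remapped; consistent hashing is the usual remedy when that matters.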

6. Least Response Time

Routes to fastest-responding server
Considers both active connections and response time
Aims to minimize the latency clients see
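
A hedged sketch of how the two signals might be combined. The scoring formula, `(active connections + 1) * average response time`, is an assumption chosen for illustration; real products use their own formulas.

```python
def pick_least_response_time(stats):
    """stats maps server -> (active_connections, avg_response_ms)."""
    def score(server):
        active, avg_ms = stats[server]
        # +1 so an idle server is still ranked by its response time.
        return (active + 1) * avg_ms

    return min(stats, key=score)


stats = {"A": (10, 20.0), "B": (2, 50.0), "C": (5, 30.0)}
choice = pick_least_response_time(stats)
# Scores: A = 220.0, B = 150.0, C = 180.0, so B is chosen.
```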

7. Random

Randomly selects a server
Simple but unpredictable
Can work well with large server pools

8. Resource-Based

Monitors CPU, memory, bandwidth
Routes based on available resources
Requires health monitoring agents

OSI Layer Classification

Layer 4 (Transport Layer) Load Balancing

Works with: IP addresses and ports
Protocols: TCP/UDP
Speed: Faster, less CPU intensive
Features: Basic load distribution
Cannot inspect: Application data

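Example configuration (NGINX stream module, which proxies raw TCP/UDP):

stream {
    upstream backend {
        server 10.0.1.1:8080;
        server 10.0.1.2:8080;
    }
    # Listen port is illustrative
    server { listen 8080; proxy_pass backend; }
}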

Layer 7 (Application Layer) Load Balancing

Works with: HTTP headers, URLs, cookies
Protocols: HTTP/HTTPS, WebSocket
Speed: Slower, more CPU intensive
Features: Content-based routing, SSL termination
Can inspect: Application data

Example configuration:

location /api {
    proxy_pass http://api-servers;
}
location /images {
    proxy_pass http://static-servers;
}
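
The routing decision in the configuration above can be sketched in Python as a longest-prefix match on the request path. The `default-servers` pool is a hypothetical fallback that does not appear in the configuration.

```python
ROUTES = {"/api": "api-servers", "/images": "static-servers"}


def route(path, routes=ROUTES, default="default-servers"):
    """Return the backend pool whose prefix is the longest match for path."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return default
    return routes[max(matches, key=len)]


api_pool = route("/api/users/42")
image_pool = route("/images/logo.png")
other_pool = route("/index.html")
```

A Layer 4 balancer cannot make this decision because it never parses the HTTP request line.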

Session Persistence (Sticky Sessions)

Methods:

Cookie-Based: LB inserts session cookie
IP-Based: Client IP determines server
Application Cookie: Uses existing app cookie
SSL Session ID: Uses SSL session for persistence

Pros and Cons:

✅ Maintains user session state
✅ Simplifies application design
❌ Uneven load distribution
❌ Complications with server failures
❌ Limits horizontal scaling benefits
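
A rough sketch of cookie-based persistence, including the server-failure complication listed above. The cookie name `LB_SERVER` and the function signature are hypothetical.

```python
def route_with_cookie(cookies, healthy_servers, pick_new):
    """Return (server, cookies), reusing the pinned server when it is healthy."""
    pinned = cookies.get("LB_SERVER")
    if pinned in healthy_servers:
        return pinned, cookies
    # No cookie yet, or the pinned server is gone: choose again and re-pin.
    chosen = pick_new(healthy_servers)
    return chosen, {**cookies, "LB_SERVER": chosen}


def pick_first(servers):
    return servers[0]


server1, jar = route_with_cookie({}, ["A", "B"], pick_first)
server2, jar = route_with_cookie(jar, ["A", "B"], pick_first)
# Server A fails; the client is moved to B and loses any state held only on A.
server3, jar = route_with_cookie(jar, ["B"], pick_first)
```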

Health Checks

Active Health Checks

health_check:
  interval: 10s
  timeout: 3s
  unhealthy_threshold: 3
  healthy_threshold: 2
  path: /health
  expected_status: 200
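
The two thresholds in the settings above can be read as a small state machine: a server is marked down after 3 consecutive failed probes and back up after 2 consecutive successes. The sketch below assumes that reading; the class name is illustrative and the probe itself (an HTTP GET to /health) is left out.

```python
class HealthTracker:
    """Track one server's health from a stream of probe results."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self.failures = 0
        self.successes = 0

    def record(self, ok):
        """Record one probe result and return the current health state."""
        if ok:
            self.successes += 1
            self.failures = 0
            if not self.healthy and self.successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0
            if self.healthy and self.failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy


tracker = HealthTracker()
states = [tracker.record(ok) for ok in [False, False, False, True, True]]
```

Requiring several consecutive results in each direction keeps a single dropped probe from flapping the server in and out of rotation.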

Passive Health Checks

Monitors real traffic responses
Marks servers down based on error rates
No additional health check traffic
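
A minimal sketch of a passive check, assuming the balancer keeps a sliding window of recent response codes per server and treats 5xx responses as errors. The window size and the 50% threshold are illustrative values.

```python
from collections import deque


class PassiveMonitor:
    """Judge a server by the error rate of its most recent real responses."""

    def __init__(self, window=20, max_error_rate=0.5):
        self.results = deque(maxlen=window)  # True means the response was an error
        self.max_error_rate = max_error_rate

    def observe(self, status_code):
        self.results.append(status_code >= 500)

    def is_healthy(self):
        if not self.results:
            return True
        return sum(self.results) / len(self.results) <= self.max_error_rate


monitor = PassiveMonitor(window=4, max_error_rate=0.5)
for code in (200, 500, 502):
    monitor.observe(code)
after_errors = monitor.is_healthy()

for code in (200, 200):
    monitor.observe(code)
after_recovery = monitor.is_healthy()
```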

Health Check Types:

TCP Check: Port connectivity
HTTP/HTTPS Check: Response code validation
Custom Script: Application-specific checks
Database Check: Backend connectivity

Load Balancer Deployment Patterns

1. Single Load Balancer

Internet → LB → [Server1, Server2, Server3]

Simple but single point of failure

2. Active-Passive HA

Internet → Active LB (Primary)
        ↘ Passive LB (Standby) → [Servers]

Failover capability
Resource underutilization

3. Active-Active HA

Internet → DNS Round Robin
        ↙         ↘
    LB1            LB2
        ↘         ↙
    [Server Pool]

No single point of failure
Better resource utilization

4. Global Server Load Balancing (GSLB)

User → DNS → Nearest Data Center
           ↙     ↓     ↘
         US-East  EU  Asia-Pacific

Geographic distribution
Disaster recovery
Reduced latency
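
A small sketch of the selection step, assuming the GSLB layer has a per-user latency estimate for each data center and a set of data centers currently passing health checks. How those inputs are obtained (GeoIP lookup, probes) is outside the sketch.

```python
def pick_data_center(latency_ms, healthy):
    """Return the healthy data center with the lowest estimated latency."""
    candidates = {dc: ms for dc, ms in latency_ms.items() if dc in healthy}
    if not candidates:
        raise RuntimeError("no healthy data center available")
    return min(candidates, key=candidates.get)


latency_ms = {"US-East": 120.0, "EU": 35.0, "Asia-Pacific": 210.0}
normal = pick_data_center(latency_ms, {"US-East", "EU", "Asia-Pacific"})
# If EU goes down, traffic fails over to the next-closest site.
failover = pick_data_center(latency_ms, {"US-East", "Asia-Pacific"})
```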

Advanced Features

SSL/TLS Termination

Decrypt HTTPS at load balancer
Reduces backend server CPU load
Centralized certificate management
Internal traffic can use plain HTTP when the network between load balancer and backends is trusted

Connection Multiplexing

Reuses backend connections
Reduces connection overhead
Improves performance
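
The saving can be illustrated with a toy pool of backend connections. This is a sketch under simplifying assumptions (single-threaded, no timeouts, no broken-connection handling); `BackendPool` and `connect` are hypothetical names.

```python
class BackendPool:
    """Hand out idle backend connections before opening new ones."""

    def __init__(self, connect):
        self._connect = connect  # callable that opens a new backend connection
        self._idle = []
        self.created = 0

    def acquire(self):
        if self._idle:
            return self._idle.pop()
        self.created += 1
        return self._connect()

    def release(self, conn):
        self._idle.append(conn)


pool = BackendPool(connect=object)  # object() stands in for a real socket
for _ in range(100):
    conn = pool.acquire()
    pool.release(conn)
# 100 sequential client requests were served over one backend connection.
```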

Request Routing

# Content-based routing
location ~ \.(jpg|png|gif)$ {
    proxy_pass http://static-servers;
}

# Header-based routing (in NGINX, an "if" that contains proxy_pass must sit inside a location block)
location / {
    if ($http_user_agent ~* mobile) {
        proxy_pass http://mobile-servers;
    }
}

Rate Limiting

limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
location /api {
    limit_req zone=api burst=20;
}
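
To make the rate and burst parameters concrete, here is a token-bucket sketch in Python. NGINX documents limit_req as a leaky bucket, so this is a related technique shown for illustration, not NGINX's implementation. Time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow `rate` requests per second with bursts of up to `burst` requests."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=10, burst=20)
allowed_at_once = sum(bucket.allow(now=0.0) for _ in range(25))
allowed_later = bucket.allow(now=0.5)  # half a second refills 5 tokens
```

In a real limiter there would be one bucket per key, which is the role `$binary_remote_addr` plays in the configuration above.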

Caching

Static content caching
API response caching
Reduces backend load

Best Practices

High Availability

- Deploy multiple load balancers
- Use different availability zones
- Implement health checks

Security

- Enable SSL/TLS termination
- Implement DDoS protection
- Use Web Application Firewall (WAF)
- Restrict backend access

Performance

- Enable connection pooling
- Configure appropriate timeouts
- Use caching where possible
- Monitor and tune algorithms

Monitoring

- Track response times
- Monitor error rates
- Alert on unhealthy instances
- Log analysis

Scaling

- Auto-scale based on metrics
- Plan for traffic spikes
- Test failover scenarios
- Document capacity limits

Common Issues and Solutions

1. Uneven Load Distribution

Cause: Sticky sessions, poor algorithm choice
Solution: Review algorithm, implement connection limits

2. Cascading Failures

Cause: Aggressive health checks, retry storms
Solution: Circuit breakers, exponential backoff
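
Both remedies can be sketched briefly. The thresholds, delays, and names below are illustrative assumptions; production code would usually add random jitter to the backoff so that clients do not retry in lockstep.

```python
def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=6):
    """Delays in seconds before each retry, doubling up to a cap."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


class CircuitBreaker:
    """Stop calling a backend after repeated failures, then probe again later."""

    def __init__(self, failure_threshold=3, reset_after=30.0):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self, now):
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through (half-open).
        return now - self.opened_at >= self.reset_after

    def record(self, ok, now):
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now


delays = backoff_delays()
breaker = CircuitBreaker()
for t in (0.0, 1.0, 2.0):
    breaker.record(ok=False, now=t)
blocked = breaker.allow(now=10.0)
trial = breaker.allow(now=32.0)
```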

3. SSL Certificate Issues

Cause: Expired certs, misconfiguration
Solution: Automated renewal, monitoring

4. Connection Exhaustion

Cause: Keep-alive issues, connection leaks
Solution: Connection pooling, timeout tuning

Load Balancing vs Related Concepts

Load Balancer vs Reverse Proxy

Load balancers distribute load across servers
Reverse proxies can do load balancing plus caching, compression, SSL
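Most Layer 7 load balancers are reverse proxies, but the two are not the same: a reverse proxy in front of a single server balances nothing, and some Layer 4 load balancers forward packets without proxying connections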

Load Balancer vs API Gateway

API Gateway: API management, auth, rate limiting, transformation
Load Balancer: Traffic distribution, health checks
API Gateways often include load balancing

Interview Questions

Q: How do you handle session state with load balancing?
A: Sticky sessions, external session storage (Redis), or stateless design

Q: What's the difference between Layer 4 and Layer 7 load balancing?
A: L4 works with IP/port, faster; L7 inspects application data, more features

Q: How do you achieve zero-downtime deployments?
A: Rolling updates, blue-green deployments, or canary releases with health checks

Q: How do you prevent a load balancer from becoming a bottleneck?
A: Multiple load balancers, horizontal scaling, DNS round-robin, anycast IPs

Q: What happens when a server fails health checks?
A: Marked unhealthy, removed from rotation, traffic redistributed, alerts triggered
