Installation

After installing, this skill will be available to your AI coding assistant. Verify the installation:

```shell
npx agent-skills-cli list
```

Skill Instructions
name: load-balancing-patterns
description: When distributing traffic across multiple servers or regions, use this skill to select and configure the appropriate load balancing solution (L4/L7, cloud-managed, self-managed, or Kubernetes ingress) with proper health checks and session management.
Load Balancing Patterns
Distribute traffic across infrastructure using the appropriate load balancing approach, from simple round-robin to global multi-region failover.
When to Use This Skill
Use load-balancing-patterns when:
- Distributing traffic across multiple application servers
- Implementing high availability and failover
- Routing traffic based on URLs, headers, or geographic location
- Managing session persistence across stateless backends
- Deploying applications to Kubernetes clusters
- Configuring global traffic management across regions
- Implementing zero-downtime deployments (blue-green, canary)
- Selecting between cloud-managed and self-managed load balancers
Core Load Balancing Concepts
Layer 4 vs Layer 7
Layer 4 (L4) - Transport Layer:
- Routes based on IP address and port (TCP/UDP packets)
- No application data inspection, lower latency, higher throughput
- Protocol agnostic, preserves client IP addresses
- Use for: Database connections, video streaming, gaming, financial transactions, non-HTTP protocols
Layer 7 (L7) - Application Layer:
- Routes based on HTTP URLs, headers, cookies, request body
- Full application data visibility, SSL/TLS termination, caching, WAF integration
- Content-based routing capabilities
- Use for: Web applications, REST APIs, microservices, GraphQL endpoints, complex routing logic
For detailed comparison including performance benchmarks and hybrid approaches, see references/l4-vs-l7-comparison.md.
Load Balancing Algorithms
| Algorithm | Distribution Method | Use Case |
|---|---|---|
| Round Robin | Sequential | Stateless, similar servers |
| Weighted Round Robin | Capacity-based | Different server specs |
| Least Connections | Fewest active connections | Long-lived connections |
| Least Response Time | Fastest server | Performance-sensitive |
| IP Hash | Client IP-based | Session persistence |
| Resource-Based | CPU/memory metrics | Varying workloads |
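The first two rows of the table can be sketched in a few lines of Python (backend names are illustrative):

```python
import itertools

servers = ["app1", "app2", "app3"]

# Round robin: hand out backends in a fixed, repeating cycle.
rr = itertools.cycle(servers)
picks = [next(rr) for _ in range(4)]  # ['app1', 'app2', 'app3', 'app1']

# Least connections: route to the backend with the fewest active connections.
active = {"app1": 12, "app2": 3, "app3": 7}
target = min(active, key=active.get)  # 'app2'
```

Weighted round robin is the same cycle idea with each server repeated in proportion to its weight.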
Health Check Types
Shallow (Liveness): Is the process alive?
- Endpoint: `/health/live` or `/live`
- Returns: 200 if process running
- Use for: Process monitoring, container health

Deep (Readiness): Can the service handle requests?
- Endpoint: `/health/ready` or `/ready`
- Validates: Database, cache, external API connectivity
- Use for: Load balancer routing decisions
Health Check Hysteresis: Different thresholds for marking up vs down to prevent flapping
- Example: 3 failures to mark down, 2 successes to mark up
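That example maps to a small state machine. The class below is an illustrative sketch with those thresholds, not any particular load balancer's implementation:

```python
class HealthState:
    """Hysteresis: 3 consecutive failures mark a backend down,
    2 consecutive successes mark it back up."""
    def __init__(self, fail_threshold=3, rise_threshold=2):
        self.fail_threshold = fail_threshold
        self.rise_threshold = rise_threshold
        self.healthy = True
        self.streak = 0  # consecutive results disagreeing with current state

    def record(self, success):
        if success == self.healthy:
            self.streak = 0  # result agrees with current state; reset
        else:
            self.streak += 1
            limit = self.fail_threshold if self.healthy else self.rise_threshold
            if self.streak >= limit:
                self.healthy = not self.healthy
                self.streak = 0
        return self.healthy
```

A single failed probe no longer flips the state, which is exactly what prevents flapping.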
For complete health check implementation patterns, see references/health-check-strategies.md.
Cloud Load Balancers
AWS Load Balancing
Application Load Balancer (ALB) - Layer 7:
- Use for: HTTP/HTTPS applications, microservices, WebSocket
- Features: Path/host/header routing, AWS WAF integration, Lambda targets
- Choose when: Content-based routing needed
Network Load Balancer (NLB) - Layer 4:
- Use for: Ultra-low latency (<1ms), TCP/UDP, static IPs, millions RPS
- Features: Preserves source IP, TLS termination
- Choose when: Non-HTTP protocols, performance critical
Global Accelerator - Layer 4 Global:
- Use for: Multi-region applications, global users, DDoS protection
- Features: Anycast IPs, automatic regional failover
GCP Load Balancing
Application LB (L7): Global HTTPS LB, Cloud CDN integration, Cloud Armor (WAF/DDoS)
Network LB (L4): Regional TCP/UDP, pass-through balancing, session affinity
Cloud Load Balancing: Single anycast IP, global distribution, backend buckets
Azure Load Balancing
Application Gateway (L7): WAF integration, URL-based routing, SSL termination, autoscaling
Load Balancer (L4): Basic and Standard SKUs, health probes, HA ports
Traffic Manager (Global): DNS-based routing (priority, weighted, performance, geographic)
For complete cloud provider configurations and Terraform examples, see references/cloud-load-balancers.md.
Self-Managed Load Balancers
NGINX
Best for: General-purpose HTTP/HTTPS load balancing, web application stacks
Capabilities:
- HTTP reverse proxy with multiple algorithms
- TCP/UDP stream load balancing
- SSL/TLS termination
- Passive health checks (open source), active health checks (NGINX Plus)
- Cookie-based sticky sessions (NGINX Plus)
Basic configuration:
```nginx
upstream backend {
    least_conn;
    server backend1.example.com:8080 weight=3;
    server backend2.example.com:8080 weight=2;
    keepalive 32;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```
For complete NGINX patterns and advanced configurations, see references/nginx-patterns.md.
HAProxy
Best for: Maximum performance, database load balancing, resource efficiency
Capabilities:
- Highest raw throughput, lowest memory footprint
- 10+ load balancing algorithms
- Sophisticated health checks (HTTP, TCP, Redis, MySQL, etc.)
- Cookie or IP-based persistence
Basic configuration:
```haproxy
frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check
```
For complete HAProxy patterns, see references/haproxy-patterns.md.
Envoy
Best for: Microservices, Kubernetes, service mesh integration
Capabilities:
- Cloud-native design with dynamic configuration (xDS APIs)
- Circuit breakers, retries, timeouts
- Advanced health checks (TCP, HTTP, gRPC)
- Excellent observability
For complete Envoy patterns, see references/envoy-patterns.md.
Traefik
Best for: Docker/Kubernetes environments, dynamic configuration, ease of use
Capabilities:
- Automatic service discovery
- Native Kubernetes integration
- Built-in Let's Encrypt support
- Middleware system (auth, rate limiting)
For complete Traefik patterns, see references/traefik-patterns.md.
Kubernetes Ingress Controllers
Selection Guide
| Controller | Best For | Strengths |
|---|---|---|
| NGINX Ingress (F5) | General purpose | Stability, wide adoption, mature features |
| Traefik | Dynamic environments | Easy configuration, service discovery |
| HAProxy Ingress | High performance | Advanced L7 routing, reliability |
| Envoy (Contour/Gateway) | Service mesh | Rich L7 features, extensibility |
| Kong | API-heavy apps | JWT auth, rate limiting, plugins |
| Cloud Provider | Single-cloud | Native cloud integration |
Basic Ingress Example
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/affinity: "cookie"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
```
For complete Kubernetes ingress examples and Gateway API patterns, see references/kubernetes-ingress.md.
Session Persistence
Sticky Sessions (Use Sparingly)
Cookie-Based: Load balancer sets cookie to track server affinity
- Accurate routing, works with NAT/proxies
- HTTP only, adds cookie overhead
IP Hash: Hash client IP to select backend server
- No cookie required, works for non-HTTP
- Poor distribution with NAT/proxies
Drawbacks: Uneven load distribution, session lost on server failure, complicates scaling
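An IP-hash selector is easy to sketch, and the sketch also shows why NAT hurts distribution (backend names are illustrative):

```python
import hashlib

def pick_backend(client_ip, backends):
    """IP hash: the same client IP always maps to the same backend
    (until the backend list itself changes)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]

backends = ["app1", "app2", "app3"]
# Stable for a given client across requests:
assert pick_backend("203.0.113.7", backends) == pick_backend("203.0.113.7", backends)
# Every client behind one NAT shares a source IP, so they all land on
# a single backend -- the "poor distribution with NAT" drawback above.
```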
Shared Session Store (Recommended)
Architecture: Stateless application servers + centralized session storage (Redis, Memcached)
Benefits:
- No sticky sessions needed
- True load balancing
- Server failures don't lose sessions
- Horizontal scaling trivial
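A minimal sketch of the pattern, with an in-process dict standing in for Redis or Memcached:

```python
import uuid

class SessionStore:
    """Shared-session sketch: a dict stands in for the external store.
    In production, every stateless app server talks to the same
    Redis/Memcached instance instead."""
    def __init__(self):
        self._sessions = {}

    def create(self, data):
        sid = uuid.uuid4().hex
        self._sessions[sid] = data
        return sid

    def get(self, sid):
        return self._sessions.get(sid)

store = SessionStore()                  # one store shared by all servers
sid = store.create({"user": "alice"})
# A request load-balanced to any other server still finds the session:
assert store.get(sid) == {"user": "alice"}
```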
Client-Side Tokens (Best for APIs)
JWT (JSON Web Tokens): Server generates signed token, client stores and sends with requests
Benefits:
- Fully stateless servers
- Perfect load balancing
- No session storage needed
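A stripped-down illustration of the idea using stdlib HMAC signing; `SECRET` is a placeholder, and a real JWT library (e.g. PyJWT) should be used in production:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # illustrative; load from secure configuration

def sign(claims):
    """Sign a JSON payload so any server holding SECRET can verify it
    without shared session state."""
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify(token):
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered, or signed with a different key
    return json.loads(base64.urlsafe_b64decode(payload))

token = sign({"user": "alice"})
assert verify(token) == {"user": "alice"}   # any server can verify
tampered = token[:-1] + ("0" if token[-1] != "0" else "1")
assert verify(tampered) is None
```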
For complete session management patterns and code examples, see references/session-persistence.md.
Global Load Balancing
GeoDNS Routing
Route users to nearest server based on geographic location:
- DNS returns different IPs based on client location
- Reduces latency, supports compliance and regional content
- Implementation: AWS Route 53, GCP Cloud DNS, Azure Traffic Manager
Multi-Region Failover
Primary/secondary region configuration:
- Health checks determine primary region health
- Automatic DNS failover to secondary
- Transparent to clients
CDN Integration
Combine load balancing with CDN:
- GeoDNS routes to closest CDN PoP
- CDN caches content globally
- Origin load balancing for cache misses
For complete global load balancing examples with Terraform, see references/global-load-balancing.md.
Decision Frameworks
L4 vs L7 Selection
Choose L4 when:
- Protocol is TCP/UDP (not HTTP)
- Ultra-low latency critical (<1ms)
- High throughput required (millions RPS)
- Client source IP preservation needed
Choose L7 when:
- Protocol is HTTP/HTTPS
- Content-based routing needed (URL, headers)
- SSL termination required
- WAF integration needed
- Microservices architecture
Cloud vs Self-Managed
Choose Cloud-Managed when:
- Single cloud deployment
- Auto-scaling required
- Team lacks load balancer expertise
- Managed service preferred
Choose Self-Managed when:
- Multi-cloud or hybrid deployment
- Advanced routing requirements
- Cost optimization important
- Full control needed
- Vendor lock-in avoidance
Self-Managed Selection
- NGINX: General-purpose, web stacks, HTTP/3 support
- HAProxy: Maximum performance, database LB, lowest resource usage
- Envoy: Microservices, service mesh, dynamic configuration
- Traefik: Docker/Kubernetes, automatic discovery, easy configuration
Configuration Examples
Complete working examples available in examples/ directory:
Cloud Providers:
- `examples/aws/alb-terraform.tf` - AWS ALB with path-based routing
- `examples/aws/nlb-terraform.tf` - AWS NLB for TCP load balancing
Self-Managed:
- `examples/nginx/http-load-balancing.conf` - NGINX HTTP reverse proxy
- `examples/haproxy/http-lb.cfg` - HAProxy configuration
- `examples/envoy/basic-lb.yaml` - Envoy cluster configuration
- `examples/traefik/kubernetes-ingress.yaml` - Traefik IngressRoute
Kubernetes:
- `examples/kubernetes/nginx-ingress.yaml` - NGINX Ingress with TLS
- `examples/kubernetes/traefik-ingress.yaml` - Traefik IngressRoute
- `examples/kubernetes/gateway-api.yaml` - Gateway API configuration
Monitoring and Observability
Key Metrics
- Throughput: Requests per second, bytes transferred, connection rate
- Latency: Request duration (p50, p95, p99), backend response time, SSL handshake time
- Errors: HTTP error rates (4xx, 5xx), backend connection failures, health check failures
- Resource Utilization: CPU, memory, active connections, connection queue depth
- Health: Healthy/unhealthy backend count, health check success rate
Load Balancer Logs
Enable access logs for request/response details, client IPs, response times, error tracking
- AWS ALB: Store in S3, analyze with Athena
- NGINX: Custom log format, ship to centralized logging
- HAProxy: Syslog integration, structured logging
Troubleshooting
Uneven Load Distribution
Symptoms: One server receives disproportionate traffic
Causes: Sticky sessions with few clients, IP hash with NAT concentration, long-lived connections
Solutions: Switch to least connections, disable sticky sessions, implement connection draining
Health Check Flapping
Symptoms: Servers rapidly transition between healthy/unhealthy
Causes: Health check timeout too short, threshold too low, network instability
Solutions: Increase interval and timeout, implement hysteresis, use deep health checks
Session Loss After Failover
Symptoms: Users logged out when server fails
Causes: Sticky sessions without replication, in-memory sessions
Solutions: Implement shared session store (Redis), use client-side tokens (JWT)
Integration Points
Related Skills:
- `infrastructure-as-code` - Deploy load balancers via Terraform/Pulumi
- `kubernetes-operations` - Ingress controllers for K8s traffic management
- `network-architecture` - Network design and topology for load balancing
- `deploying-applications` - Blue-green and canary deployments via load balancers
- `observability` - Load balancer metrics, access logs, distributed tracing
- `security-hardening` - WAF integration, rate limiting, DDoS protection
- `service-mesh` - Envoy as both ingress and service mesh proxy
- `implementing-tls` - TLS termination and certificate management
Quick Reference
Selection Matrix
| Use Case | Recommended Solution |
|---|---|
| HTTP web app (AWS) | ALB |
| Non-HTTP protocol (AWS) | NLB |
| Kubernetes HTTP ingress | NGINX Ingress or Traefik |
| Maximum performance | HAProxy |
| Service mesh | Envoy |
| Docker Swarm | Traefik |
| Multi-cloud portable | NGINX or HAProxy |
| Global distribution | Cloudflare, AWS Global Accelerator |
Algorithm Selection
| Traffic Pattern | Algorithm |
|---|---|
| Stateless, similar servers | Round Robin |
| Stateless, different capacity | Weighted Round Robin |
| Long-lived connections | Least Connections |
| Performance-sensitive | Least Response Time |
| Session persistence needed | IP Hash or Cookie |
| Varying server load | Resource-Based |
Health Check Configuration
| Service Type | Check Type | Interval | Timeout |
|---|---|---|---|
| Web app | HTTP /health | 10s | 3s |
| API | HTTP /health/ready | 10s | 5s |
| Database | TCP connect | 5s | 2s |
| Critical service | HTTP deep check | 5s | 3s |
| Background worker | HTTP /live | 30s | 5s |
Summary
Load balancing is essential for distributing traffic, ensuring high availability, and enabling horizontal scaling. Choose L4 for raw performance and non-HTTP protocols, L7 for intelligent content-based routing. Prefer cloud-managed load balancers for simplicity and auto-scaling, self-managed for multi-cloud portability and advanced features. Implement proper health checks with hysteresis, avoid sticky sessions when possible, and monitor key metrics continuously.
For deployment patterns, see examples in examples/aws/, examples/nginx/, examples/kubernetes/, and other provider directories.