Circuit Breaker Pattern Explained: Resilience in Distributed Systems
Think of the circuit breaker in your home's electrical panel: when too much current flows, the breaker trips and cuts the circuit before the wiring overheats. The Circuit Breaker pattern plays the same role in software.
The Problem: Cascading Failures
In a microservices architecture, services depend on each other. Service A calls Service B, which calls Service C. What happens if Service C crashes?
    User → Service A → Service B → Service C (DOWN!)
               ↓            ↓
          waiting...  waiting...
               ↓            ↓
           timeout!    timeout!
               ↓
     ENTIRE SYSTEM CRASHES
This nightmare scenario is called a cascading failure. It is described in detail in Michael Nygard's seminal book "Release It!".
The Solution: Circuit Breaker
The Circuit Breaker automatically cuts off calls to a failing service, protecting the rest of the system.
Three States
- CLOSED — Normal operation. Requests flow to the service.
- OPEN — Circuit is tripped. Requests immediately return an error without reaching the service.
- HALF-OPEN — Test mode. A few requests are allowed through. If successful, returns to CLOSED.
             success
    CLOSED ────────→ CLOSED
      │                 ↑
      │ failure         │ test succeeds
      │ threshold       │
      ↓                 │
     OPEN ──(timer)──→ HALF-OPEN
      ↑                 │
      │   test fails    │
      └─────────────────┘
How It Works
1. Failure Counter
The Circuit Breaker counts failed calls. When a threshold is reached (e.g., a 50% error rate or 5 consecutive failures), the circuit opens.
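An error-rate threshold needs a window of recent outcomes rather than a simple counter. The sketch below shows one possible way to track that; the class name `ErrorRateWindow` and its parameters `windowSize` and `maxErrorRate` are illustrative assumptions, not part of any particular library.

```javascript
// Hypothetical helper: remembers the outcome of the last `windowSize` calls
// and reports whether the error rate has reached `maxErrorRate`.
class ErrorRateWindow {
  constructor(windowSize = 10, maxErrorRate = 0.5) {
    this.windowSize = windowSize;
    this.maxErrorRate = maxErrorRate;
    this.outcomes = []; // true = failed call, false = successful call
  }

  record(failed) {
    this.outcomes.push(Boolean(failed));
    // Drop the oldest outcome once the window is full.
    if (this.outcomes.length > this.windowSize) {
      this.outcomes.shift();
    }
  }

  errorRate() {
    if (this.outcomes.length === 0) return 0;
    const failures = this.outcomes.filter(Boolean).length;
    return failures / this.outcomes.length;
  }

  shouldTrip() {
    // Require a full window so a single early failure cannot open the circuit.
    return (
      this.outcomes.length >= this.windowSize &&
      this.errorRate() >= this.maxErrorRate
    );
  }
}
```

A breaker could call `record(true)` on each failure, `record(false)` on each success, and open the circuit whenever `shouldTrip()` returns true. Waiting for a full window is a design choice made here to avoid tripping on very small samples.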
2. Timeout Window
After the circuit opens, all requests are blocked for a specified duration (e.g., 30 seconds). This gives the failing service time to recover.
3. Health Check
After the timeout expires, the circuit enters HALF-OPEN state. A few test requests are sent. If they succeed, the circuit closes; if they fail, it opens again.
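To make "a few test requests" concrete, here is a minimal sketch of a gate that limits how many trial requests run in HALF-OPEN and decides the next state from their results. The name `HalfOpenGate` and the `maxProbes` parameter are assumptions made for illustration, and the rule that every probe must succeed before closing is only one possible policy.

```javascript
// Hypothetical gate for the HALF-OPEN state: admits at most `maxProbes`
// trial requests and derives the next circuit state from their results.
class HalfOpenGate {
  constructor(maxProbes = 3) {
    this.maxProbes = maxProbes;
    this.started = 0;   // trial requests admitted so far
    this.succeeded = 0; // trial requests that completed successfully
  }

  // Returns true if the caller may send a trial request.
  tryAcquire() {
    if (this.started >= this.maxProbes) return false;
    this.started++;
    return true;
  }

  // Records a trial result and returns the state the circuit should move to.
  report(success) {
    if (!success) return 'OPEN'; // any failed probe re-opens the circuit
    this.succeeded++;
    return this.succeeded >= this.maxProbes ? 'CLOSED' : 'HALF_OPEN';
  }
}
```

Requests that are refused by `tryAcquire()` would be rejected immediately, just as they are while the circuit is OPEN, so a recovering service sees only a trickle of traffic.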
Implementation Example
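    class CircuitBreaker {
      // threshold: consecutive failures before opening; timeout: ms to stay open
      constructor(threshold = 5, timeout = 30000) {
        this.failureCount = 0;
        this.threshold = threshold;
        this.timeout = timeout;
        this.state = 'CLOSED';
        this.lastFailure = null;
      }

      async call(serviceCall) {
        if (this.state === 'OPEN') {
          if (Date.now() - this.lastFailure > this.timeout) {
            // Cool-down elapsed: let this request through as a trial.
            this.state = 'HALF_OPEN';
          } else {
            // Fail fast without touching the downstream service.
            throw new Error('Circuit is OPEN');
          }
        }
        try {
          const result = await serviceCall();
          this.onSuccess();
          return result;
        } catch (error) {
          this.onFailure();
          throw error;
        }
      }

      onSuccess() {
        // Any success resets the counter and closes the circuit.
        this.failureCount = 0;
        this.state = 'CLOSED';
      }

      onFailure() {
        this.failureCount++;
        this.lastFailure = Date.now();
        // A failed trial in HALF_OPEN, or reaching the threshold, opens the circuit.
        if (this.state === 'HALF_OPEN' || this.failureCount >= this.threshold) {
          this.state = 'OPEN';
        }
      }
    }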
Fallback Strategies
What to do when the circuit is open matters too:
- Return default values — Serve cached or stale data
- Alternative service — Route to a backup service
- Graceful degradation — Temporarily disable the feature
- Queue for later — Add the request to a retry queue
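As an illustration of the first strategy, the sketch below wraps a call so that failures, including the "circuit is open" error, are answered with the last good value or a default. The function name `withCachedFallback` and its parameters are hypothetical, and a real system would likely also bound how stale the cached value may get.

```javascript
// Hypothetical wrapper: try the primary call; on any error, serve the last
// successful result, or `defaultValue` if there has been no success yet.
function withCachedFallback(primaryCall, defaultValue) {
  let lastGood = defaultValue;
  return async function (...args) {
    try {
      lastGood = await primaryCall(...args);
      return lastGood;
    } catch (error) {
      // Stale or default data instead of propagating the error.
      return lastGood;
    }
  };
}
```

For example, `withCachedFallback(() => breaker.call(fetchPrice), null)` would combine it with the `CircuitBreaker` class from the implementation example; `breaker` and `fetchPrice` are made-up names for an instance of that class and a service call.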
Real-World Usage
Netflix Hystrix
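Netflix popularized the Circuit Breaker in production. Their Hystrix library wraps calls to remote dependencies in a circuit breaker so that one failing dependency cannot take down the whole system. Hystrix has been in maintenance mode since 2018, and its README points new projects to Resilience4j.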
BilgeOne Experience
At BilgeOne, we use Circuit Breakers when integrating with external payment and SMS services. When the payment service goes down, only the payment feature is disabled — the rest of the platform continues working.
Related Patterns
| Pattern | Purpose |
|---------|---------|
| Circuit Breaker | Cut off failing service calls |
| Retry Pattern | Retry on transient failures |
| Bulkhead Pattern | Isolate resources |
| Timeout Pattern | Set maximum wait time |
These patterns are usually combined, and together they form the core toolkit of resilience engineering.
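The Timeout and Retry patterns are small enough to sketch in a few lines. The helper names `withTimeout` and `retry`, and the fixed delay between attempts, are assumptions made for illustration; production code often uses exponential backoff with jitter instead.

```javascript
// Timeout pattern: reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Retry pattern: call `fn` up to `attempts` times, waiting `delayMs` between tries.
async function retry(fn, attempts = 3, delayMs = 100) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Note that `withTimeout` only stops waiting; it does not cancel the underlying operation. When retries sit inside a circuit breaker, each failed attempt can count toward the failure threshold, so the retry count is best kept small.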
Conclusion
The Circuit Breaker is a life-saving pattern in distributed systems. Despite being a simple concept, proper implementation dramatically improves system reliability.
Explore this and other resilience patterns in the Software Architecture 3.0 book. Practice interactively on LabLudus in the DevOps career path.