Circuit breakers, bulkheads, and resilience patterns for AI workloads
March 10, 2026
AI providers go down. Models get overloaded. Rate limits hit. If your application calls providers directly, every outage becomes your outage.
ModelRoute builds resilience into the infrastructure layer so your application doesn't have to.
Circuit breakers
A circuit breaker tracks failures for a provider. When failures exceed a configurable threshold, the circuit "opens" and stops sending requests to that provider. After a cooldown period, the circuit enters the "half-open" state and lets test requests through. Once enough of them succeed, the circuit closes and normal traffic resumes; if a test request fails, the circuit opens again for another cooldown.
Our circuit breakers are Redis-backed with Lua scripts for atomic state transitions. They fail open — if Redis is unavailable, requests proceed normally rather than blocking.
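To make "fail open" concrete, here is a minimal sketch of the decision, not our production code. The `read_state` callable is a hypothetical stand-in for the Redis lookup, and the built-in `ConnectionError` and `TimeoutError` stand in for whatever errors a real Redis client raises.

```python
from typing import Callable


def provider_allowed(read_state: Callable[[], str]) -> bool:
    """Decide whether to send traffic, failing open if the state store is down."""
    try:
        # Only an explicitly open circuit blocks traffic.
        return read_state() != "open"
    except (ConnectionError, TimeoutError):
        # State store unreachable: let the request through rather than block it.
        return True
```

The trade-off is deliberate: an outage of the state store costs some protection, but it never turns into an outage for callers.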
Configuration per provider:
- Failure threshold (number of failures to trip)
- Success threshold (number of successes to close)
- Open duration (cooldown before half-open)
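The state machine and its three settings can be sketched in a few lines. This is an in-memory illustration only: the Redis storage and Lua scripts are left out, the class and method names are hypothetical, the default values are placeholders, and counting consecutive failures is an assumption about how the threshold is applied.

```python
import time
from typing import Callable


class CircuitBreaker:
    """In-memory circuit breaker: closed -> open -> half_open -> closed."""

    def __init__(self, failure_threshold: int = 5, success_threshold: int = 2,
                 open_duration: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold  # failures needed to trip
        self.success_threshold = success_threshold  # successes needed to close
        self.open_duration = open_duration          # cooldown in seconds
        self._clock = clock
        self.state = "closed"
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        """Return True if a request may be sent to the provider."""
        if self.state == "open":
            if self._clock() - self._opened_at < self.open_duration:
                return False
            # Cooldown elapsed: move to half-open and let test traffic through.
            self.state = "half_open"
            self._successes = 0
        return True

    def record_success(self) -> None:
        if self.state == "half_open":
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.state = "closed"
                self._failures = 0
        elif self.state == "closed":
            # Assumption: the threshold counts consecutive failures.
            self._failures = 0

    def record_failure(self) -> None:
        if self.state == "half_open":
            # Any failure while testing sends the circuit back to open.
            self._trip()
        elif self.state == "closed":
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._trip()

    def _trip(self) -> None:
        self.state = "open"
        self._opened_at = self._clock()
        self._failures = 0
        self._successes = 0
```

A caller checks `allow_request()` before each call and reports the outcome with `record_success()` or `record_failure()`. This sketch lets every request through while half-open; a stricter breaker would cap the amount of test traffic.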
Per-provider bulkheads
A bulkhead limits the number of concurrent requests to a specific provider. This prevents a slow or failing provider from consuming all your execution capacity.
Each provider has a configurable capacity (weighted semaphore). When the capacity is full, new requests for that provider are queued or rejected — but other providers continue operating normally.
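A weighted bulkhead is small enough to sketch directly. The version below is an illustration under assumptions: it rejects when full rather than queueing, and the `Bulkhead` and `BulkheadFull` names are hypothetical.

```python
import threading
from contextlib import contextmanager


class BulkheadFull(Exception):
    """Raised when a provider's bulkhead has no spare capacity."""


class Bulkhead:
    """Weighted semaphore that rejects instead of blocking when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._in_use = 0
        self._lock = threading.Lock()

    @contextmanager
    def acquire(self, weight: int = 1):
        """Reserve `weight` units for the duration of the with-block."""
        with self._lock:
            if self._in_use + weight > self.capacity:
                free = self.capacity - self._in_use
                raise BulkheadFull(f"need {weight}, only {free} free")
            self._in_use += weight
        try:
            yield
        finally:
            # Always release, even if the provider call raised.
            with self._lock:
                self._in_use -= weight
```

Keeping one `Bulkhead` per provider is what gives the isolation: a provider that fills its own capacity raises `BulkheadFull` for its own requests only, and every other provider's counter is untouched.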
Webhook retry with backoff
When we deliver results via webhook, failed deliveries are retried on a fixed schedule of increasing delays:
- Attempt 1: Immediate
- Attempt 2: 1 minute
- Attempt 3: 5 minutes
- Attempt 4: 15 minutes
- Attempt 5: 30 minutes
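The schedule above maps onto a simple retry loop. In this sketch the delays are read as the wait before each attempt, measured from the previous one, which is an assumption; `deliver_with_retries` and the `send` callable are hypothetical names.

```python
import time
from typing import Callable, Sequence

# Seconds to wait before attempts 1 to 5, taken from the schedule above.
RETRY_DELAYS = (0, 60, 300, 900, 1800)


def deliver_with_retries(send: Callable[[], bool],
                         delays: Sequence[float] = RETRY_DELAYS,
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it succeeds; return the attempt number, or 0 if all fail."""
    for attempt, delay in enumerate(delays, start=1):
        if delay:
            sleep(delay)
        if send():
            return attempt
    return 0
```

Passing `sleep` as a parameter keeps the loop testable without waiting through the full schedule.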
All webhooks are HMAC-SHA256 signed so you can verify authenticity.
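On the receiving side, verification might look like the sketch below. It assumes the signature is a hex digest computed over the raw request body with a shared secret; how the signature is transmitted is not covered here, and the function names are illustrative.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, because changing even one byte of whitespace changes the digest.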
Config-driven resilience
All resilience parameters — circuit breaker thresholds, bulkhead capacities, retry policies — are configured via YAML files stored in GCS. Changes are picked up via ETag-based polling (60-second interval) without requiring a redeploy.
This means we can tune resilience parameters for individual providers in production without downtime.
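ETag-based polling can be sketched as follows. This is a simplified illustration: `fetch` is a hypothetical callable standing in for a conditional read from object storage, `parse` stands in for a YAML parser, and keeping the last good config when parsing fails is a design choice assumed for this example.

```python
import threading
from typing import Any, Callable, Optional, Tuple

# fetch(known_etag) returns (etag, body), or None if the object is unchanged.
Fetch = Callable[[Optional[str]], Optional[Tuple[str, str]]]


class ConfigWatcher:
    """Reload config only when the stored object's ETag changes."""

    def __init__(self, fetch: Fetch, parse: Callable[[str], Any]) -> None:
        self._fetch = fetch
        self._parse = parse
        self.etag: Optional[str] = None
        self.config: Any = None

    def poll_once(self) -> bool:
        """Return True if a new config was loaded."""
        result = self._fetch(self.etag)
        if result is None:
            return False  # unchanged since the last poll
        new_etag, body = result
        try:
            parsed = self._parse(body)
        except ValueError:
            return False  # bad file: keep serving the last good config
        self.etag, self.config = new_etag, parsed
        return True

    def run(self, stop: threading.Event, interval: float = 60.0) -> None:
        """Poll immediately, then every `interval` seconds until `stop` is set."""
        self.poll_once()
        while not stop.wait(interval):
            self.poll_once()
```

Because an unchanged ETag short-circuits before any parsing, a poll that finds nothing new costs one lightweight request.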
The result
Your application makes a single POST request. Behind the scenes, ModelRoute handles provider selection, circuit breaking, bulkhead isolation, retries, and webhook delivery. If a provider fails, your app never knows — the request is routed elsewhere automatically.
Infrastructure problems are our job. Your job is building the product.