
Traefik's Missing Redispatch Broke Our Zero-Downtime Deploys

statusdude.com · @fast_fox · 1 hour ago · Systems Engineering · 2 comments

StatusDude's engineers ditched Traefik after hitting routing race conditions and the inability to retry on a different backend. HAProxy's 'option redispatch' fixed it cleanly.

Tags: statusdude, docker compose, haproxy, traefik, load balancing, zero downtime deployment

Traefik has a known issue from 2018 that makes zero-downtime rolling deploys impossible: its retry middleware only retries on the same dying backend. StatusDude, a company serving thousands of monitoring checks per minute across multi-region workers, found that out the hard way during their first deploy with Traefik.

Three Ways Traefik Fails During a Rolling Deploy

StatusDude's first approach used two Docker Compose services (backend_old and backend_new) with identical Traefik routing labels. Traefik's Docker provider threw "Service defined multiple times" and returned 404s on every request. No merge, no fallback.

They reworked the setup to use docker compose up --scale backend=4, scaling up so that old and new replicas ran side by side, then scaling down to just the new ones. That's when the second failure hit: Traefik's routing table didn't update fast enough when containers stopped, producing 502s on every other request as traffic hit containers that were already shutting down.

Then the killer: a request reaches a dying container and the connection drops. Traefik's retry middleware retries on the same backend, the one that just failed, and makes no attempt to dispatch to a healthy replica. This is Traefik issue #2723 (https://github.com/traefik/traefik/issues/2723), open for years.

HAProxy's option redispatch - The Simple Fix

StatusDude ripped out Traefik and dropped in HAProxy. The key config is three directives:

```
retries 3
option redispatch
retry-on conn-failure empty-response response-timeout 502 503 504
```

option redispatch tells HAProxy to try a different backend server when a retryable error occurs. The retry-on list covers exactly the failure modes of a dying container. Combined with server-template and Docker DNS resolution (re-resolves every 2 seconds), HAProxy reliably skips containers that are shutting down.
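
For illustration, here is how those three directives might sit in a complete backend alongside server-template and Docker DNS resolution. This is a minimal sketch, not StatusDude's actual config: the resolvers section name (docker), the backend name (app), the Compose service name and port (backend:8000), and the slot count of 4 are all assumptions. 127.0.0.11 is the address of Docker's embedded DNS server on user-defined networks, and hold valid 2s is one plausible way to get the roughly 2-second re-resolution mentioned above.

```
# Sketch only: section names, service name, port, and slot count are assumptions.
resolvers docker
    nameserver dns1 127.0.0.11:53   # Docker's embedded DNS server
    hold valid 2s                   # how long a valid answer is kept (assumed to match the 2 s figure)

backend app
    mode http                       # needed for retry-on to match HTTP status codes
    retries 3
    option redispatch
    retry-on conn-failure empty-response response-timeout 502 503 504
    # Pre-allocate 4 server slots and fill them from DNS answers for "backend".
    # init-addr none lets HAProxy start even if no replica resolves yet.
    server-template backend 4 backend:8000 check resolvers docker init-addr none
```

One detail worth checking against the HAProxy manual for your version: a plain option redispatch redispatches on the last retry by default, while an optional interval argument (for example option redispatch 1) asks for a different server on every retry.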

What Zero-Downtime Actually Requires

Three things, no Kubernetes required: multiple replicas via Docker Compose deploy.replicas, a load balancer that retries on a different backend, and a rolling update script. StatusDude's Docker Compose file defines two backend replicas with health checks. HAProxy health-checks each replica every second, marks it down on error, and option redispatch ensures in-flight failures get a fresh try on a healthy server.

StatusDude deploys multiple times a day with zero dropped requests. No etcd, no pod specs, no orchestration overhead - just a clean HAProxy config and Docker Compose replicas.
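
The one-second health checking described above could be expressed with default-server settings along these lines. Again, this is a sketch under assumptions rather than StatusDude's config: the /health path, the backend name, the service name and port, and the fall/rise thresholds are hypothetical, and a resolvers section named docker is assumed to be defined elsewhere in the file.

```
# Sketch only: health path, names, port, and thresholds are assumptions.
backend app
    mode http
    option httpchk GET /health
    http-check expect status 200
    # Check every second; one failed check marks the server down,
    # two consecutive passes bring it back up.
    default-server check inter 1s fall 1 rise 2
    option redispatch
    # Two slots to match the two replicas; more are needed if old and new
    # replicas overlap during a rolling update.
    server-template backend 2 backend:8000 resolvers docker init-addr none
```

Health checks and redispatch cover different windows: the check removes a stopped replica from rotation within about a second, while redispatch handles requests that reach it before the check notices.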


Source: Zero-Downtime Deployments with Docker Compose - No Kubernetes Required
Domain: statusdude.com
