
How we built a 99.9% uptime AI chat infrastructure

A deep-dive into our event-driven architecture on the edge.

Sam Klein · March 28, 2026 · 9 min read

When we started FunnelConvo, our infrastructure was a single Postgres instance, a Node.js monolith, and a lot of caffeine. Five years later, we serve 10 million conversations a month from 14 edge regions with 99.97% uptime. Here's how we got there.

Why edge

Latency kills conversation flow. If a bot takes more than 800ms to reply, visitors notice — and they bounce. We made a non-negotiable rule: every bot reply must start within 200ms, anywhere in the world. That's only possible at the edge.

We rebuilt our chat handler as a small TypeScript runtime that runs on Cloudflare Workers. Conversation state lives in Durable Objects, and AI inference fans out to GPU clusters in Oregon, Frankfurt, and Singapore based on visitor geolocation.
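To make the geolocation routing concrete, here's a minimal sketch of the region-picking step. The function name and the continent-to-region table are illustrative, not our production code; it assumes the continent code Cloudflare exposes on incoming requests (`request.cf.continent`).

```typescript
// Hypothetical region picker: maps a visitor's two-letter continent code
// (as Cloudflare exposes it on request.cf.continent) to the nearest of
// our three GPU regions.
type GpuRegion = "oregon" | "frankfurt" | "singapore";

const REGION_BY_CONTINENT: Record<string, GpuRegion> = {
  NA: "oregon",
  SA: "oregon",
  EU: "frankfurt",
  AF: "frankfurt",
  AS: "singapore",
  OC: "singapore",
};

function pickGpuRegion(continent: string | undefined): GpuRegion {
  // Fall back to a default region when geolocation is unavailable.
  return REGION_BY_CONTINENT[continent ?? ""] ?? "oregon";
}
```

A pure lookup like this runs in microseconds at the edge, so the 200ms reply budget is spent on inference and network, not on routing.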

Eventing, not polling

Our old inbox polled the database every 2 seconds for new messages. At 10M conversations a month, that's 5 billion queries per day. We replaced it with a tiny event bus built on NATS — one publish, many subscribers, zero polling.
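The core pattern is simple enough to sketch in a few lines. This is an in-process illustration of the publish/subscribe shape our NATS bus gives us, not the NATS client itself; the `EventBus` class and subject names are hypothetical stand-ins.

```typescript
// Minimal in-process sketch of the pub/sub pattern: one publish fans
// out to every subscriber on a subject, and nobody ever polls.
type Handler<T> = (msg: T) => void;

class EventBus<T> {
  private subscribers = new Map<string, Set<Handler<T>>>();

  // Register a handler for a subject; returns an unsubscribe function.
  subscribe(subject: string, handler: Handler<T>): () => void {
    const set = this.subscribers.get(subject) ?? new Set<Handler<T>>();
    set.add(handler);
    this.subscribers.set(subject, set);
    return () => set.delete(handler);
  }

  // Deliver one message to all current subscribers; returns the count.
  publish(subject: string, msg: T): number {
    const set = this.subscribers.get(subject);
    set?.forEach((handler) => handler(msg));
    return set?.size ?? 0;
  }
}
```

In production the subjects are per-conversation (something like `conversation.<id>.message`), so an inbox tab subscribes once and the database is only ever written to, never polled.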

Failure isolation

Every customer's chatbots run in their own Durable Object instance. If one instance crashes, only their bots are affected. The blast radius of any single failure is one tenant — not the platform.
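The isolation guarantee comes down to a stable tenant-to-instance mapping: the same customer id always resolves to the same instance, and no two customers share one. In Cloudflare's API that mapping is `idFromName` on a Durable Object namespace; the registry below is a hypothetical in-memory stand-in to show the shape.

```typescript
// Sketch of per-tenant isolation: each tenant id maps to exactly one
// stateful instance, so a crash in one instance touches one tenant.
// (In production this role is played by a Durable Object namespace,
// e.g. env.TENANT_BOTS.idFromName(tenantId); this registry is a
// simplified illustration.)
class TenantBotRuntime {
  constructor(public readonly tenantId: string) {}
}

class TenantRegistry {
  private instances = new Map<string, TenantBotRuntime>();

  // The same tenant id always resolves to the same instance.
  get(tenantId: string): TenantBotRuntime {
    let instance = this.instances.get(tenantId);
    if (!instance) {
      instance = new TenantBotRuntime(tenantId);
      this.instances.set(tenantId, instance);
    }
    return instance;
  }
}
```

Because the mapping is deterministic, a restarted instance picks up the same tenant id and state, and there is no shared process for a misbehaving bot to take down.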

If you want to see what this looks like in practice, check out our changelog — we ship updates to this infrastructure almost every week.

What's next

We're working on multi-region replication for Durable Objects so a single visitor can be served by the nearest healthy region even during a partial outage. Want to help build it? We're hiring platform engineers.
