Real-time circuit breaker for AI sessions

Kill the one rogue LLM session. Not your whole app.

Turnstile is a transparent proxy that meters every AI session in real time and trips a circuit breaker the instant one session goes rogue — a runaway agent loop or a blown budget — without taking down the rest of your app.

Drop-in. Swap one base URL: no SDK, no other code changes. Your provider key passes through and is never stored.

Live overview
Dollars prevented: $13.60
Saved by killing runaway loops and budget overruns.

sess:research-loop-3, claude-3-5-sonnet, $6.755 blocked
sess:checkout-agent-1, gpt-4o, $12.400 blocked
sess:batch-summarizer, gpt-4o-mini, $3.201 blocked

Works with any OpenAI-compatible endpoint — direct provider adapters too

OpenAI, Anthropic, Google Gemini, OpenRouter

The problem

Teams running AI agents in production face two problems with no good answer.

Costs spike with no warning

A prompt-injection loop, a retry storm, or a single greedy agent can burn through your month's budget in an afternoon. By the time the invoice lands, the money is gone.

No way to stop just one

Your only kill switch is the global API key. Revoke it and every session dies — every user, every agent, the whole app. There's no scalpel, only the plug.

One number that matters

Dollars prevented, not marketing math.

Turnstile surfaces a single hero metric: the estimated spend it blocked when a session crossed its budget ceiling or tripped loop detection — priced from that session's own recent cost-per-call. So the number reflects real averted spend, not a made-up multiplier.

  • Pre-execution enforcement — it stops the call before it runs, not after the bill arrives.
  • Priced from the session's actual recent cost-per-call.
  • Rolled up per org, per model, and across your whole account.

Turnstile organization overview: a live dollars-prevented hero with a per-session breakdown of model, cost, and blocks.

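One way to read the pricing rule above, as a minimal sketch: keep a short rolling window of a session's recent call costs and price each blocked call at the window's mean. The class name, the window size, and the use of a plain mean are assumptions for illustration, not a description of Turnstile's actual code.

```python
from collections import deque


class PreventedSpendEstimator:
    """Hypothetical per-session estimator of averted spend."""

    def __init__(self, window=20):
        # Dollar costs of the most recent calls that actually ran.
        self.recent_costs = deque(maxlen=window)
        self.prevented = 0.0

    def record_call(self, cost):
        """Remember the cost of a call that was forwarded."""
        self.recent_costs.append(cost)

    def record_block(self):
        """Price one blocked call at the session's recent mean cost."""
        if not self.recent_costs:
            return 0.0  # no history yet, so claim nothing
        estimate = sum(self.recent_costs) / len(self.recent_costs)
        self.prevented += estimate
        return estimate


est = PreventedSpendEstimator(window=3)
for cost in (0.10, 0.20, 0.30, 0.40):
    est.record_call(cost)

# The window now holds only the last three costs: 0.20, 0.30, 0.40.
est.record_block()
est.record_block()
print(f"prevented: ${est.prevented:.2f}")
```

Because the estimate comes only from calls the session really made, a session with no history contributes nothing to the total.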
How it works

Point your base URL at Turnstile. The rest is automatic.

01

Swap one base URL

Point your LLM client at Turnstile instead of the provider. No SDK, no other code changes. Your provider key rides along and is never stored.

02

Meter every session

The Go data plane resolves each request to a session and meters tokens and cost in real time, with the bookkeeping kept off the hot path.
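Per-session metering can be pictured with a small sketch. The real data plane is written in Go; this Python version only illustrates the idea. The `x-session-id` header, the `example-model` name, and its prices are all hypothetical, since the page does not say how a request is mapped to a session or how models are priced.

```python
from dataclasses import dataclass

# Hypothetical prices in dollars per million tokens: (input, output).
PRICES = {"example-model": (2.50, 10.00)}


@dataclass
class SessionUsage:
    input_tokens: int = 0
    output_tokens: int = 0
    cost: float = 0.0
    calls: int = 0


class Meter:
    def __init__(self):
        self.sessions = {}  # session id -> SessionUsage

    def resolve_session(self, headers):
        # Assumption: the caller tags each request with a session header.
        return headers.get("x-session-id", "default")

    def record(self, session_id, model, input_tokens, output_tokens):
        """Add one call's tokens and cost to the session's running totals."""
        in_price, out_price = PRICES[model]
        usage = self.sessions.setdefault(session_id, SessionUsage())
        usage.input_tokens += input_tokens
        usage.output_tokens += output_tokens
        usage.cost += (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        usage.calls += 1
        return usage


meter = Meter()
sid = meter.resolve_session({"x-session-id": "sess:demo"})
usage = meter.record(sid, "example-model", input_tokens=1000, output_tokens=500)
print(sid, usage.calls, f"${usage.cost:.4f}")
```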

03

Enforce before forwarding

A synchronous gate checks the budget ceiling, loop detection, and manual kill. If all three pass, the request is forwarded to the provider and the response streams back byte-for-byte; if any one trips, the call is blocked before it runs.
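The three checks can be sketched as one function that runs before a request is forwarded. This is an illustration only. The order of the checks, the `SessionState` fields, and the loop heuristic (the same request body seen several times in a short window) are assumptions, because the page does not describe how loop detection works.

```python
import hashlib
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SessionState:
    budget_ceiling: float  # dollars
    spent: float = 0.0
    killed: bool = False  # set by a manual kill
    # Fingerprints of the most recent allowed request bodies.
    recent: deque = field(default_factory=lambda: deque(maxlen=10))


def check(state, body, loop_threshold=3):
    """Return "allow", or the reason the request is blocked."""
    if state.killed:
        return "manual_kill"
    if state.spent >= state.budget_ceiling:
        return "budget_ceiling"
    fingerprint = hashlib.sha256(body).hexdigest()
    # Assumed heuristic: an identical body repeated too often looks like a loop.
    repeats = sum(1 for seen in state.recent if seen == fingerprint)
    if repeats >= loop_threshold:
        return "loop_detected"
    state.recent.append(fingerprint)
    return "allow"


state = SessionState(budget_ceiling=1.00)
results = [check(state, b'{"prompt": "retry"}') for _ in range(5)]
print(results)

over_budget = SessionState(budget_ceiling=1.00, spent=2.00)
print(check(over_budget, b"{}"))
```

Only the tripped session is refused here; every other `SessionState` is untouched, which is the per-session behaviour the page describes.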

04

Watch it live

Hashed aggregates (never raw prompts or keys) post to the control plane and stream to your dashboard over WebSocket — sessions, spend, dollars prevented.
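A hashed aggregate might look like the sketch below: the session identifier is replaced by a keyed hash, and only numbers travel with it. The field names, the choice of HMAC-SHA256 as the salting scheme, and the salt value are assumptions; the page says only that salted hashes and numeric aggregates leave the data plane.

```python
import hashlib
import hmac
import json


def hash_id(salt, value):
    """Keyed SHA-256 of an identifier (one possible way to salt a hash)."""
    return hmac.new(salt, value.encode("utf-8"), hashlib.sha256).hexdigest()


def build_aggregate(salt, session_id, model, calls, tokens, cost, prevented):
    """Build the JSON payload; no prompt text or API key is included."""
    payload = {
        "session_hash": hash_id(salt, session_id),
        "model": model,
        "calls": calls,
        "tokens": tokens,
        "cost": cost,
        "prevented": prevented,
    }
    return json.dumps(payload, sort_keys=True)


SALT = b"example-salt"  # hypothetical; a real deployment would keep this secret
message = build_aggregate(SALT, "sess:demo", "example-model", 4, 6000, 0.03, 0.01)
print(message)
```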

What you get

Built for the hot path of production AI.

Drop-in

Swap one base URL. Keep your provider API key — it's passed through, never stored.

Session-precise

Kills the one rogue trajectory, not the global API key. Every other session keeps flowing.

Real-time

Pre-execution enforcement — budget ceilings, loop detection, manual kill — not a retrospective dashboard.

Negligible overhead

Microsecond-scale added latency — measured, self-instrumented, with a test that fails if it regresses.

Fail-open

A fault in Turnstile forwards the request rather than breaking your app — configurable to fail-closed.
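The difference between a fault and a deliberate block is easiest to see in code. In this sketch the function name, the `fail_open` flag, and the return strings are hypothetical; it shows the general fail-open pattern rather than Turnstile's implementation.

```python
def guarded_forward(check, forward, request, fail_open=True):
    """Run the enforcement check, then forward unless the session is blocked."""
    try:
        decision = check(request)
    except Exception:
        # The enforcement layer itself failed; this is not a policy decision.
        if fail_open:
            return forward(request)
        return "blocked: enforcement unavailable"
    if decision != "allow":
        return f"blocked: {decision}"
    return forward(request)


def broken_check(request):
    raise RuntimeError("enforcement fault")


def forward(request):
    return f"forwarded: {request}"


print(guarded_forward(broken_check, forward, "req-1"))
print(guarded_forward(broken_check, forward, "req-1", fail_open=False))
print(guarded_forward(lambda r: "budget_ceiling", forward, "req-2"))
```

A tripped breaker still blocks in both modes; the flag only decides what happens when the check cannot run at all.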

Private

Raw prompts and API keys are never persisted — only salted hashes and numeric aggregates leave the data plane.

One line

The only change is the base URL.

Keep your SDK, your model names, your API key. Turnstile speaks the same OpenAI-compatible protocol and forwards everything byte-for-byte — it just meters and enforces on the way through.

No SDK. No code rewrite. Key passed through. Streaming preserved.

import os

from openai import OpenAI

# before: straight to the provider
client = OpenAI(
    base_url="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)

# after: through Turnstile (that's it)
client = OpenAI(
    base_url="https://turnstileguard.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],  # passed through
)

Architecture

Three planes. The hot path stays thin.

Your app points its LLM base URL at the Go data plane, which meters and enforces in real time and forwards to the provider. It posts hashed aggregates — never raw prompts or keys — to the FastAPI control plane, which owns Postgres and pushes live updates over Redis.

Turnstile architecture: your app → Go data plane (meter + enforce + forward) → provider; hashed aggregates → FastAPI control plane → Postgres + Redis → Next.js dashboard.

Go data plane

Hot path

The transparent proxy and circuit breaker. Meters, enforces, and forwards — Go 1.22, stdlib only.

FastAPI control plane

Source of truth

Multi-tenant API and the sole database owner. Upserts hashed aggregates into Postgres, publishes over Redis.
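The upsert step can be sketched with an in-memory SQLite database standing in for Postgres, since both accept the same `ON CONFLICT ... DO UPDATE` form. The table name and columns are hypothetical, and the sketch assumes each post carries cumulative totals that replace the previous row; the page does not say whether aggregates are totals or deltas. The Redis publish is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE session_aggregates (
        org_id TEXT NOT NULL,
        session_hash TEXT NOT NULL,
        calls INTEGER NOT NULL,
        cost REAL NOT NULL,
        prevented REAL NOT NULL,
        PRIMARY KEY (org_id, session_hash)
    )
    """
)

UPSERT = """
    INSERT INTO session_aggregates (org_id, session_hash, calls, cost, prevented)
    VALUES (?, ?, ?, ?, ?)
    ON CONFLICT (org_id, session_hash) DO UPDATE SET
        calls = excluded.calls,
        cost = excluded.cost,
        prevented = excluded.prevented
"""


def upsert(org_id, session_hash, calls, cost, prevented):
    """Insert the session's row, or overwrite it with the newer totals."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (org_id, session_hash, calls, cost, prevented))


upsert("org-1", "abc123", 3, 0.30, 0.0)
upsert("org-1", "abc123", 5, 0.50, 1.25)  # same key, so the row is updated
rows = conn.execute(
    "SELECT org_id, session_hash, calls, cost, prevented FROM session_aggregates"
).fetchall()
print(rows)
```

Keying the table on the organization plus the session hash keeps tenants separate without the control plane ever seeing a raw session identifier.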

Next.js dashboard

Account UI

Live telemetry over REST + WebSocket. Sessions, spend, and dollars prevented, updating in real time.

Point one base URL.
Watch dollars prevented climb.

Open-source, MIT-licensed, and running end-to-end today. Stand it up locally in minutes — your app, through Turnstile, to any provider.