eBPF · enforce-mode validated · v0.3

Cut your LLM bill
without touching a line of agent code.

turbo-flow saves money three ways: caps runaway bursts at the kernel, rewrites opus→sonnet in-flight, and caches identical prompts across your whole fleet. Zero SDK changes.

0
SDK lines
<50ns
per SSL_write
6
runtimes
5%
vs Admin API
live · demo host$247.83saved last hr
cap$8.11
rewrote$188.02
cache$51.70
live · api.anthropic.com
saved this session
$0.00
capped
0
rewrote
0
cached
0
passed
0
burst / runaway rewrite lane cache lane other providers
// 3 WAYS IT SAVES YOU MONEY

Three levers. Stackable.
Real dollars, not just dashboards.

Turn them on individually or all at once. Every dollar surfaces as a Prometheus counter.
CAP01

Stop runaway bursts

before the bill arrives

Agent stuck in a retry loop? CI job that forgot to exit? The kernel drops its TCP packets the moment the budget is blown. You pay $0 for the rest of the burst.

before$1,820 surprise from one stuck agent
after$100 · budget capped at 100k tok/min
saved$1,720
DOWNGRADE02

Rewrite expensive models

opus → sonnet on the fly

Most agent tasks don't need Opus. --downgrade-to sonnet rewrites the model field in-flight (TLS-terminated proxy). Same API, same response shape, 5× cheaper per token.

before1M opus tok · $75.00
after1M sonnet tok · $15.00
saved$60 / M tok
CACHE + COALESCE03

Pay once per unique prompt

SHA-keyed cache + single-flight

100 CI workers fire the same prompt at the same second → 1 upstream call, 99 followers share the response. Identical repeat calls in the next 24h → zero upstream tokens.

before100 workers × $0.18 = $18.00
after1 upstream call · $0.18
saved$17.82 / swarm
All three surface as Prometheus counters:turbo_flow_proxy_rewrote_saved_usd_totalturbo_flow_proxy_cache_saved_usd_totalturbo_flow_proxy_coalesced_saved_usd_total
// WHY

Observability tells you the bill arrived.
We refuse to let it leave.

6 reasons · scroll to see →
01
Hard stop, not soft alert
TC egress drops packets before they leave the host. The request physically can't happen.
02
Zero app changes
eBPF uprobes on SSL_write. No SDK wrapper, no proxy URL, no rebuild.
03
🔒
Prompts stay local
Kernel-side inspection. --no-preview drops plaintext if compliance is watching.
04
$
Real savings
TLS-terminating proxy: model downgrade, SHA-cache, single-flight coalescing.
05
6 runtimes
Python · Node · Bun · Deno · Ruby · Go. One tool, every agent on the host.
06
±5% vs Admin API
Response-token reconciliation swaps estimates for ground truth on every read.
// HOW

Three probes. One bit flip.

eBPF stays tiny on purpose — one memcpy, one map lookup. All logic lives in user space.
01
INTERCEPT

eBPF uprobe on SSL_write

Every agent hitting libssl, BoringSSL, or crypto/tls triggers a uprobe. 384-byte preview + direction into a ring buffer.

~50ns · kernel-local
02
POLICE

User-space policy engine

Rust daemon drains the ring, classifies model tier, debits a shared rolling 60s budget. Response usage reconciles estimates.

SHA-hash retry dedup
03
ENFORCE

TC egress drops packets

Budget flipped? One bit flips in an eBPF map. Matching TCP packets return TC_ACT_SHOT. Well-behaved PIDs keep flowing.

port-scoped · zero userspace hot path
  [agent] → SSL_write → [uprobe] → ring-buf → [daemon] → { JSONL, Prom, ENFORCE_MAP }
                                                                      │
                                                                      ▼
                                                          [tc-egress] budget blown? SHOT.
// HANDS ON

Drop it into a running host.

No agent restart. No container rebuild. Start, it attaches. Stop, it detaches clean.
bash
# shadow — observe & cost-track
sudo turbo-flow start \
  --budget 100000 --iface eth0 \
  --metrics-port 9191

# enforce — hard cap
sudo turbo-flow start \
  --budget 100000 --iface eth0 --enforce \
  --target-port 443 \
  --alert-webhook https://hooks.slack.com/…
Shadow mode by default. Observe for a week, then flip --enforce.
// OBSERVE

Prometheus + Grafana.
Out of the box.

--metrics-port 9191 · import the JSON dashboard · done.
turbo-flow / overviewlast 1h · 10s · prod-eu
Total saved
$247.83
+$18.40/hr
Cache hit rate
38.2%
412 / 1,078
Budget util
62%
62000 / 100k
Active PIDs
14
1 blocked
savings_rate_usd_per_min
sum(rate(…_saved_usd_total[1m])) * 60
rewrotecachecoalesced
saved_usd_by_model
top · desc
modelrewrotecache
claude-opus-4$188.02$12.40
claude-sonnet-4-5$28.10
claude-haiku-4-5$11.20
gpt-4o$7.90
// VS.

Everyone else asks you to change the agent.

DIMENSIONTURBO-FLOW ★SDK WRAPPERAPP PROXYDASHBOARD
IntegrationZeroWrap every callChange every URLAdd middleware
EnforcementKernel drops packetsApp-level (bypassable)Proxy (bypassable)Alert only
Prompts leave hostNeverSometimesYesYes
RuntimesPy · Node · Go · Ruby+One per SDKAny (URL change)Any (SDK change)
Retry dedupSHA-hash at probe timeManualCounts as newPost-hoc
DeployOne binary + sudoRebuild agentsNetwork redesignInstrument all
// PROOF

This isn't a weekend project.

Things we test so you don't page at 3am.
sim

21 scenarios under sim-validate

8 agent personalities × fake Anthropic. Each asserts charges, retry dedup, direction flags.

CI

Lifecycle leak guard

Start/SIGTERM × 3 in Lima. Asserts no clsact dup, no stale links, no bpf pin growth.

ABI

Struct layout tests

#[repr(C)] shared between eBPF & userspace. Size drift fails the build before verifier.

policy

Response reconciliation

Uretprobe stashes buffer on entry, reads on return. usage block swaps estimate for truth.

tc

Port-scoped enforcement

Classifier drops only dport == --target-port. SSH, metrics scrape — untouched.

proxy

Canonical cache keys

JSON key order / whitespace / per-caller metadata normalized. SDK diversity stops fragmenting cache.

Apache-2.0Rust 2024eBPF CO-REPrometheus-nativeLima-testedno_std eBPFSSL_writeSSL_readtc-egressuretprobering-bufCO-RE
Apache-2.0Rust 2024eBPF CO-REPrometheus-nativeLima-testedno_std eBPFSSL_writeSSL_readtc-egressuretprobering-bufCO-RE
// SHIP IT

Install in 30 seconds.
Delete in one.

Apache-2.0 · Rust · self-hosted · any Linux 5.15+

requires: Linux 5.15+ · BTF-enabled · CAP_BPF or root