Budget middleware: 30s tenant-limits cache + Redis pipeline collapse
2026-05-07
What We Built
(1) getTenantBudgetLimits opened a full DB transaction on every budgeted, non-exempt request, even when no limits were configured. It now reads through a 30s in-process Map cache that mirrors the settingsCache pattern in tenant-config.ts. (2) recordSpend issued 6 sequential Upstash round-trips; these are now collapsed into a single pipeline.
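A minimal sketch of the read-through cache idea, not the merged code. The BudgetLimits shape, the loadFromDb parameter, the injectable clock, and the function names with a Cached/invalidate suffix are all assumptions made for illustration; only the 30s TTL and the in-process Map come from this note.

```typescript
// Hypothetical limits shape; the real type is not shown in this note.
// null stands for "no limits configured", which is cached like any other value.
type BudgetLimits = { monthlyUsd: number } | null;

const TTL_MS = 30_000; // 30s TTL, as described above

type CacheEntry = { value: BudgetLimits; expiresAt: number };

// In-process cache keyed by tenant id (one Map per Node process).
const limitsCache = new Map<string, CacheEntry>();

// loadFromDb is a stand-in for the transaction-backed lookup (assumed signature).
// now is injectable so tests can control time without sleeping.
async function getTenantBudgetLimitsCached(
  tenantId: string,
  loadFromDb: (tenantId: string) => Promise<BudgetLimits>,
  now: () => number = Date.now,
): Promise<BudgetLimits> {
  const hit = limitsCache.get(tenantId);
  if (hit !== undefined && hit.expiresAt > now()) {
    return hit.value; // fresh entry: no DB transaction opened
  }
  const value = await loadFromDb(tenantId);
  limitsCache.set(tenantId, { value, expiresAt: now() + TTL_MS });
  return value;
}

// Called from write paths so an update is not hidden until the TTL expires.
function invalidateTenantBudgetLimits(tenantId: string): void {
  limitsCache.delete(tenantId);
}
```

Caching the null result is the point of the sketch: tenants with no limits configured are exactly the ones that previously paid for a transaction on every request.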
Why It Matters
Together, these were the auth-side bottleneck behind PR #198's 99.45% load-test failure rate at 50 RPS. With pool=50 and 2 ECS tasks, sustainable throughput was capped at ~25 RPS before the pool emptied. Pipelining saves ~60-90ms of serial latency on every completion's accounting path.
How It Works
Cache invalidation is propagated to routes/budget.ts so that dashboard writes aren't masked by the 30s TTL. Regression tests assert that the DB store hit count drops to 1 across N requests within the TTL, and that the pipeline is invoked with the full op set.
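The pipeline collapse and its regression assertion can be sketched as follows. This is an illustration under stated assumptions: the SpendPipeline and SpendRedis interfaces are local stand-ins shaped like a chainable pipeline()/exec() client, the key layout and the six ops (three counters, each with an expiry) are hypothetical, and FakeRedis is a test double invented here. The actual op set in recordSpend is not listed in this note.

```typescript
// Local structural types so the sketch does not depend on a real Redis client.
interface SpendPipeline {
  incrbyfloat(key: string, amount: number): SpendPipeline;
  expire(key: string, seconds: number): SpendPipeline;
  exec(): Promise<unknown[]>;
}
interface SpendRedis {
  pipeline(): SpendPipeline;
}

// Hypothetical op set: three spend counters, each with a TTL (6 ops total).
// All six are queued locally and sent in one round trip by exec().
async function recordSpendPipelined(
  redis: SpendRedis,
  tenantId: string,
  usd: number,
): Promise<unknown[]> {
  const hour = `spend:${tenantId}:hour`;
  const day = `spend:${tenantId}:day`;
  const month = `spend:${tenantId}:month`;
  return redis
    .pipeline()
    .incrbyfloat(hour, usd)
    .expire(hour, 3_600)
    .incrbyfloat(day, usd)
    .expire(day, 86_400)
    .incrbyfloat(month, usd)
    .expire(month, 31 * 86_400)
    .exec();
}

// Test double: records every queued op and counts exec() calls as round trips.
class FakeRedis implements SpendRedis {
  roundTrips = 0;
  ops: string[] = [];

  pipeline(): SpendPipeline {
    const queued: string[] = [];
    const p: SpendPipeline = {
      incrbyfloat: (key: string, amount: number) => {
        queued.push(`incrbyfloat ${key} ${amount}`);
        return p;
      },
      expire: (key: string, seconds: number) => {
        queued.push(`expire ${key} ${seconds}`);
        return p;
      },
      exec: async () => {
        this.roundTrips += 1; // one network round trip per exec()
        this.ops.push(...queued);
        return queued.map(() => null);
      },
    };
    return p;
  }
}
```

A regression test in this style would call recordSpendPipelined against the fake and check that roundTrips is 1 while ops contains every expected command, which guards against a later change reintroducing sequential awaits.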
Lockstep Checklist
- [x] No API route changes (middleware/internal — lockstep N/A)
- [x] No SDK changes
- [x] No MCP tool changes
- [x] Regression test included (test-first invariant per /quality-fleet protocol)
- [x] Linked to /quality-fleet R1 dashboard at .quality/dashboard.md
Provenance
Auto-found by /quality-fleet R1 (2026-05-07) scanner round, fixed in fix-agent batch under "go for all of it" autonomy grant. PR #203 merged to main as commit c95d10b94. Finding(s) tracked at .quality/findings.jsonl (entries: "e8b3d4c9a627", "b7d4e1c83a92"). Production-deployed via ECS task-def revision 732 series.