Budget middleware: 30s tenant-limits cache + Redis pipeline collapse

2026-05-07

api, budget, redis

What We Built

(1) getTenantBudgetLimits opened a full DB transaction on every budgeted, non-exempt request, even when no limits were configured. We added a 30s in-process Map cache that mirrors the settingsCache pattern in tenant-config.ts. (2) recordSpend issued 6 sequential Upstash round-trips. We collapsed them into a single pipeline.
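A minimal sketch of the kind of 30s in-process cache described in (1), not the actual middleware code. The BudgetLimits shape, the TtlCache class, the loadFromDb parameter, and the getTenantBudgetLimitsCached name are all hypothetical; the real settingsCache pattern in tenant-config.ts may differ. The point it illustrates is that a "no limits configured" result (null here) is cached too, so tenants without limits stop paying for a transaction on every request.

```typescript
// Hypothetical limits shape; the real type is not shown in this note.
// null stands for "no limits configured" and is cached like any other value.
type BudgetLimits = { monthlyUsd: number } | null;

// A lookup result that distinguishes "cached null" from "not cached".
type Lookup<V> = { hit: true; value: V } | { hit: false };

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control time (assumption, for testability).
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): Lookup<V> {
    const entry = this.entries.get(key);
    if (entry === undefined) return { hit: false };
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it so stale entries do not pile up
      return { hit: false };
    }
    return { hit: true, value: entry.value };
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

const limitsCache = new TtlCache<BudgetLimits>(30_000);

// `loadFromDb` is a stand-in for the transactional lookup (hypothetical signature).
async function getTenantBudgetLimitsCached(
  tenantId: string,
  loadFromDb: (tenantId: string) => Promise<BudgetLimits>,
  cache: TtlCache<BudgetLimits> = limitsCache,
): Promise<BudgetLimits> {
  const cached = cache.get(tenantId);
  if (cached.hit) return cached.value; // includes cached null
  const limits = await loadFromDb(tenantId);
  cache.set(tenantId, limits);
  return limits;
}
```

Note that this sketch does not deduplicate concurrent misses: several requests arriving before the first load completes would each hit the DB once.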

Why It Matters

Together, these were the auth-side bottleneck behind PR #198's 99.45% load-test failure rate at 50 RPS. With pool=50 and 2 ECS tasks, sustainable throughput was capped at ~25 RPS before the DB connection pool was exhausted. Pipelining saves ~60-90ms of serial latency on the accounting path of every completion.
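To make the round-trip saving concrete, here is a hedged sketch of collapsing six commands into one pipelined request. It uses small structural interfaces (PipelineLike, RedisLike) and an in-memory RecordingRedis fake instead of the real client, so it runs without network access; the method names pipeline(), incrbyfloat(), expire(), and exec() follow the shape of the Upstash Redis TypeScript client. The key layout, the TTL constant, and the choice of which six commands recordSpend issues are assumptions for illustration only, since the note does not list the actual op set.

```typescript
// Minimal structural types so the sketch runs without the real client.
interface PipelineLike {
  incrbyfloat(key: string, amount: number): PipelineLike;
  expire(key: string, seconds: number): PipelineLike;
  exec(): Promise<unknown[]>;
}

interface RedisLike {
  pipeline(): PipelineLike;
}

type Command = (string | number)[];

// In-memory fake: every exec() counts as one network round-trip.
class RecordingRedis implements RedisLike {
  roundTrips = 0;
  batches: Command[][] = [];

  pipeline(): PipelineLike {
    const queued: Command[] = [];
    const onExec = (): void => {
      this.roundTrips += 1;
      this.batches.push(queued);
    };
    const p: PipelineLike = {
      incrbyfloat(key, amount) {
        queued.push(["incrbyfloat", key, amount]);
        return p;
      },
      expire(key, seconds) {
        queued.push(["expire", key, seconds]);
        return p;
      },
      async exec() {
        onExec();
        return queued.map(() => null); // the fake returns no real results
      },
    };
    return p;
  }
}

// Hypothetical retention for spend counters.
const WINDOW_TTL_SECONDS = 35 * 24 * 60 * 60;

async function recordSpend(redis: RedisLike, tenantId: string, usd: number): Promise<void> {
  // Hypothetical key layout: three counters, each incremented and given a TTL,
  // which yields six commands in total.
  const keys = [
    `budget:${tenantId}:day`,
    `budget:${tenantId}:month`,
    `budget:${tenantId}:total`,
  ];
  const p = redis.pipeline();
  for (const key of keys) {
    p.incrbyfloat(key, usd);
    p.expire(key, WINDOW_TTL_SECONDS);
  }
  await p.exec(); // all queued commands travel in a single request
}
```

Issued one by one, the same six commands would each wait for the previous response; queued on a pipeline they cost one round-trip, which is where the serial latency saving comes from.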

How It Works

Cache invalidation is propagated to routes/budget.ts so that dashboard writes aren't masked by the 30s TTL. Regression tests assert that the DB store hit count drops to 1 across N requests within the TTL, and that the pipeline is invoked with the full op set.
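The hit-count invariant and the invalidation path can be sketched as a small scenario. This is an illustration under assumptions, not the actual regression test: makeCachedReader, runScenario, and the Limits shape are hypothetical, the clock is injected, and the loader is synchronous only to keep the sketch short (the real store call is a DB transaction and would be async).

```typescript
// Hypothetical limits shape; null means "no limits configured".
type Limits = { monthlyUsd: number } | null;

const TTL_MS = 30_000;

// Wraps a loader with a TTL cache and exposes an explicit invalidate hook,
// which is what a write route would call after updating limits.
function makeCachedReader(load: (tenantId: string) => Limits, now: () => number) {
  const cache = new Map<string, { value: Limits; expiresAt: number }>();
  return {
    get(tenantId: string): Limits {
      const entry = cache.get(tenantId);
      if (entry !== undefined && entry.expiresAt > now()) return entry.value;
      const value = load(tenantId);
      cache.set(tenantId, { value, expiresAt: now() + TTL_MS });
      return value;
    },
    invalidate(tenantId: string): void {
      cache.delete(tenantId);
    },
  };
}

function runScenario(): { withinTtl: number; afterInvalidate: number; afterExpiry: number } {
  let storeHits = 0; // counts calls that reach the (fake) DB store
  let clock = 0; // fake time in milliseconds
  const reader = makeCachedReader(
    () => {
      storeHits += 1;
      return null;
    },
    () => clock,
  );

  // 100 requests spread over 10 seconds, all inside the 30s TTL.
  for (let i = 0; i < 100; i++) {
    reader.get("tenant-a");
    clock += 100;
  }
  const withinTtl = storeHits;

  // A dashboard write invalidates the entry, so the next read reloads.
  reader.invalidate("tenant-a");
  reader.get("tenant-a");
  const afterInvalidate = storeHits;

  // Once the TTL has elapsed, the entry expires on its own.
  clock += TTL_MS;
  reader.get("tenant-a");
  const afterExpiry = storeHits;

  return { withinTtl, afterInvalidate, afterExpiry };
}
```

Without the invalidate call, a limit changed from the dashboard could stay invisible to the middleware for up to 30 seconds.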

Lockstep Checklist

  • [x] No API route changes (middleware/internal — lockstep N/A)
  • [x] No SDK changes
  • [x] No MCP tool changes
  • [x] Regression test included (test-first invariant per /quality-fleet protocol)
  • [x] Linked to /quality-fleet R1 dashboard at .quality/dashboard.md

Provenance

Auto-found by the /quality-fleet R1 scanner round (2026-05-07) and fixed in a fix-agent batch under the "go for all of it" autonomy grant. PR #203 was merged to main as commit c95d10b94. Findings are tracked in .quality/findings.jsonl (entries: "e8b3d4c9a627", "b7d4e1c83a92"). Deployed to production via the ECS task-def revision 732 series.