2026-03-26-budget-warnings-inband

Budget Warnings In-Band

Date: 2026-03-26 Phase: 076-budget-warnings-inband

What shipped

Proactive budget warnings injected directly into completion responses. Agents learn they're running low without polling GET /v1/budget/status.

Warning Thresholds

UtilizationLevelBehavior
80-89%warningHeader + SSE comment
90-94%criticalHeader + SSE comment
95-99%urgentHeader + SSE comment
100%exceededRequest blocked (existing)

Non-streaming

Response header: X-BR-Budget-Warning: {"level":"warning","utilization_pct":82,...}

Streaming

SSE comment (before [DONE]): : budget-warning {"level":"critical","utilization_pct":91,...}

Follows the same pattern as guardian metadata — SSE comments are silently discarded by standard EventSource parsers, only visible to clients that parse comment lines.

Lockstep checklist

  • [x] API: src/api/middleware/budget.ts, src/api/routes/completions.ts
  • [x] Tests: src/api/middleware/budget.test.ts (5 new tests)
  • [x] Docs: this ship log
  • [x] SDK-TS: N/A (no new endpoints)
  • [x] SDK-PY: N/A (no new endpoints)
  • [x] MCP: N/A (no new agent-facing endpoints)

Technical notes

  • Zero latency impact — utilization calculated from existing budget reserve response
  • Warning data attached to Hono context as _budgetWarning for downstream use
  • BudgetWarning type exported for type-safe consumption