Cost Prediction — Estimate Request Cost Before Execution

2026-03-26

routerapimcpsdk

What We Built

POST /v1/estimate accepts a standard OpenAI-compatible chat completion payload and returns predicted cost per model before execution. No provider API calls are made — estimation uses the model registry pricing data and the same token heuristics as Guardian middleware.

If a model is specified, returns the estimate for that model plus the top 3 cheapest alternatives. If model is omitted or "auto", returns the top 5 cheapest models. Each estimate includes input/output token counts, cost in USD, context window, utilization percentage, and pricing breakdown.

Why It Matters

Before sending a 50K context request with 15 tools to an expensive model, operators and agents can now ask "what will this cost?" and compare alternatives. This enables budget-aware model selection at the application layer.

Lockstep Checklist

[x] API Routes: src/api/routes/estimate.ts — new POST endpoint
[x] TS SDK: packages/sdk-ts/src/resources/estimate.ts — client.estimate.cost()
[x] Python SDK: packages/sdk-py/.../estimate.py — sync + async
[x] MCP Schemas: br_estimate_cost tool in manifest, server, and agents.json
[ ] Master Record: Update when record exists