2026-03-26-tool-call-success-tracking
Tool Call Success Tracking
Date: 2026-03-26 Phase: 077-tool-call-success-tracking Author: Claude Code (phase-build)
What Shipped
First-class tool-call success/failure tracking per model. When a request includes tools, BrainstormRouter now detects whether the model produced valid tool calls (success), ignored them (failure), or hallucinated tool names (error). Metrics are stored in usage event metadata and surfaced in /v1/intelligence/rankings.
Why It Matters
This is a signal no other gateway tracks: "does this model actually follow tool-calling instructions?" Models with low tool-call compliance get deprioritized for tool-heavy tasks, and agents get visibility into which models are reliable tool callers.
Changes
| Surface | Files | Status |
|---|---|---|
| API | src/api/routes/completions.ts, src/api/routes/intelligence.ts, src/api/server.ts | Implemented |
| Intelligence | src/router/intelligence/tool-call-tracker.ts | New module |
| Types | src/router/intelligence/model-intelligence-types.ts | Extended |
| Tests | src/router/intelligence/tool-call-tracker.test.ts | 16 tests |
| Docs | This file | Implemented |
| SDK-TS | N/A | No new endpoints |
| SDK-PY | N/A | No new endpoints |
| MCP | N/A | No new agent-facing endpoints |
| CLI | N/A | No user-facing commands |
| Dashboard | N/A | Future phase |
Detection Logic
- Success: Request includes
tools, response hastool_callswith valid names - Failure: Request includes
tools, response has notool_calls(model ignored tools) - Hallucinated: Request includes
tools, response hastool_callswith names not in the provided tools array - None: No
toolsin request (not tracked)
Rankings Enhancement
GET /v1/intelligence/rankings now includes per-model toolCallMetrics:
{
"toolCallMetrics": {
"success_rate": 0.97,
"total_tool_requests": 4521,
"hallucinated_tools": 12,
"ignored_tools": 134
}
}
Lockstep Checklist
- [x] API routes modified (completions + intelligence)
- [x] No new endpoints (SDK lockstep N/A)
- [x] Tests written and passing
- [x] Ship log written
- [x]
pnpm buildpasses - [x]
pnpm checkpasses