Agent Session Loop — the 'run' verb ships with multi-provider per-turn routing

2026-04-12

model-routeragent-profilesmcp-gatewayguardrailsbudget-enforcementsession-store

What We Built

BrainstormRouter now has an agent loop — POST /v1/agents/:agentId/run. Send a message and agent ID; the system creates a session, runs a multi-turn execution loop (LLM call → tool execution → LLM call → ... → done), and streams events back as SSE.

Each turn composes existing BR subsystems without rebuilding any of them:

Thompson sampling picks the best model per turn (can choose Claude for turn 1, Gemini for turn 2)
Context injection refreshes every turn — SOUL, MEMORY, SKILL, HEARTBEAT, WORKSPACE (the Kairos pattern: memory changes mid-run are visible on the next turn)
Guardrails scan every response before it's returned
MCP tool execution goes through the full governance pipeline (RBAC, tool firewall, approval queue)
Budget enforcement checks per-agent limits after each turn, hard-stops on exhaustion

The loop is an AsyncGenerator that yields typed events: agent.message, agent.tool_use, agent.tool_result, session.status_*, plus BR-specific governance events (br.model_selected, br.guardrail_intervention, br.budget_warning, br.routing_savings).

Why It Matters

Before this, BrainstormRouter handled individual API completions. Operators who wanted an agent to do multi-step work had to build their own loop on top of the /v1/chat/completions endpoint — calling BR repeatedly, parsing tool calls, executing them, and feeding results back. That's the same work every agent framework does.

Now, POST /v1/agents/:agentId/run does it all in one call. The operator defines the agent (SOUL, budget, tools, guardrails), sends a task, and watches the agent work. BrainstormRouter handles model selection, tool governance, budget enforcement, and context refresh automatically.

This is the "nervous system" layer: any product built on top of BR can define agents and run them without building a loop, a sandbox, or a governance stack.

How It Works

curl -X POST https://api.brainstormrouter.com/v1/agents/research-bot/run \
  -H "Authorization: Bearer brk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Research our top 3 competitors and write a summary",
    "max_turns": 15,
    "budget_cap_usd": 2.00
  }' \
  --no-buffer

SSE stream:

data: {}

event: br.model_selected
data: {"model":"gemini-2.5-flash","provider":"google","strategy":"thompson","reason":"highest reward","turnIndex":0}

event: agent.tool_use
data: {"toolName":"web_search","args":{"query":"Portkey AI gateway"},"turnIndex":0}

event: agent.tool_result
data: {"toolName":"web_search","result":"...","isError":false,"latencyMs":312,"turnIndex":0}

event: br.model_selected
data: {"model":"claude-sonnet-4-6","provider":"anthropic","strategy":"thompson","reason":"quality tier for synthesis","turnIndex":1}

event: agent.message
data: {"text":"Here's the competitive analysis...","model":"claude-sonnet-4-6","provider":"anthropic","turnIndex":1}

event: session.status_idle
data: {}

The Numbers

7 commits across 4 phases in one session
5 new API endpoints + 3 new MCP tools (107 total)
7 unit tests covering turn counting, budget enforcement, tool execution, error handling
0 new database tables (sessions + session_messages + usage_events, metadata in JSONB)
Per-turn overhead: context injection + guardrail scan ≈ 5-15ms on top of LLM latency

Competitive Edge

No other multi-provider gateway has a built-in agent loop. Anthropic's Managed Agents is Claude-only with no per-agent budgets. Portkey, OpenRouter, and Vercel AI Gateway are pass-through — they don't own the loop. LiteLLM doesn't have agent identity, memory, or tool governance. BrainstormRouter is the only platform where Thompson sampling can pick a different model every turn, per-agent budgets enforce in real-time, and every tool call goes through a 7-stage inspection pipeline.

Lockstep Checklist

[x] API Routes: src/api/routes/agent-runs.ts — 5 endpoints
[x] TS SDK: packages/sdk-ts/src/resources/agent-runs.ts — AgentRuns resource + domain registration
[x] Python SDK: packages/sdk-py/src/brainstormrouter/resources/agent_runs.py — sync + async
[x] MCP Schemas: 3 tools in src/mcp/tool-manifest.ts + handlers in src/mcp/handlers/agents.ts
[x] Ship Log: This entry