Agent Session Loop — the 'run' verb ships with multi-provider per-turn routing

2026-04-12

model-routeragent-profilesmcp-gatewayguardrailsbudget-enforcementsession-store

What We Built

BrainstormRouter now has an agent loopPOST /v1/agents/:agentId/run. Send a message and agent ID; the system creates a session, runs a multi-turn execution loop (LLM call → tool execution → LLM call → ... → done), and streams events back as SSE.

Each turn composes existing BR subsystems without rebuilding any of them:

  • Thompson sampling picks the best model per turn (can choose Claude for turn 1, Gemini for turn 2)
  • Context injection refreshes every turn — SOUL, MEMORY, SKILL, HEARTBEAT, WORKSPACE (the Kairos pattern: memory changes mid-run are visible on the next turn)
  • Guardrails scan every response before it's returned
  • MCP tool execution goes through the full governance pipeline (RBAC, tool firewall, approval queue)
  • Budget enforcement checks per-agent limits after each turn, hard-stops on exhaustion

The loop is an AsyncGenerator that yields typed events: agent.message, agent.tool_use, agent.tool_result, session.status_*, plus BR-specific governance events (br.model_selected, br.guardrail_intervention, br.budget_warning, br.routing_savings).

Why It Matters

Before this, BrainstormRouter handled individual API completions. Operators who wanted an agent to do multi-step work had to build their own loop on top of the /v1/chat/completions endpoint — calling BR repeatedly, parsing tool calls, executing them, and feeding results back. That's the same work every agent framework does.

Now, POST /v1/agents/:agentId/run does it all in one call. The operator defines the agent (SOUL, budget, tools, guardrails), sends a task, and watches the agent work. BrainstormRouter handles model selection, tool governance, budget enforcement, and context refresh automatically.

This is the "nervous system" layer: any product built on top of BR can define agents and run them without building a loop, a sandbox, or a governance stack.

How It Works

curl -X POST https://api.brainstormrouter.com/v1/agents/research-bot/run \
  -H "Authorization: Bearer brk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Research our top 3 competitors and write a summary",
    "max_turns": 15,
    "budget_cap_usd": 2.00
  }' \
  --no-buffer

SSE stream:

data: {}

event: br.model_selected
data: {"model":"gemini-2.5-flash","provider":"google","strategy":"thompson","reason":"highest reward","turnIndex":0}

event: agent.tool_use
data: {"toolName":"web_search","args":{"query":"Portkey AI gateway"},"turnIndex":0}

event: agent.tool_result
data: {"toolName":"web_search","result":"...","isError":false,"latencyMs":312,"turnIndex":0}

event: br.model_selected
data: {"model":"claude-sonnet-4-6","provider":"anthropic","strategy":"thompson","reason":"quality tier for synthesis","turnIndex":1}

event: agent.message
data: {"text":"Here's the competitive analysis...","model":"claude-sonnet-4-6","provider":"anthropic","turnIndex":1}

event: session.status_idle
data: {}

The Numbers

  • 7 commits across 4 phases in one session
  • 5 new API endpoints + 3 new MCP tools (107 total)
  • 7 unit tests covering turn counting, budget enforcement, tool execution, error handling
  • 0 new database tables (sessions + session_messages + usage_events, metadata in JSONB)
  • Per-turn overhead: context injection + guardrail scan ≈ 5-15ms on top of LLM latency

Competitive Edge

No other multi-provider gateway has a built-in agent loop. Anthropic's Managed Agents is Claude-only with no per-agent budgets. Portkey, OpenRouter, and Vercel AI Gateway are pass-through — they don't own the loop. LiteLLM doesn't have agent identity, memory, or tool governance. BrainstormRouter is the only platform where Thompson sampling can pick a different model every turn, per-agent budgets enforce in real-time, and every tool call goes through a 7-stage inspection pipeline.

Lockstep Checklist

  • [x] API Routes: src/api/routes/agent-runs.ts — 5 endpoints
  • [x] TS SDK: packages/sdk-ts/src/resources/agent-runs.ts — AgentRuns resource + domain registration
  • [x] Python SDK: packages/sdk-py/src/brainstormrouter/resources/agent_runs.py — sync + async
  • [x] MCP Schemas: 3 tools in src/mcp/tool-manifest.ts + handlers in src/mcp/handlers/agents.ts
  • [x] Ship Log: This entry