Auto Mode

How BrainstormRouter intelligently selects models for each request.

Overview

Set model: "auto" and BrainstormRouter automatically selects the best model for each request based on complexity, cost, and learned quality scores.

curl https://api.brainstormrouter.com/v1/chat/completions \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [{"role": "user", "content": "What is 2+2?"}]}'

How It Works

Auto mode combines several intelligence systems:

1. Complexity Assessment

The auto-selector analyzes each request to estimate complexity:

  • Token count — longer prompts suggest more complex tasks
  • Tool presence — requests with tools need models that support function calling
  • System prompt complexity — detailed instructions suggest nuanced tasks
  • Message history depth — multi-turn conversations benefit from stronger models
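The signals above can be combined into a single rough score. The weights, caps, and function name below are illustrative assumptions, not BrainstormRouter's actual scoring logic:

```python
# Illustrative complexity heuristic. Weights and caps are assumptions
# chosen for demonstration, not the service's real values.

def complexity_score(messages, tools=None, system_prompt=""):
    """Return a rough 0..1 complexity estimate for a chat request."""
    # Token-count proxy: whitespace-split words across all messages.
    word_count = sum(len(m["content"].split()) for m in messages)
    score = min(word_count / 2000, 1.0) * 0.4

    # Tool presence: tool requests need function-calling-capable models.
    if tools:
        score += 0.2

    # System prompt complexity: long instructions suggest nuanced tasks.
    score += min(len(system_prompt.split()) / 500, 1.0) * 0.2

    # Message history depth: multi-turn favors stronger models.
    score += min(len(messages) / 20, 1.0) * 0.2

    return min(score, 1.0)
```

A one-line question with no tools scores near zero, while a long multi-turn request with tools attached scores much higher.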

2. Thompson Sampling (Bandit)

BrainstormRouter learns from every request. The Thompson sampling bandit tracks per-model success rates and explores new models intelligently:

  • Models that produce high-quality responses get selected more often
  • Models that fail or produce low-quality outputs are deprioritized
  • New models get exploration budget to gather initial data
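A minimal sketch of this bandit, assuming each model's quality is tracked as success/failure counts feeding a Beta(successes + 1, failures + 1) posterior (the standard Thompson-sampling setup); the data structures and names are hypothetical:

```python
# Thompson sampling over models: sample each model's quality posterior
# and route to the highest draw. A model with few observations has a
# wide posterior, so it still wins occasionally (exploration).
import random

def pick_model(stats):
    """stats: {model_id: (successes, failures)}. Pick the model with
    the highest sampled quality."""
    draws = {
        model: random.betavariate(s + 1, f + 1)
        for model, (s, f) in stats.items()
    }
    return max(draws, key=draws.get)

def record(stats, model, success):
    """Update a model's counts after judging the response quality."""
    s, f = stats[model]
    stats[model] = (s + 1, f) if success else (s, f + 1)
```

Over many requests, a model with a strong track record wins most draws, while a newly added model with no history is sampled from a flat prior and gets its exploration budget automatically.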

3. Cost-Quality Frontier

The cost optimizer finds the Pareto-optimal tradeoff between price and quality. For simple prompts, it picks cheap models. For complex prompts, it picks capable ones.
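The frontier computation can be sketched in a few lines: a model is Pareto-optimal if no other model is at least as cheap and at least as good (and strictly better on one axis). The prices and quality scores used here are made up:

```python
# Pareto frontier over (cost, quality). A model is dominated if some
# other model is no more expensive, no worse in quality, and strictly
# better on at least one of the two.

def pareto_frontier(models):
    """models: {name: (cost_per_mtok, quality)}. Return frontier names."""
    frontier = []
    for name, (cost, quality) in models.items():
        dominated = any(
            c <= cost and q >= quality and (c < cost or q > quality)
            for other, (c, q) in models.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier
```

The selector then picks the frontier point matching the request's complexity estimate: low complexity maps to the cheap end, high complexity to the capable end.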

4. Circuit Breaker Awareness

Models with open circuit breakers (recently failing) are excluded from auto selection. This prevents routing to unhealthy providers.
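In sketch form, this is a filter applied to the candidate list before the bandit ever sees it; the state names follow the usual circuit-breaker pattern, and the helper is an illustration:

```python
# Exclude models whose circuit breaker is open before selection.
from enum import Enum

class Breaker(Enum):
    CLOSED = "closed"        # healthy, eligible
    OPEN = "open"            # recently failing, excluded
    HALF_OPEN = "half_open"  # probing recovery, eligible

def healthy(candidates, breakers):
    """Keep candidates whose breaker is not open; unknown models are
    assumed healthy."""
    return [
        m for m in candidates
        if breakers.get(m, Breaker.CLOSED) != Breaker.OPEN
    ]
```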

What Gets Selected

Request Type               | Typical Selection
Simple factual question    | Fast, cheap model (Haiku, GPT-4o-mini, Flash)
Code generation with tools | Capable model (Sonnet, GPT-4o)
Complex reasoning          | Top-tier model (Opus, o1)
Research with citations    | Perplexity Sonar Pro

Conversation Consistency

When using conversation_id, auto mode keeps the same model for the entire conversation. This prevents jarring style changes mid-conversation.

The model is locked on the first request in a conversation and reused for subsequent requests with the same conversation_id.
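The locking behavior can be sketched as a first-write-wins cache keyed by conversation_id; the in-memory dict here stands in for whatever persistence the real service uses:

```python
# First request under a conversation_id locks the model; later requests
# with the same id reuse it, so style stays consistent mid-conversation.

_conversation_models = {}

def model_for(conversation_id, select_fn):
    """Return the locked model for a conversation, running auto
    selection (select_fn) only on first use."""
    if conversation_id not in _conversation_models:
        _conversation_models[conversation_id] = select_fn()
    return _conversation_models[conversation_id]
```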

Tenant Aliases

You can define aliases to control what "auto" and other shorthand names resolve to for your tenant:

curl -X PUT https://api.brainstormrouter.com/v1/aliases \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{"fast": "anthropic/claude-haiku-4-5", "smart": "anthropic/claude-sonnet-4-5"}'

Then use model: "fast" or model: "smart" in requests.
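The resolution order implied above can be sketched as: tenant alias first, then "auto", otherwise treat the value as a full model ID. The function name and ordering are assumptions for illustration:

```python
# Hypothetical resolution order for the request's model field.

def resolve_model(requested, aliases, auto_select):
    """aliases: tenant alias map; auto_select: callable running auto
    mode. Full model IDs pass through unchanged."""
    if requested in aliases:
        return aliases[requested]
    if requested == "auto":
        return auto_select()
    return requested
```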

Opting Out

To bypass auto mode, specify a full model ID:

{ "model": "anthropic/claude-sonnet-4-5-20250514" }

Or send the X-BR-Skip-Memory: true header to prevent auto mode from using memory context during model selection.

Response Headers

Auto mode adds an X-BR-Route-Reason header explaining the selection:

X-BR-Route-Reason: auto:complexity-low
X-BR-Model: claude-haiku-4-5-20251001
X-BR-Provider: anthropic
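A small helper for pulling these headers out of a response, assuming your HTTP client exposes response headers as a mapping (as requests, httpx, and most clients do); the helper name is an illustration:

```python
# Extract BrainstormRouter's routing headers from a response-headers
# mapping. Header names come from the documentation above.

def route_info(headers):
    return {
        "reason": headers.get("X-BR-Route-Reason"),
        "model": headers.get("X-BR-Model"),
        "provider": headers.get("X-BR-Provider"),
    }
```

Logging this alongside request latency makes it easy to audit which models auto mode is actually choosing and why.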