# Auto Mode

How BrainstormRouter intelligently selects models for each request.

## Overview

Set `"model": "auto"` and BrainstormRouter automatically selects the best model for each request based on complexity, cost, and learned quality scores.
```bash
curl https://api.brainstormrouter.com/v1/chat/completions \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [{"role": "user", "content": "What is 2+2?"}]}'
```
## How It Works

Auto mode combines several intelligence systems:

### 1. Complexity Assessment

The auto-selector analyzes each request to estimate complexity:
- Token count — longer prompts suggest more complex tasks
- Tool presence — requests with tools need models that support function calling
- System prompt complexity — detailed instructions suggest nuanced tasks
- Message history depth — multi-turn conversations benefit from stronger models
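BrainstormRouter does not publish its scoring function, but the signals above can be combined into a single score. The following is a minimal illustrative sketch — the weights, the characters-per-token ratio, and the function name are all assumptions, not the service's actual values:

```python
def estimate_complexity(messages, tools=None, system=None):
    """Hypothetical complexity score in [0, 1] combining the four signals.
    All weights and thresholds here are illustrative assumptions."""
    # Token count: rough estimate at ~4 characters per token
    tokens = sum(len(m["content"]) for m in messages) / 4
    score = min(tokens / 2000, 1.0) * 0.4                # longer prompts -> harder
    score += 0.2 if tools else 0.0                       # tools need function calling
    if system:
        score += min(len(system) / 4000, 1.0) * 0.2      # detailed system prompts
    score += min(len(messages) / 10, 1.0) * 0.2          # deep multi-turn history
    return round(score, 3)

simple = estimate_complexity([{"role": "user", "content": "What is 2+2?"}])
hard = estimate_complexity(
    [{"role": "user", "content": "x" * 8000}] * 6,
    tools=[{"name": "search"}],
    system="Detailed instructions. " * 100,
)
```

A short factual question scores near zero, while a long multi-turn request with tools and a detailed system prompt scores near the top of the range — which is what pushes the router toward cheaper or stronger models, respectively.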
### 2. Thompson Sampling (Bandit)

BrainstormRouter learns from every request. The Thompson sampling bandit tracks per-model success rates and explores new models intelligently:
- Models that produce high-quality responses get selected more often
- Models that fail or produce low-quality outputs are deprioritized
- New models get exploration budget to gather initial data
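The internal bandit is not public, but the behavior described above matches classic Beta-Bernoulli Thompson sampling. A self-contained sketch (class name and prior are assumptions):

```python
import random

class ThompsonSelector:
    """Beta-Bernoulli Thompson sampling over candidate models (illustrative
    sketch, not BrainstormRouter's actual implementation)."""

    def __init__(self, models):
        # Beta(1, 1) prior = uniform: a new model's wide posterior gives it
        # a natural exploration budget until real data accumulates
        self.stats = {m: {"successes": 1, "failures": 1} for m in models}

    def select(self):
        # Draw a plausible success rate from each model's posterior,
        # then route to the model with the highest draw
        draws = {m: random.betavariate(s["successes"], s["failures"])
                 for m, s in self.stats.items()}
        return max(draws, key=draws.get)

    def record(self, model, ok):
        key = "successes" if ok else "failures"
        self.stats[model][key] += 1
```

Over many requests, models with high observed quality win most draws (selected more often), models that fail are deprioritized, and rarely-tried models still get occasional traffic because their posteriors remain wide.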
### 3. Cost-Quality Frontier

The cost optimizer finds the Pareto-optimal tradeoff between price and quality. For simple prompts, it picks cheap models. For complex prompts, it picks capable ones.
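A Pareto frontier keeps only the models no competitor beats on both axes at once. This sketch shows the idea with hypothetical prices and quality scores (the model data and function names are illustrative, not real pricing):

```python
def pareto_frontier(models):
    """Keep models that are not dominated on (cost, quality): a model is
    dominated if some other model is no more expensive and at least as good."""
    frontier = []
    for m in models:
        dominated = any(
            o is not m and o["cost"] <= m["cost"] and o["quality"] >= m["quality"]
            for o in models
        )
        if not dominated:
            frontier.append(m["name"])
    return frontier

def pick_for_complexity(models, min_quality):
    """Cheapest frontier model meeting a quality bar derived from complexity."""
    names = pareto_frontier(models)
    eligible = [m for m in models if m["name"] in names and m["quality"] >= min_quality]
    return min(eligible, key=lambda m: m["cost"])["name"]

candidates = [  # hypothetical cost/quality figures
    {"name": "haiku",  "cost": 1.0,  "quality": 0.70},
    {"name": "sonnet", "cost": 3.0,  "quality": 0.85},
    {"name": "opus",   "cost": 15.0, "quality": 0.95},
    {"name": "legacy", "cost": 5.0,  "quality": 0.80},  # dominated by sonnet
]
```

A low quality bar (simple prompt) resolves to the cheapest frontier model; a high bar (complex prompt) forces the top-tier one — the "cheap for simple, capable for complex" behavior described above.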
### 4. Circuit Breaker Awareness

Models with open circuit breakers (recently failing) are excluded from auto selection. This prevents routing to unhealthy providers.
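The exclusion works like a standard circuit breaker: after enough consecutive failures the breaker "opens" and the model is filtered out of the candidate set until a cooldown elapses. A minimal sketch — the threshold, cooldown, and class name are assumptions:

```python
import time

class CircuitBreaker:
    """Minimal per-model open/closed breaker (illustrative sketch;
    the real thresholds and cooldown are not documented)."""

    def __init__(self, failure_threshold=5, cooldown_s=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()  # open the breaker

    def record_success(self):
        self.failures = 0
        self.opened_at = None                  # close the breaker

    def is_open(self):
        if self.opened_at is None:
            return False
        if time.monotonic() - self.opened_at >= self.cooldown_s:
            # Cooldown elapsed: close and allow a trial request through
            self.opened_at = None
            self.failures = 0
            return False
        return True
```

Auto selection would then consider only models whose breaker is closed, e.g. `[m for m in models if not breakers[m].is_open()]`.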
## What Gets Selected
| Request Type | Typical Selection |
|---|---|
| Simple factual question | Fast, cheap model (Haiku, GPT-4o-mini, Flash) |
| Code generation with tools | Capable model (Sonnet, GPT-4o) |
| Complex reasoning | Top-tier model (Opus, o1) |
| Research with citations | Perplexity Sonar Pro |
## Conversation Consistency

When using `conversation_id`, auto mode keeps the same model for the entire conversation. This prevents jarring style changes mid-conversation.

The model is locked on the first request in a conversation and reused for subsequent requests with the same `conversation_id`.
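Conceptually, the lock is first-writer-wins keyed by `conversation_id`. A sketch of that behavior (the in-memory dict stands in for whatever shared storage the service actually uses):

```python
# Hypothetical sketch of the first-request model lock keyed by conversation_id.
conversation_models = {}  # in production this would be shared state, not a local dict

def resolve_model(conversation_id, auto_pick):
    """Return the conversation's locked model, locking it on the first request."""
    if conversation_id is None:
        return auto_pick()  # no conversation: select fresh for each request
    if conversation_id not in conversation_models:
        conversation_models[conversation_id] = auto_pick()  # lock on first request
    return conversation_models[conversation_id]
```

Subsequent requests with the same `conversation_id` skip selection entirely and reuse the locked model, even if the auto-selector would now prefer a different one.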
## Tenant Aliases

You can define aliases to control what `"auto"` and other shorthand names resolve to for your tenant:
```bash
curl -X PUT https://api.brainstormrouter.com/v1/aliases \
  -H "Authorization: Bearer br_live_..." \
  -H "Content-Type: application/json" \
  -d '{"fast": "anthropic/claude-haiku-4-5", "smart": "anthropic/claude-sonnet-4-5"}'
```
Then use `"model": "fast"` or `"model": "smart"` in requests.
## Opting Out

To bypass auto mode, specify a full model ID:

```json
{ "model": "anthropic/claude-sonnet-4-5-20250514" }
```

Or send the `X-BR-Skip-Memory: true` header to prevent auto mode from using memory context for model selection.
## Response Headers

Auto mode adds an `X-BR-Route-Reason` header explaining the selection:

```
X-BR-Route-Reason: auto:complexity-low
X-BR-Model: claude-haiku-4-5-20251001
X-BR-Provider: anthropic
```
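Client code can pull these headers off the response for logging or auditing which model actually served each request. A small sketch (the helper name is ours; the header values below are the example ones from above):

```python
def routing_info(headers):
    """Extract BrainstormRouter routing headers from a response header mapping
    (e.g. requests.Response.headers). Returns None for any header not present."""
    return {
        "reason": headers.get("X-BR-Route-Reason"),
        "model": headers.get("X-BR-Model"),
        "provider": headers.get("X-BR-Provider"),
    }

info = routing_info({
    "X-BR-Route-Reason": "auto:complexity-low",
    "X-BR-Model": "claude-haiku-4-5-20251001",
    "X-BR-Provider": "anthropic",
})
```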