# BrainstormRouter vs OpenRouter

*Commodity pipe vs learning system: price-weighted random routing vs Bayesian optimization.*
## What OpenRouter does well
OpenRouter is the largest AI model aggregator:
- 400+ models from dozens of providers
- Simple pricing: 5% markup on provider costs (or $0 for some models)
- Unified API: one endpoint, any model
- Large community: widely adopted, good documentation
- Prompt caching: transparent caching for supported providers
OpenRouter's strength is accessibility. If you want one API key for every model, OpenRouter makes it trivial.
## Where the architectures diverge
OpenRouter is a commodity pipe — it normalizes provider APIs and adds a markup. The routing is price-weighted: cheapest available endpoint wins. BrainstormRouter is a learning system — it uses the data flowing through it to optimize for quality, cost, and reliability simultaneously.
| Dimension | OpenRouter | BrainstormRouter |
|---|---|---|
| Routing algorithm | Price-weighted random / manual | Thompson Sampling (Bayesian learning) |
| Quality signal | None (pass-through) | Validity scoring + quality EWMA per model |
| Cost control | After-the-fact billing | Pre-request cost prediction + seatbelts |
| Memory | None | Persistent pgvector memory (tenant-wide) |
| Agents | None | Built-in Pi Runner with tool use + streaming |
| Governance | None | Guardian middleware with privacy modes |
| Response metadata | Model used, token counts | Cost, efficiency, cache hit, Guardian status |
| Circuit breakers | Provider-level failover | Per-endpoint dual-trigger state machine |
| Multi-tenancy | None (single API key) | Full RLS, AsyncLocalStorage, ltree hierarchy |
| Semantic cache | Prompt caching (provider) | Vector similarity cache (request-level) |
| Self-hosted | No | Open source |
## The fundamental gap: feedback loops
OpenRouter processes its 10,000th request exactly the way it processed its first. The 10,001st request gains no advantage from the previous 10,000.
BrainstormRouter accumulates three kinds of intelligence from every request:
### 1. Routing intelligence (Thompson Sampling)
Every response updates the posterior distribution for the selected model. From `src/router/model-bandit.ts:60-71`:

```ts
const bandit = new ModelBandit({
  minValidity: 0.5,       // Never route below 50% success rate
  minQuality: 0.3,        // Never route below 30% quality
  initialC: 1.5,          // Explore aggressively early
  minC: 0.5,              // Mild exploration at steady state
  thompsonThreshold: 500, // Switch from UCB1 to Thompson at 500 samples
});
```
After 500 requests, the router knows which model performs best for your specific workload — not generically, not by price, but by measured reward.
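The mechanics of Thompson Sampling are simple to sketch. The following is an illustrative standalone sketch, not BrainstormRouter's actual implementation: each model gets a Beta posterior over its success rate, the router draws one sample per model, and the highest draw wins. `Arm`, `pickModel`, and `update` are hypothetical names.

```typescript
// Marsaglia–Tsang gamma sampler (shapes < 1 handled via the boost trick).
function sampleGamma(shape: number): number {
  if (shape < 1) {
    // Gamma(a) = Gamma(a + 1) * U^(1/a)
    return sampleGamma(shape + 1) * Math.pow(Math.random(), 1 / shape);
  }
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    let x: number, v: number;
    do {
      // Standard normal via Box–Muller; 1 - random() keeps the log finite.
      x = Math.sqrt(-2 * Math.log(1 - Math.random())) * Math.cos(2 * Math.PI * Math.random());
      v = 1 + c * x;
    } while (v <= 0);
    v = v * v * v;
    const u = Math.random();
    if (u < 1 - 0.0331 * x ** 4) return d * v;
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

// A Beta(a, b) draw is Gamma(a) / (Gamma(a) + Gamma(b)).
function sampleBeta(a: number, b: number): number {
  const x = sampleGamma(a);
  return x / (x + sampleGamma(b));
}

interface Arm { name: string; successes: number; failures: number; }

// Thompson Sampling: draw a plausible success rate from each arm's
// Beta posterior and route to the arm with the highest draw. Uncertain
// arms produce wide draws, so they still get explored occasionally.
function pickModel(arms: Arm[]): Arm {
  let best = arms[0];
  let bestDraw = -1;
  for (const arm of arms) {
    const draw = sampleBeta(1 + arm.successes, 1 + arm.failures);
    if (draw > bestDraw) { bestDraw = draw; best = arm; }
  }
  return best;
}

// After the response is scored, tighten the selected arm's posterior.
function update(arm: Arm, success: boolean): void {
  if (success) arm.successes++;
  else arm.failures++;
}
```

The key property is that exploration falls out of the posteriors themselves: as an arm accumulates evidence, its draws concentrate around its true rate and the router stops wasting traffic on it unless it actually wins.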
### 2. Cost intelligence (Guardian EWMA)
Guardian tracks output/input token ratios per tenant. From `src/api/middleware/guardian.ts:59-85`:

```ts
const EWMA_ALPHA = 0.1;
existing.ratio = EWMA_ALPHA * observed + (1 - EWMA_ALPHA) * existing.ratio;
```
After ~20 requests, Guardian can predict your cost before the provider call. This powers the cost seatbelt — reject a $5 request before it happens, not after.
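How the EWMA becomes a pre-request prediction can be sketched in a few lines. This is an illustrative sketch, assuming Bernoulli-style per-request updates and made-up per-token prices; `TenantStats`, `observe`, and `estimateCostUsd` are hypothetical names, not Guardian's actual API.

```typescript
const EWMA_ALPHA = 0.1;

interface TenantStats { ratio: number; samples: number; }

// After each completed request, fold the observed output/input token
// ratio into the tenant's EWMA. The first observation seeds the average.
function observe(stats: TenantStats, inputTokens: number, outputTokens: number): void {
  const observed = outputTokens / inputTokens;
  stats.ratio = stats.samples === 0
    ? observed
    : EWMA_ALPHA * observed + (1 - EWMA_ALPHA) * stats.ratio;
  stats.samples++;
}

// Before the provider call: predict output size from the prompt size and
// the learned ratio, then price both sides. Prices here are illustrative.
function estimateCostUsd(
  stats: TenantStats,
  inputTokens: number,
  pricePerInputToken: number,
  pricePerOutputToken: number,
): number {
  const predictedOutput = inputTokens * stats.ratio;
  return inputTokens * pricePerInputToken + predictedOutput * pricePerOutputToken;
}
```

A cost seatbelt is then a one-line check: if the estimate exceeds the tenant's cap, reject the request before any provider call is made.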
### 3. Knowledge intelligence (Memory extraction)
Useful context from completions is extracted asynchronously and stored in pgvector. From `src/db/stores/memory-extraction-store.ts:69-84`:

```sql
SELECT id FROM memory_extraction_queue
WHERE status = 'pending' AND next_run_at <= $1
ORDER BY next_run_at ASC, created_at ASC
LIMIT $2
FOR UPDATE SKIP LOCKED
```
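The `FOR UPDATE SKIP LOCKED` clause is what makes this queue safe under concurrency: rows locked by another worker's transaction are skipped rather than blocked on, so many workers can poll the same table without double-claiming a job. A minimal worker-side sketch, written against a generic client interface (the `DbClient` shape here is illustrative, not BrainstormRouter's actual store API):

```typescript
interface DbClient {
  query(sql: string, params: unknown[]): Promise<{ rows: { id: string }[] }>;
}

// Claim up to `limit` pending jobs inside the caller's transaction.
// SKIP LOCKED means a row held by another worker is simply skipped,
// so concurrent workers never block on or double-claim the same job.
async function claimJobs(db: DbClient, now: Date, limit: number): Promise<string[]> {
  const { rows } = await db.query(
    `SELECT id FROM memory_extraction_queue
     WHERE status = 'pending' AND next_run_at <= $1
     ORDER BY next_run_at ASC, created_at ASC
     LIMIT $2
     FOR UPDATE SKIP LOCKED`,
    [now, limit],
  );
  return rows.map((r) => r.id);
}
```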
OpenRouter doesn't know what you talked about yesterday. BrainstormRouter remembers.
## Code proof: response headers
### OpenRouter response

```http
x-request-id: gen-abc123
```

### BrainstormRouter response

```http
X-BR-Guardian-Status: on
X-BR-Estimated-Cost: $0.02
X-BR-Actual-Cost: $0.018
X-BR-Efficiency: 0.91
X-BR-Guardian-Overhead-Ms: 0.8
```
Every response tells you what it cost, how efficient it was, and how much overhead the intelligence layer added. No separate dashboard or API call needed.
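Since the telemetry rides on standard headers, a client can consume it with no extra API call. A minimal sketch using the standard `Headers` API — the header names come from the example above, but `parseTelemetry` and its parsing rules are illustrative, not a BrainstormRouter SDK function:

```typescript
interface RouterTelemetry {
  estimatedCostUsd: number | null;
  actualCostUsd: number | null;
  efficiency: number | null;
  guardianStatus: string | null;
}

// Pull the per-response telemetry off any fetch() Response's headers.
function parseTelemetry(headers: Headers): RouterTelemetry {
  // Cost headers carry a leading "$" in the example above; strip it.
  const money = (v: string | null) =>
    v === null ? null : Number(v.replace(/^\$/, ""));
  const num = (v: string | null) => (v === null ? null : Number(v));
  return {
    estimatedCostUsd: money(headers.get("X-BR-Estimated-Cost")),
    actualCostUsd: money(headers.get("X-BR-Actual-Cost")),
    efficiency: num(headers.get("X-BR-Efficiency")),
    guardianStatus: headers.get("X-BR-Guardian-Status"),
  };
}
```

With this in place, per-request cost tracking is just `parseTelemetry(response.headers)` after every call.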
## The governance gap
OpenRouter is a shared multi-tenant pipe. Every user's traffic flows through the same infrastructure with the same (lack of) security controls:
| Security capability | OpenRouter | BrainstormRouter |
|---|---|---|
| PII scanning | None | Built-in regex (email, phone, SSN, CC, IP) + pluggable backends |
| Outbound stream scanning | None | Token-by-token StreamingGuardrailEvaluator |
| Stream severing | None | `truncated = true` kills the stream before PII reaches the client |
| Tenant isolation | None | PostgreSQL RLS with SET LOCAL transaction scoping |
| Egress allowlist | None | Per-service domain restrictions with wildcard support |
| Governance enforcement | None | Deterministic keyword matching on streaming chunks |
| SIEM export | None | CEF + ECS JSON structured security events |
| PII air gap | None | `scrubAndTokenize()` / `rehydrate()` with in-memory token map |
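The PII air gap in the last row follows a scrub-then-rehydrate pattern: replace PII with opaque tokens before text leaves the process, keep the mapping only in memory, and restore the originals in the response. The sketch below is illustrative — the regexes, token format, and function bodies are assumptions, not BrainstormRouter's actual implementation (which also covers SSN, credit cards, and IPs):

```typescript
// Illustrative PII patterns; a real scanner covers many more shapes.
const PII_PATTERNS: [string, RegExp][] = [
  ["EMAIL", /[\w.+-]+@[\w-]+\.[\w.]+/g],
  ["PHONE", /\b\d{3}[-.]\d{3}[-.]\d{4}\b/g],
];

// Swap PII for opaque tokens. The token map never leaves process memory,
// so raw values never reach the provider, the cache, or any log line.
function scrubAndTokenize(text: string): { scrubbed: string; tokens: Map<string, string> } {
  const tokens = new Map<string, string>();
  let n = 0;
  let scrubbed = text;
  for (const [kind, pattern] of PII_PATTERNS) {
    scrubbed = scrubbed.replace(pattern, (match) => {
      const token = `<<${kind}_${n++}>>`;
      tokens.set(token, match);
      return token;
    });
  }
  return { scrubbed, tokens };
}

// Restore the original values in the model's response before returning it.
function rehydrate(text: string, tokens: Map<string, string>): string {
  let out = text;
  for (const [token, value] of tokens) out = out.split(token).join(value);
  return out;
}
```

The provider only ever sees `<<EMAIL_0>>`-style placeholders; if the model echoes a token back, rehydration puts the real value in place on the client side of the air gap.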
OpenRouter's architecture makes these features structurally impossible to add without a redesign. It's a shared pipe — there's no tenant context to scope policies against, no streaming interception layer to attach guardrails to, and no structured event pipeline to export violations from.
For developer prototyping, this doesn't matter. For enterprise production workloads with compliance requirements, it's disqualifying.
## When to choose OpenRouter
- You need access to 400+ models from one endpoint
- Your routing needs are simple (cheapest or specific model)
- You don't need memory, governance, or quality scoring
- You want the largest community and ecosystem
- You don't need multi-tenancy
## When to choose BrainstormRouter
- You want routing that improves with usage
- You need per-response cost and efficiency data
- You need persistent memory across requests
- You need multi-tenant isolation for your customers
- You want pre-request cost prediction
- You need circuit breakers with exponential backoff
- You prefer open source with self-hosting