BrainstormRouter vs OpenRouter

Commodity pipe vs learning system: price-weighted random routing vs Bayesian optimization.

What OpenRouter does well

OpenRouter is the largest AI model aggregator:

  • 400+ models from dozens of providers
  • Simple pricing: 5% markup on provider costs (or $0 for some models)
  • Unified API: one endpoint, any model
  • Large community: widely adopted, good documentation
  • Prompt caching: transparent caching for supported providers

OpenRouter's strength is accessibility. If you want one API key for every model, OpenRouter makes it trivial.

Where the architectures diverge

OpenRouter is a commodity pipe — it normalizes provider APIs and adds a markup. The routing is price-weighted: cheapest available endpoint wins. BrainstormRouter is a learning system — it uses the data flowing through it to optimize for quality, cost, and reliability simultaneously.

| Dimension | OpenRouter | BrainstormRouter |
|---|---|---|
| Routing algorithm | Price-weighted random / manual | Thompson Sampling (Bayesian learning) |
| Quality signal | None (pass-through) | Validity scoring + quality EWMA per model |
| Cost control | After-the-fact billing | Pre-request cost prediction + seatbelts |
| Memory | None | Persistent pgvector memory (tenant-wide) |
| Agents | None | Built-in Pi Runner with tool use + streaming |
| Governance | None | Guardian middleware with privacy modes |
| Response metadata | Model used, token counts | Cost, efficiency, cache hit, Guardian status |
| Circuit breakers | Provider-level failover | Per-endpoint dual-trigger state machine |
| Multi-tenancy | None (single API key) | Full RLS, AsyncLocalStorage, ltree hierarchy |
| Semantic cache | Prompt caching (provider) | Vector similarity cache (request-level) |
| Self-hosted | No | Open source |

The fundamental gap: feedback loops

OpenRouter processes its 10,000th request exactly the way it processed its first. The 10,001st request gains no advantage from the previous 10,000.

BrainstormRouter accumulates three kinds of intelligence from every request:

1. Routing intelligence (Thompson Sampling)

Every response updates the posterior distribution for the selected model. From src/router/model-bandit.ts:60-71:

```typescript
const bandit = new ModelBandit({
  minValidity: 0.5, // Never route below 50% success rate
  minQuality: 0.3, // Never route below 30% quality
  initialC: 1.5, // Explore aggressively early
  minC: 0.5, // Mild exploration at steady state
  thompsonThreshold: 500, // Switch from UCB1 to Thompson at 500 samples
});
```

After 500 requests, the router knows which model performs best for your specific workload — not generically, not by price, but by measured reward.
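For intuition, here is a minimal sketch of the Beta-posterior sampling that Thompson Sampling performs. It is illustrative only: `ArmStats`, `sampleBeta`, and `pickModel` are hypothetical names, not the actual `ModelBandit` internals, which also blend in UCB1 and the quality EWMA.

```typescript
// Illustrative Thompson Sampling over models: each arm keeps a Beta
// posterior over its success probability, and routing draws one sample
// per arm and picks the winner.
interface ArmStats {
  successes: number;
  failures: number;
}

// Sample Gamma(shape, 1) for integer shapes via a sum of exponentials.
// (A production implementation would use Marsaglia-Tsang for real shapes.)
function sampleGammaInt(shape: number): number {
  let sum = 0;
  for (let i = 0; i < shape; i++) sum -= Math.log(Math.random());
  return sum;
}

// Beta(a, b) as a ratio of two Gamma draws.
function sampleBeta(a: number, b: number): number {
  const x = sampleGammaInt(a);
  const y = sampleGammaInt(b);
  return x / (x + y);
}

// Route to the model whose sampled success probability is highest.
// Uncertain arms occasionally win a draw, so exploration is built in.
function pickModel(arms: Map<string, ArmStats>): string {
  let best = "";
  let bestDraw = -1;
  for (const [model, s] of arms) {
    const draw = sampleBeta(s.successes + 1, s.failures + 1); // Beta(1,1) prior
    if (draw > bestDraw) {
      bestDraw = draw;
      best = model;
    }
  }
  return best;
}

// Every response sharpens the posterior for the selected model.
function recordOutcome(arms: Map<string, ArmStats>, model: string, ok: boolean): void {
  const s = arms.get(model)!;
  if (ok) s.successes++;
  else s.failures++;
}
```

The key property: a model with a narrow, high posterior wins almost every draw, while an undersampled model still gets occasional traffic until its posterior narrows too.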

2. Cost intelligence (Guardian EWMA)

Guardian tracks output/input token ratios per tenant. From src/api/middleware/guardian.ts:59-85:

```typescript
const EWMA_ALPHA = 0.1;
existing.ratio = EWMA_ALPHA * observed + (1 - EWMA_ALPHA) * existing.ratio;
```

After ~20 requests, Guardian can predict your cost before the provider call. This powers the cost seatbelt — reject a $5 request before it happens, not after.
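A minimal sketch of how that EWMA ratio becomes a pre-request prediction. The `EWMA_ALPHA` update mirrors the snippet above; the price constants and the `predictCost` / `checkSeatbelt` names are hypothetical, not Guardian's actual API.

```typescript
// Illustrative EWMA-based cost prediction, assuming per-token pricing.
const EWMA_ALPHA = 0.1;

interface TenantStats {
  ratio: number; // EWMA of outputTokens / inputTokens for this tenant
}

// Update the learned ratio after each completed request.
function observe(stats: TenantStats, inputTokens: number, outputTokens: number): void {
  const observed = outputTokens / inputTokens;
  stats.ratio = EWMA_ALPHA * observed + (1 - EWMA_ALPHA) * stats.ratio;
}

// Predict total cost BEFORE the provider call: input tokens are known,
// output tokens are estimated from the tenant's learned ratio.
function predictCost(
  stats: TenantStats,
  inputTokens: number,
  inputPricePerToken: number,
  outputPricePerToken: number,
): number {
  const predictedOutput = inputTokens * stats.ratio;
  return inputTokens * inputPricePerToken + predictedOutput * outputPricePerToken;
}

// Cost seatbelt: reject before spending, not after.
function checkSeatbelt(predictedCostUsd: number, limitUsd: number): boolean {
  return predictedCostUsd <= limitUsd;
}
```

With α = 0.1, each new observation carries 10% weight, so roughly 20 requests are enough for the ratio to converge on a tenant's typical workload shape.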

3. Knowledge intelligence (Memory extraction)

Useful context from completions is extracted asynchronously and stored in pgvector. From src/db/stores/memory-extraction-store.ts:69-84:

```sql
SELECT id FROM memory_extraction_queue
WHERE status = 'pending' AND next_run_at <= $1
ORDER BY next_run_at ASC, created_at ASC
LIMIT $2
FOR UPDATE SKIP LOCKED
```

OpenRouter doesn't know what you talked about yesterday. BrainstormRouter remembers.
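A worker draining that queue might follow the pattern below. `Executor` and `claimBatch` are hypothetical names, and the transaction wrapper is an assumption about how the store uses the query; the point is the `FOR UPDATE SKIP LOCKED` semantics, which let multiple workers poll concurrently without double-claiming rows.

```typescript
// Abstracts the DB client (e.g. node-postgres) so the claim pattern is
// visible without a live Postgres. Names are illustrative.
interface Executor {
  query(text: string, values: unknown[]): Promise<{ rows: { id: string }[] }>;
}

const CLAIM_SQL = `
  SELECT id FROM memory_extraction_queue
  WHERE status = 'pending' AND next_run_at <= $1
  ORDER BY next_run_at ASC, created_at ASC
  LIMIT $2
  FOR UPDATE SKIP LOCKED`;

// Claim a batch inside a transaction. Rows locked by a concurrent
// worker are skipped, not waited on, so no row is processed twice and
// no worker blocks on another.
async function claimBatch(db: Executor, batchSize: number): Promise<string[]> {
  await db.query("BEGIN", []);
  try {
    const { rows } = await db.query(CLAIM_SQL, [new Date(), batchSize]);
    const ids = rows.map((r) => r.id);
    if (ids.length > 0) {
      await db.query(
        "UPDATE memory_extraction_queue SET status = 'processing' WHERE id = ANY($1)",
        [ids],
      );
    }
    await db.query("COMMIT", []);
    return ids;
  } catch (err) {
    await db.query("ROLLBACK", []);
    throw err;
  }
}
```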

Code proof: response headers

OpenRouter response

```
x-request-id: gen-abc123
```

BrainstormRouter response

```
X-BR-Guardian-Status: on
X-BR-Estimated-Cost: $0.02
X-BR-Actual-Cost: $0.018
X-BR-Efficiency: 0.91
X-BR-Guardian-Overhead-Ms: 0.8
```

Every response tells you what it cost, how efficient it was, and how much overhead the intelligence layer added. No separate dashboard or API call needed.
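Reading that telemetry client-side takes a few lines. This helper is hypothetical: the header names come from the example above, but the parsing choices (stripping the `$` prefix, numeric coercion, the `off` default) are assumptions.

```typescript
// Hypothetical client-side parser for the X-BR-* response headers.
interface RouterTelemetry {
  guardianStatus: string;
  estimatedCost: number;
  actualCost: number;
  efficiency: number;
  guardianOverheadMs: number;
}

function parseTelemetry(headers: Record<string, string>): RouterTelemetry {
  // Cost headers carry a "$" prefix in the example above; strip it.
  const num = (name: string) => Number((headers[name] ?? "0").replace("$", ""));
  return {
    guardianStatus: headers["X-BR-Guardian-Status"] ?? "off",
    estimatedCost: num("X-BR-Estimated-Cost"),
    actualCost: num("X-BR-Actual-Cost"),
    efficiency: num("X-BR-Efficiency"),
    guardianOverheadMs: num("X-BR-Guardian-Overhead-Ms"),
  };
}
```

This makes per-request cost attribution a side effect of normal API usage: log `actualCost` alongside your own request IDs and you have a billing breakdown without touching a dashboard.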

The governance gap

OpenRouter is a shared multi-tenant pipe. Every user's traffic flows through the same infrastructure with the same (lack of) security controls:

| Security capability | OpenRouter | BrainstormRouter |
|---|---|---|
| PII scanning | None | Built-in regex (email, phone, SSN, CC, IP) + pluggable backends |
| Outbound stream scanning | None | Token-by-token StreamingGuardrailEvaluator |
| Stream severing | None | `truncated = true` kills the stream before PII reaches the client |
| Tenant isolation | None | PostgreSQL RLS with SET LOCAL transaction scoping |
| Egress allowlist | None | Per-service domain restrictions with wildcard support |
| Governance enforcement | None | Deterministic keyword matching on streaming chunks |
| SIEM export | None | CEF + ECS JSON structured security events |
| PII air gap | None | scrubAndTokenize() / rehydrate() with in-memory token map |
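The PII air gap deserves a closer look. Here is a minimal sketch of the idea, assuming a single email regex and a `<PII_n>` token format; the real `scrubAndTokenize()` / `rehydrate()` cover more PII classes (phone, SSN, CC, IP), and this simplified version is not their actual implementation.

```typescript
// Illustrative PII air gap: scrub before the provider call, rehydrate
// after. The original values live only in an in-memory map, so they
// never leave the process or reach the model provider.
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.]+/g;

function scrubAndTokenize(text: string): {
  scrubbed: string;
  tokens: Map<string, string>;
} {
  const tokens = new Map<string, string>();
  let i = 0;
  const scrubbed = text.replace(EMAIL_RE, (match) => {
    const token = `<PII_${i++}>`;
    tokens.set(token, match); // PII stays in-memory only
    return token;
  });
  return { scrubbed, tokens };
}

// Swap tokens in the model's response back for the original values.
function rehydrate(text: string, tokens: Map<string, string>): string {
  let out = text;
  for (const [token, original] of tokens) out = out.split(token).join(original);
  return out;
}
```

The provider only ever sees placeholder tokens; the client still gets a response containing the real values.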

OpenRouter's architecture cannot take on these features without a redesign. It's a shared pipe: there's no tenant context to scope policies against, no streaming interception layer to attach guardrails to, and no structured event pipeline to export violations from.

For developer prototyping, this doesn't matter. For enterprise production workloads with compliance requirements, it's disqualifying.

When to choose OpenRouter

  • You need access to 400+ models from one endpoint
  • Your routing needs are simple (cheapest or specific model)
  • You don't need memory, governance, or quality scoring
  • You want the largest community and ecosystem
  • You don't need multi-tenancy

When to choose BrainstormRouter

  • You want routing that improves with usage
  • You need per-response cost and efficiency data
  • You need persistent memory across requests
  • You need multi-tenant isolation for your customers
  • You want pre-request cost prediction
  • You need circuit breakers with exponential backoff
  • You prefer open source with self-hosting