CAF: Cryptographic Agent Framework with 5-Minute Certificates

2026-02-20

caf-mtls, caf-mesh, spiffe, anomaly-kill-switch

LOCKSTEP TRACEABILITY MATRIX
---
api_endpoints: ["POST /v1/agent/auth/cert", "POST /v1/agent/auth/revoke"]
sdk_methods_updated: ["none (agent-level crypto, not developer SDK surface)"]
mcp_tools_updated: ["none"]
---

What We Built

The Cryptographic Agent Framework (CAF) gives every AI agent a cryptographic identity. Agents authenticate via mutual TLS (mTLS) using short-lived certificates, with a 5-minute lifetime by design, issued by BrainstormRouter's internal Certificate Authority. The CA key and signed certificates are stored in AWS Secrets Manager. The ALB runs in mutual TLS passthrough mode: it does not validate the client certificate itself but forwards it to the gateway in the X-Amzn-Mtls-Clientcert header, leaving verification to the gateway.
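To illustrate the passthrough path, here is a minimal sketch of how a gateway might recover the client certificate from the forwarded header. It assumes the header value is URL-encoded PEM, which is how AWS documents passthrough mode. The function name client_certs_from_header is hypothetical and not part of BrainstormRouter, and the sketch does not validate the chain against the CA.

```python
import re
from urllib.parse import unquote

# Matches one PEM certificate block, including its BEGIN/END armor.
_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def client_certs_from_header(header_value: str) -> list[str]:
    """Return the PEM certificates found in an X-Amzn-Mtls-Clientcert value.

    Assumption: the ALB URL-encodes the PEM text. Verifying the chain
    against the CA is out of scope for this sketch.
    """
    # unquote() leaves '+' untouched, which matters for base64 bodies.
    pem_text = unquote(header_value)
    certs = _PEM_BLOCK.findall(pem_text)
    if not certs:
        raise ValueError("no PEM certificate found in header")
    return certs
```

A real handler would pass the first returned certificate to an X.509 parser and check its signature, serial number, and expiry before trusting the caller.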

CAF Phase 1 establishes the identity foundation: SPIFFE-compatible IDs (spiffe://brainstorm.internal/agent//), RSA certificate signing, RBAC-gated CSR exchange, and Agent JWT authentication for the certificate issuance endpoint.
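As a small illustration of the ID format, the sketch below parses an agent SPIFFE ID and checks its trust domain. The two path segments after agent/ are elided in the text above, so the sketch treats them as opaque strings and does not name them. parse_agent_spiffe_id is a hypothetical helper, not a BrainstormRouter API.

```python
from urllib.parse import urlsplit

TRUST_DOMAIN = "brainstorm.internal"


def parse_agent_spiffe_id(spiffe_id: str) -> tuple[str, str]:
    """Return the two path segments that follow /agent/ in a SPIFFE ID."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or parts.netloc != TRUST_DOMAIN:
        raise ValueError("unexpected scheme or trust domain")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    segments = parts.path.split("/")
    # Expected path: /agent/<first>/<second> -> ["", "agent", first, second]
    if len(segments) != 4 or segments[1] != "agent" or not all(segments[2:]):
        raise ValueError("expected spiffe://<trust-domain>/agent/<a>/<b>")
    return segments[2], segments[3]
```

Rejecting empty segments means an ID whose placeholders were never filled in fails parsing instead of matching some default identity.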

CAF Phase 2 adds the enforcement layer: an Anomaly Kill Switch that can revoke a compromised agent's certificate, take it offline in the service registry, transition its lifecycle to "quarantined" via ARM, capture a forensic snapshot, and broadcast a cross-instance notification via Redis pub/sub — all as a fire-and-forget operation that never blocks the hot path.

Why It Matters

When your AI agents have production access to databases, APIs, and customer data, "trust but verify" is not enough. CISOs need cryptographic proof of agent identity, short-lived credentials that limit blast radius, and a kill switch for compromised agents that operates in seconds, not hours.

BrainstormRouter is the only AI gateway that provides SPIFFE-grade agent identity with automated certificate lifecycle. No other platform gives you 5-minute certificate rotation, RBAC-enforced CSR exchange, and a multi-step kill switch that simultaneously revokes the cert, quarantines the agent, captures forensics, and notifies all gateway instances.

How It Works

Certificate Issuance Flow:

1. Agent authenticates with a JWT (claims: iss=brainstormrouter, aud=brainstormrouter-api, sub=, tid=)
2. Agent generates RSA keypair locally, submits CSR to POST /v1/agent/auth/cert
3. RBAC check: agent must have cert.issue permission
4. CA validates CSR (rejects non-RSA keys), signs with 5-minute expiry
5. Signed certificate returned; agent uses it for mTLS on subsequent requests
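The gating part of this flow can be sketched as follows. This is an illustration under stated assumptions, not the gateway's actual code: it assumes the JWT signature has already been verified, that aud is a single string, and that the agent's permissions arrive as a set of strings. The names authorize_cert_request and validity_window are hypothetical. CSR parsing and signing are omitted.

```python
from datetime import datetime, timedelta

CERT_LIFETIME = timedelta(minutes=5)
EXPECTED_ISS = "brainstormrouter"
EXPECTED_AUD = "brainstormrouter-api"


def authorize_cert_request(claims: dict, permissions: set[str]) -> None:
    """Raise PermissionError unless the claims and RBAC grant allow issuance.

    Assumes the JWT signature was verified before this function is called.
    """
    if claims.get("iss") != EXPECTED_ISS or claims.get("aud") != EXPECTED_AUD:
        raise PermissionError("unexpected issuer or audience")
    if not claims.get("sub") or not claims.get("tid"):
        raise PermissionError("missing sub or tid claim")
    if "cert.issue" not in permissions:
        raise PermissionError("agent lacks cert.issue")


def validity_window(now: datetime) -> tuple[datetime, datetime]:
    """Return (not_before, not_after) for a certificate issued at `now`."""
    return now, now + CERT_LIFETIME
```

Keeping the lifetime in one constant makes the 5-minute policy easy to audit, since both the signer and any expiry check can read the same value.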

Kill Switch Pipeline (5 steps, fire-and-forget):

1. Cert Revocation    → Revoke certificate by serial number
2. Registry Offline   → Remove from AgentServiceRegistry
3. Lifecycle → ARM    → Transition profile to "quarantined"
4. Forensic Snapshot  → Capture payload, violated rules, blast radius
5. Redis Broadcast    → Publish to caf:kill-switch channel
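One way to picture the fire-and-forget behaviour is the asyncio sketch below. It is a sketch under assumptions, not the actual implementation: the step callables are hypothetical stand-ins for the five stages, the steps run in the listed order, and a failing step is recorded without stopping the rest. The post does not say how the real pipeline handles partial failure or whether steps run concurrently.

```python
import asyncio
from typing import Awaitable, Callable

# Each step receives the certificate serial number of the agent being killed.
Step = Callable[[str], Awaitable[None]]


async def run_kill_switch(
    serial: str, steps: list[tuple[str, Step]]
) -> dict[str, str]:
    """Run every step in order; record failures instead of stopping early."""
    results: dict[str, str] = {}
    for name, step in steps:
        try:
            await step(serial)
            results[name] = "ok"
        except Exception as exc:  # one failed step must not skip the rest
            results[name] = f"failed: {exc}"
    return results


def trigger_kill_switch(
    serial: str, steps: list[tuple[str, Step]]
) -> asyncio.Task:
    """Schedule the pipeline on the running loop and return immediately.

    The caller should keep the returned task referenced so it is not
    garbage-collected before it finishes.
    """
    return asyncio.create_task(run_kill_switch(serial, steps))
```

Because trigger_kill_switch only schedules the task, the request handler that detected the anomaly can return at once while the pipeline completes in the background.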

Infrastructure:

CA key stored in AWS Secrets Manager (brainstorm-router/production/caf-ca-bundle)
ALB configured for mutual auth passthrough (client cert forwarded as header)
IAM task role: SecretsManagerCAF (Get/Create/Put) + CloudWatchMetrics (PutMetricData)
Redis pub/sub for cross-instance certificate revocation propagation
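For the cross-instance propagation, each gateway instance needs some local record of revoked serials that it updates when a broadcast arrives. The sketch below shows one possible shape. The JSON payload with a "serial" field is an assumption, since the post does not describe the message format, and RevocationCache is a hypothetical class. Subscribing to the Redis channel is left out so the example stays self-contained.

```python
import json


class RevocationCache:
    """In-memory set of revoked certificate serials for one gateway instance."""

    def __init__(self) -> None:
        self._revoked: set[str] = set()

    def handle_message(self, raw: bytes) -> None:
        """Apply one broadcast message (assumed JSON with a 'serial' field)."""
        payload = json.loads(raw)
        # Normalise case so hex serials compare equal regardless of source.
        self._revoked.add(str(payload["serial"]).lower())

    def is_revoked(self, serial: str) -> bool:
        """Check a serial during mTLS verification on the hot path."""
        return serial.lower() in self._revoked
```

With 5-minute certificates, an entry only needs to be remembered until the revoked certificate would have expired anyway, so a production version could prune old serials on a timer.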

The Numbers

5-minute certificate lifetime — short enough to limit blast radius, long enough for multi-turn agent conversations
5-step kill switch — cert revocation + registry offline + lifecycle quarantine + forensic snapshot + Redis broadcast
RSA-only certificates — CA rejects EC key CSRs (OID validation)
AWS Secrets Manager for CA storage — no local key material on gateway instances
Sub-second kill switch execution — fire-and-forget, non-blocking to hot path

Competitive Edge

No competing AI gateway provides cryptographic agent identity. Portkey uses API keys. OpenRouter uses API keys. Letta has no multi-agent security model. BrainstormRouter is the only platform where an agent's identity is cryptographically provable, automatically rotated, and revocable in seconds. This is the CISO story: "Every agent has an identity. Every identity has an expiry. Every anomaly has a kill switch."

Lockstep Checklist

[x] API Routes: /v1/agent/auth/cert (issue) and /v1/agent/auth/revoke (revoke) implemented with RBAC.
[x] TS SDK: Agent-level crypto is not exposed through the developer SDK (agents use JWT + CSR directly).
[x] Python SDK: Same — agent-level crypto, not developer SDK surface.
[ ] MCP Schemas: Not applicable.
[x] Master Record: Listed under "CAF Phases 1 & 2" in master-capability-record.md.