Trust Envelope (A-1) — JOSE-aligned EdDSA schema, signing, audit-only middleware

2026-05-09

security, api

LOCKSTEP TRACEABILITY MATRIX
---
api_endpoints: ["none — internal infra; A-3+ wires routing/budget/guardrails to read it"]
sdk_methods_updated: ["none — envelope is internal until A-8 counterfactual replay"]
mcp_tools_updated: ["none"]
---

What We Built

The first foundational PR of the Trust Envelope thread. Introduces a JOSE-aligned EdDSA JWT carrying the unified decision input that downstream routing, budget, and guardrail middleware will read once enforcement engages in A-3 / A-4 / A-5. A-1 is audit-only: the envelope is synthesized + signed + decoded onto the Hono context on every request, surfaced via X-BR-Envelope: audit response header, but no routing/budget/guardrail decision yet reads it.

Why It Matters

Pre-envelope, every privileged decision in the gateway reassembled its inputs from disparate sources — apiKey.tenantId, _principal.roles, _agentIdentity.agentId, _budgetReservation, _anomalyScore — each Hono context variable set by a different middleware, with no single signed object representing "what this request is allowed to do." The pattern was load-bearing AND invisible: two synthetic-ApiKey paths (mTLS at agent-jwt-auth.ts:231-242, mtls-auth.ts:261-272) made fake ResolvedApiKey objects with hardcoded budget caps that don't match the underlying tenant — and no type-level distinction between real and synthetic.

The trust envelope replaces this. One signed object, six claim groups (br_principal, br_budget, br_scope, br_trust, br_observability, br_test), Ed25519-signed for downstream verification without Redis/DB round-trip. By A-7, the envelope's br_trust.tier and br_trust.xdr_risk will be load-bearing routing inputs; by A-8, counterfactual replay will let auditors simulate "what would have happened with policy X" against historical envelopes.

A-1 ships ONLY the foundation: schema, signing, verifying, audit-only middleware. Production traffic is unaffected — envelopeSynthMode defaults to "off", the middleware no-ops, no envelope is set on the context. Operators flip the config flag to "audit-only" when ready to observe synthesis on real traffic.

How It Works

├── schema.ts          TrustEnvelope type + Zod validator
├── keys.ts            Ed25519 KeyBundle + Secrets Manager bootstrap
├── sign.ts            EdDSA SignJWT, 5-min TTL default
├── verify.ts          Key rotation; JWTExpired preserved (no downgrade oracle)
├── synth.ts           Hono context → envelope payload (reuse-disciplined)
├── compat.ts          ResolvedApiKey shim for legacy consumers
├── middleware.ts      Audit-only Hono middleware, fail-open on synth error
└── index.ts           Barrel export
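
As a rough illustration of the signer's claim handling noted for sign.ts (5-minute default TTL, with iat/exp/jti always set by the signer rather than the caller), the stamping step could look like the sketch below. The function name stampClaims, the injected newJti generator, and the explicit nowSec parameter are assumptions made for illustration; the actual signing goes through an EdDSA SignJWT call that is not shown here.

```typescript
// Registered time/identity claims that the signer owns.
interface TimedClaims {
  iat: number;
  exp: number;
  jti: string;
}

// Hypothetical helper: overwrite iat/exp/jti regardless of what the caller
// supplied, using a 300-second (5-minute) TTL unless ttlSec is given.
function stampClaims<T extends object>(
  payload: T,
  nowSec: number,
  newJti: () => string,
  ttlSec: number = 300,
): T & TimedClaims {
  // Later properties win in an object spread, so caller-supplied values
  // for iat/exp/jti are replaced here.
  return { ...payload, iat: nowSec, exp: nowSec + ttlSec, jti: newJti() };
}
```

Passing the clock and the jti generator in as arguments keeps the helper deterministic under test; a real signer would presumably read the current time and generate a random identifier itself.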

Wiring: createApiApp accepts envelopeKeys + envelopeSynthMode + envelopeSynthDeps opts. The middleware is registered after agentContextInjectorMiddleware and before guardrailsMiddleware, so auth has resolved before synthesis runs. Config flag lives at gateway.envelope.synth.mode per the existing zod-schema pattern.

Reuse discipline

Every claim mapped to an existing primitive imports/calls the canonical source rather than recomputing. Specifically:

  • ReputationTier imported from src/router/agent-reputation.ts; never duplicate the 5-tier enum (restricted/bronze/silver/gold/platinum).
  • Reputation tier resolved via ReputationEngine.getReputation(agentId).tier; never compute tier from score locally.
  • apiKey.budgetLimitUsd parsed directly; spent fetched via the injected getBudgetSpent callback (boot wiring will connect it to buildBudgetKey() from src/api/shared/budget-keys.ts).
  • Anomaly score injected via getAnomalyScore; A-3+ wiring plugs in AnomalyDetector's composite scorer, and A-1 defaults to 0.
  • Auth-method coercion centralized in mapAuthMethod(): a single place to update if upstream _authMethod strings ever change.
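
The small coercion helpers in synth.ts might look roughly like the following. This is a minimal sketch, not the actual module: the mappings are the ones listed under synth.test.ts in the Test Plan, while the EnvelopeAuthMethod type name, the clampCapUsd helper, and the pass-through of unrecognized period strings are assumptions made for illustration.

```typescript
// Hypothetical name for the auth_method claim values; the real union lives in schema.ts.
type EnvelopeAuthMethod = "supabase_jwt" | "agent_jwt" | "mtls" | "api_key";

// Single place that translates upstream _authMethod strings into envelope values.
function mapAuthMethod(raw: string | undefined): EnvelopeAuthMethod {
  switch (raw) {
    case "sso":
      return "supabase_jwt";
    case "agent-jwt":
      return "agent_jwt";
    case "mtls":
      return "mtls";
    default:
      // Anything unrecognized (or missing) is treated as a plain API key.
      return "api_key";
  }
}

// daily -> day, monthly -> month. Passing other values through unchanged
// is an assumption of this sketch.
function normalizeBudgetPeriod(period: string): string {
  if (period === "daily") return "day";
  if (period === "monthly") return "month";
  return period;
}

// Hypothetical helper: parse apiKey.budgetLimitUsd and clamp any
// non-finite result (NaN, Infinity) to 0.
function clampCapUsd(budgetLimitUsd: string | null | undefined): number {
  const cap = parseFloat(budgetLimitUsd ?? "");
  return isFinite(cap) ? cap : 0;
}
```

Keeping these as pure functions means each mapping can be unit-tested without constructing a Hono context.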

Key management

Production: loadEnvelopeSigningKey({ secretsManagerPrefix }) mirrors the CAF CA pattern at src/security/caf/ca.ts:80-255. First call bootstraps a fresh keypair into Secrets Manager; concurrent ECS task boots handle the race by catching ResourceExistsException and re-reading, so all instances converge on a single key. The local dev path generates an ephemeral keypair in-process: it is never persisted, and a warning is logged.

Rotation: EnvelopeKeyBundle carries current + optional previous. On verification, the current key is tried first; if it fails for any non-expiry reason and previous is present, the previous key is tried. Critical security property: a JWTExpired from current-key verification is preserved and re-thrown — we never silently fall back to previous key for expired tokens (closes the downgrade-oracle attack the codex review specifically called out).
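
The rotation rule above can be sketched as follows. This is an illustrative outline rather than the contents of verify.ts: verifyWith is an injected stand-in for the real JWT verification call, the sketch is synchronous for brevity whereas real verification would normally be async, and expiry detection assumes the jose library's convention that JWTExpired errors carry the code ERR_JWT_EXPIRED.

```typescript
// Shape of the rotation bundle: a current key plus an optional previous key.
interface KeyBundle<K> {
  current: K;
  previous?: K;
}

// Assumption: expiry errors are recognizable by jose's ERR_JWT_EXPIRED code.
function isJwtExpired(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "ERR_JWT_EXPIRED"
  );
}

// Try the current key first; fall back to the previous key only for
// non-expiry failures. verifyWith returns decoded claims or throws.
function verifyWithRotation<K, C>(
  token: string,
  keys: KeyBundle<K>,
  verifyWith: (token: string, key: K) => C,
): C {
  try {
    return verifyWith(token, keys.current);
  } catch (err) {
    // Expiry is final: re-throw so an expired token is never retried
    // against the older key.
    if (isJwtExpired(err)) throw err;
    // No previous key to try: surface the original failure.
    if (keys.previous === undefined) throw err;
    return verifyWith(token, keys.previous);
  }
}
```

The ordering of the two checks in the catch block is the point of the design: the expiry check runs before any fallback, so the outcome for an expired token does not depend on whether a previous key happens to be configured.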

Audit-only middleware semantics

  • mode="off" (default): middleware no-ops, sets trustEnvelope=null and _trustEnvelopeDecoded=null to clear stale state from prior config flips, calls next().
  • mode="audit-only": synthesizes payload from the Hono context, signs, decodes for local-use storage on c.set("trustEnvelope") and c.set("_trustEnvelopeDecoded"), emits X-BR-Envelope: audit response header so probes/agents observe the audit pass is live.
  • Fail-open on synth error: a synthesis or sign failure logs [trust-envelope synth failed] with the cause and continues with trustEnvelope=null. A-1 MUST NOT take prod traffic down. A-3 will flip the failure mode to fail-closed when enforcement engages.
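
The three behaviors above can be sketched as one small factory. This is a simplified, synchronous outline under stated assumptions, not the real middleware.ts: Ctx is a minimal stand-in for the Hono context, synthAndSign is a hypothetical callback bundling synthesis, signing, and decoding, storing the compact token under trustEnvelope and the decoded claims under _trustEnvelopeDecoded is a guess at the split, and resetting both variables at the start of every request (not only in "off" mode) is a choice of this sketch.

```typescript
type EnvelopeSynthMode = "off" | "audit-only";

// Minimal slice of a Hono-style context used by this sketch.
interface Ctx {
  set(key: string, value: unknown): void;
  header(name: string, value: string): void;
}

// Hypothetical result of synthesize + sign + decode.
interface SignedEnvelope {
  token: string;
  decoded: Record<string, unknown>;
}

function createEnvelopeStep(
  mode: EnvelopeSynthMode,
  synthAndSign: ((c: Ctx) => SignedEnvelope) | undefined,
  log: (msg: string, cause: unknown) => void,
): (c: Ctx, next: () => void) => void {
  // Misconfiguration is caught at construction time, not per request.
  if (mode === "audit-only" && synthAndSign === undefined) {
    throw new Error("audit-only mode requires envelope signing keys");
  }
  return (c, next) => {
    // Start from a clean slate so stale state never survives a config flip.
    c.set("trustEnvelope", null);
    c.set("_trustEnvelopeDecoded", null);
    if (mode === "audit-only" && synthAndSign !== undefined) {
      try {
        const env = synthAndSign(c);
        c.set("trustEnvelope", env.token);
        c.set("_trustEnvelopeDecoded", env.decoded);
        c.header("X-BR-Envelope", "audit");
      } catch (err) {
        // Fail open: log the cause and continue with trustEnvelope left as null.
        log("[trust-envelope synth failed]", err);
      }
    }
    next();
  };
}
```

Note that next() is called on every path, which is what keeps an audit-only failure from affecting the request.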

What This Doesn't Fix

  • Routing, budget, and guardrail middleware do NOT yet read the envelope. That's A-3 / A-4 / A-5.
  • The synthetic-ApiKey pattern at agent-jwt-auth.ts:231-242 and mtls-auth.ts:261-272 is unchanged. A-2 (envelope-synthesizer formalized) will move synth into buildAgentContext and start attaching mtls_fingerprint from the mTLS middleware.
  • Reputation successful_calls / failed_calls counters return 0 in A-1; they are wired in A-7 (reputation feeds envelope).
  • parent_chain is empty in A-1; it is populated in A-6 (delegation lineage).
  • SPIFFE URI format in synth.ts:buildSubjectId mirrors caf/ca.ts:137 rather than importing a shared helper. A follow-up PR should extract formatSpiffeUri() into a shared module and call it from both places. Deferred to keep this PR's blast radius constrained.

Test Plan

Production-code unit tests across 7 files (~640 LOC, 60 tests):

  • schema.test.ts: Zod accepts well-formed envelopes; rejects unknown tier, unknown auth_method, negative cap_usd, anomaly_score outside [0,1], missing iss; accepts wildcard * and explicit allowlists for models; accepts optional xdr_risk.
  • keys.test.ts: generateKeyPair returns Ed25519 with unique kid; sign/verify smoke proves public/private keys match; loadEnvelopeSigningKey returns ephemeral keypair when no prefix is configured (and never persists across calls).
  • sign.test.ts: Compact JWS with three segments; kid in protected header; iat/exp/jti overwritten by signer; custom ttlSec honored.
  • verify.test.ts: Round-trip; tamper detection (bit flip in payload segment); expired token rejected with JWTExpired (and never falls back to previous key on expiry); rotation falls back to previous when current rejects; fails when neither key accepts; performance smoke (100 sign+verify roundtrips < 1 sec).
  • synth.test.ts: mapAuthMethod mappings (sso → supabase_jwt, agent-jwt → agent_jwt, mtls → mtls, default → api_key); buildSubjectId SPIFFE for mtls/agent_jwt, user:{userId} for api_key, tenant:{tenantId} fallback; normalizeBudgetPeriod (daily → day, monthly → month); full synth for human + mTLS-agent callers; injected getBudgetSpent + getAnomalyScore deps thread through; throws when apiKey is missing; clamps non-finite cap to 0.
  • compat.test.ts: envelopeToApiKey projects to legacy shape; sandbox tier → environment=test; explicit allowedModels preserved; wildcard models → null; budgetLimitUsd formatted as 2-decimal string; mtls:{agentId} / agent-jwt:{agentId} / envelope:{jti} id patterns; isSyntheticApiKey prefix detection.
  • middleware.test.ts: mode="off" clears context, no header; mode="audit-only" signs, sets context, emits header, decodes sub correctly; audit-only fails open when synth throws (apiKey missing); construction throws when audit-only mode given no keys.

Full unit suite: 7657 passed / 0 failed (839 test files, +7 from new module). pnpm tsgo exit 0. oxlint --type-aware 0 warnings, 0 errors. oxfmt --check clean.
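
To make the compat.test.ts cases above more concrete, here is a rough sketch of the projection they describe. It is illustrative only: the EnvelopeView and LegacyKeyView shapes, their field names, the "*" string as the wildcard representation, and the rule that envelope:{jti} is used when no agent id applies are all assumptions, and the real compat.ts projects onto the full ResolvedApiKey type.

```typescript
// Envelope fields the shim needs; names are illustrative, not the real schema.
interface EnvelopeView {
  jti: string;
  authMethod: "supabase_jwt" | "agent_jwt" | "mtls" | "api_key";
  agentId?: string;
  capUsd: number;
  models: "*" | string[];
}

// Subset of the legacy key shape covered by this sketch.
interface LegacyKeyView {
  id: string;
  budgetLimitUsd: string;
  allowedModels: string[] | null;
}

const SYNTHETIC_PREFIXES = ["mtls:", "agent-jwt:", "envelope:"];

// Build an id that marks the key as envelope-derived rather than a stored key.
function syntheticId(env: EnvelopeView): string {
  if (env.authMethod === "mtls" && env.agentId) return "mtls:" + env.agentId;
  if (env.authMethod === "agent_jwt" && env.agentId) return "agent-jwt:" + env.agentId;
  return "envelope:" + env.jti;
}

function envelopeToApiKey(env: EnvelopeView): LegacyKeyView {
  return {
    id: syntheticId(env),
    // Legacy consumers expect the limit as a 2-decimal string.
    budgetLimitUsd: env.capUsd.toFixed(2),
    // Wildcard models map to null; explicit allowlists are preserved.
    allowedModels: env.models === "*" ? null : env.models,
  };
}

// Prefix detection gives legacy code a way to tell synthetic keys apart.
function isSyntheticApiKey(key: { id: string }): boolean {
  return SYNTHETIC_PREFIXES.some((p) => key.id.slice(0, p.length) === p);
}
```

A prefix check is only a runtime convention; it does not provide the type-level distinction between real and synthetic keys that the Why It Matters section notes is missing today.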

Lockstep

  • TypeScript SDK: no change (envelope is internal infra)
  • Python SDK: no change
  • MCP tools: no change (no agent-facing surface; A-8 adds /v1/envelope/replay)
  • API surface: no change (no new routes)
  • site/public/routes.json: unchanged
  • OpenAPI: unchanged

By design, A-1's lockstep matrix is empty — this is greenfield infrastructure that downstream PRs (A-3 onward) will expose. The empty matrix is honest, not an oversight.

Codex review focus

  • Turn 1 (design): synth.ts claim assembly correctness, especially br_principal.user_id source (apiKey.userId vs _principal.id); reuse discipline (every value derived from existing primitive vs recomputed locally); type safety; backward compatibility (no existing call sites broken).
  • Turn 2 (security): verify.ts key-rotation downgrade-oracle check (JWTExpired from current-key NOT swallowed before previous-key attempt); audit-only fail-open semantics correctly preserve request flow; envelope token never leaks via header/log/error message; tenant isolation (every claim that includes a tenant ID is keyed on the apiKey's tenantId, never cross-tenant).