2026-05-08-cache-prefix-backfill-v51

2026-05-08 — v51 migration: backfill cache-prefix nested rows in usage_events

Summary

Closes the R22 risk register's #1 most-flagged item (8/10 agents): historical cache-prefix nested rows still in savings ledger (no backfill). PR #224 closed the WRITE path (cacheStore() rejects models that already start with cache/). This PR runs a one-shot backfill on the historical pollution.

R22 evidence (pre-backfill)

Production /v1/intelligence/savings byPath showed 28 entries with cache/cache/... prefixes, totaling 4,129 historical request rows with model values like:

cache/cache/x-ai/x-ai/grok-3                                  req=7
cache/cache/google/google/gemini-2.5-flash-lite               req=159
cache/cache/deepseek/deepseek-reasoner                        req=585
cache/cache/cache/cache/deepseek/deepseek/deepseek-chat       req=3
cache/cache/cache/cache/cache/cache/cache/cache/cache/cache/x-ai/x-ai/grok-3   req=2
... 23 more

(The double-x-ai/x-ai and google/google are SQL ${provider}/${model} concatenations in the savings tracker's display layer — not extra DB pollution.)

Migration v51

Single SQL statement runs at boot (DB_MIGRATE_ON_BOOT=1):

SET model = REGEXP_REPLACE(model, '^(cache/)+', '')
WHERE model LIKE 'cache/%'
RETURNING 1

The ^(cache/)+ regex strips ALL leading cache/ prefixes regardless of depth (1 to 10+). Idempotent — runs once because cacheStore() (PR #224) blocks the write path; future cycles are no-ops.

The migration logs the affected row count: Migration to v51 complete — stripped cache/ prefix from N rows.

Changes

  • src/db/migrate.tsCURRENT_VERSION 50 → 51; new v51 migration block.
  • src/db/migrate.test.ts — 1 new regression test pinning the regex strip semantics:
  • 1-deep, 10-deep, idempotent-on-clean, no-mid-string-match
  • The existing CURRENT_VERSION matches highest migration block regression test would have caught a forgot-to-bump.

Verification

pnpm test:fast    # 824 files / 7514 tests / 0 failed (was 7513; +1 from regex regression test)
pnpm tsgo         # 0 errors
pnpm check        # format/lint clean

Post-deploy:

# Migration log line at boot:
[migrate] Migration to v51 complete — stripped cache/ prefix from N rows

# /v1/intelligence/savings should show 0 entries with `cache/cache/` prefix:
curl -s https://api.brainstormrouter.com/v1/intelligence/savings \
  | jq '.byPath | map(select(.to | test("cache/cache"))) | length'
# Expected: 0

Risk assessment

  • Idempotent: re-running the regex on already-stripped values is a no-op (the regex matches only leading cache/).
  • Bounded: only ~4,129 rows match model LIKE 'cache/%'; affects the reporting byPath aggregation, NOT individual request paths or revenue.
  • Reversible: each affected row's prior value is encoded in audit_chain entries (per the audit_chain_consistent invariant).
  • No effect on revenue: cost_usd is independent of the model column; backfill only normalizes the routing identity used by the savings ledger and bandit reward signal.

Lockstep checklist

  • [x] Source — migrate.ts v51 block + CURRENT_VERSION bump
  • [x] Test — regex regression test pinning strip semantics
  • [x] Ship log — this file
  • [x] R22 risk register — 8/10 #1-flagged item closed at the data layer
  • [ ] Post-deploy verificationbyPath count of cache/cache entries should drop to 0