Forensics panel 500: listForTenant now unwraps pg.QueryResult.rows
2026-05-06
LOCKSTEP TRACEABILITY MATRIX
---
api_endpoints: ["GET /auth/mesh/forensics", "GET /v1/mesh/forensics"]
sdk_methods_updated: ["none — internal store fix, route surface unchanged"]
mcp_tools_updated: ["security.getForensicSnapshots (call path unchanged)"]
---
What We Built
Fixed ForensicSnapshotStore.listForTenant so it unwraps the pg.QueryResult shape returned by drizzle's tx.execute(). Every dashboard load of the EVIDENCE → Forensic Snapshots panel was 500'ing with TypeError: (intermediate value).map is not a function. The function was treating the QueryResult wrapper ({ rows, rowCount, command, oid, fields }) as if it were the rows array itself.
Why It Matters
This panel is the operator's "Proof of Death" view — the immutable record of what an agent was doing the moment the anomaly engine terminated it. With the panel returning 500, that audit surface was effectively offline: no security operator could see kill events through the dashboard at all.
How It Works
drizzle-orm/node-postgres tx.execute(sql) resolves to a raw pg.QueryResult, not the rows array. The codebase's other tx.execute callsites (src/db/stores/memory-store.ts, src/infra/jobs/content-cleanup.ts) all access .rows on the result; forensic-store.ts was the outlier. When the DISTINCT-ON refactor switched the query from tx.select() (which returns an array) to tx.execute() (which returns a wrapper), the existing cast rows as unknown as Array<...> still compiled, because a double cast through unknown bypasses the compiler's shape check. The mismatch only surfaced at runtime, as the .map TypeError.
// before
const rows = await withTenant(...);
const rawRows = rows as unknown as Array<Record<string, unknown>>;
rawRows.map(...) // TypeError
// after
const result = await withTenant(...);
const rawRows = ((result as { rows?: unknown }).rows ?? []) as Array<Record<string, unknown>>;
rawRows.map(...)
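The unwrap step could also be factored into a small guard so that every tx.execute callsite normalizes the result the same way. The following is a minimal sketch, not code from the repository: unwrapRows, isQueryResultLike, SnapshotRow and QueryResultLike are hypothetical names, and the interface models only the one field of pg.QueryResult that the store reads.

```typescript
// Hypothetical helper (not from the actual codebase): normalizes whatever
// tx.execute() or tx.select() resolved to into a plain array of rows.
type SnapshotRow = Record<string, unknown>;

// Structural stand-in for pg.QueryResult; only the field we read is modeled.
interface QueryResultLike {
  rows: SnapshotRow[];
}

// Runtime check that a value carries an array-valued `rows` property.
function isQueryResultLike(value: unknown): value is QueryResultLike {
  return (
    typeof value === "object" &&
    value !== null &&
    Array.isArray((value as { rows?: unknown }).rows)
  );
}

function unwrapRows(result: unknown): SnapshotRow[] {
  // tx.select()-style results are already arrays; pass them through.
  if (Array.isArray(result)) return result as SnapshotRow[];
  // tx.execute()-style results wrap the rows in a QueryResult object.
  if (isQueryResultLike(result)) return result.rows;
  // Anything else is treated as "no rows", matching the `?? []` fallback.
  return [];
}

// Usage with an object shaped like the wrapper described above.
const wrapped = {
  rows: [{ id: "a" }],
  rowCount: 1,
  command: "SELECT",
  oid: 0,
  fields: [],
};
console.log(unwrapRows(wrapped).length); // 1
```

Like the shipped fix, this sketch returns an empty array for an unrecognized shape. A stricter variant could throw instead, which would turn a silent empty panel into a visible error if the driver's return shape ever changed again.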
Added three regression tests with a stub db that mimics the real pg.QueryResult shape — these would have caught the bug before deploy.
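A stub of this kind can be quite small. The sketch below is illustrative only and is not the actual test file: makeStubTx, listSnapshots, check and runRegressionChecks are hypothetical names, listSnapshots is a simplified stand-in for the store method, and the captured_at column and newest-first ordering are assumptions made to give the re-sort something to act on.

```typescript
// Sketch only: every name here is a hypothetical stand-in, not the real
// ForensicSnapshotStore API or its test suite.
type SnapshotRow = Record<string, unknown>;

// Same top-level fields as the wrapper node-postgres resolves to.
interface StubQueryResult {
  rows: SnapshotRow[];
  rowCount: number;
  command: string;
  oid: number;
  fields: unknown[];
}

// Stub transaction: execute() resolves to the wrapper, never a bare array.
function makeStubTx(rows: SnapshotRow[]) {
  return {
    execute: async (_query: string): Promise<StubQueryResult> => ({
      rows,
      rowCount: rows.length,
      command: "SELECT",
      oid: 0,
      fields: [],
    }),
  };
}

// Simplified stand-in for the store method under test.
async function listSnapshots(tx: {
  execute(query: string): Promise<unknown>;
}): Promise<SnapshotRow[]> {
  const result = await tx.execute("SELECT ...");
  const rawRows = ((result as { rows?: unknown }).rows ?? []) as SnapshotRow[];
  // Assumed re-sort: newest first by a hypothetical captured_at column.
  return [...rawRows].sort(
    (a, b) =>
      Date.parse(String(b.captured_at)) - Date.parse(String(a.captured_at)),
  );
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

async function runRegressionChecks(): Promise<void> {
  const listed = await listSnapshots(
    makeStubTx([
      { id: "s1", captured_at: "2026-05-01T00:00:00Z" },
      { id: "s2", captured_at: "2026-05-03T00:00:00Z" },
    ]),
  );
  // 1. The wrapper is unwrapped into a real array.
  check(Array.isArray(listed) && listed.length === 2, "rows not unwrapped");
  // 2. Rows come back re-sorted, newest first.
  check(listed[0]?.id === "s2", "rows not re-sorted");
  // 3. An empty result yields an empty array rather than throwing.
  const empty = await listSnapshots(makeStubTx([]));
  check(empty.length === 0, "empty result not handled");
}
```

The point of the stub is that execute() returns the wrapper object rather than an array, so a store that calls .map directly on the result fails in the test the same way it failed in production.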
The Numbers
- 100% of /auth/mesh/forensics calls were 500'ing pre-fix (verified via CW Logs filter pattern=forensics over a 30-min window)
- Failure latency: 8ms (synchronous TypeError, not a hung query)
- Tests added: 3 (rows-array unwrapping, empty result, post-DISTINCT-ON re-sort)
Lockstep Checklist
- [x] API Routes: No route surface change — internal store fix.
- [x] TS SDK: No SDK surface change.
- [x] Python SDK: No SDK surface change.
- [x] MCP Schemas: No tool surface change (security.getForensicSnapshots call path unchanged).
- [x] Master Record: No capability change.