2026-05-08-three-more-ingestors

2026-05-08 — Three more catalog ingestors (DeepSeek, x-AI, Google)

Summary

Closes the second most-flagged item in the R22 risk register (flagged by 5 of 10 agents): catalog ingestor coverage at 2/7 providers. After this PR, coverage is 5/7 (Anthropic + OpenAI + DeepSeek + x-AI + Google). The remaining two (Perplexity, Moonshot) will ship in a separate PR.

Changes

src/router/intelligence/deepseek-ingestor.ts: DeepSeek's /v1/models is OpenAI-compatible (Bearer auth, {object, data: [{id, object, owned_by}]} shape). The response has no created field; pricing remains in provider-catalog-pricing.ts.
src/router/intelligence/xai-ingestor.ts: x-AI's /v1/models matches OpenAI's shape (Bearer auth; created is a unix epoch). The ingestor converts created to ISO using the same pattern as the OpenAI ingestor.
src/router/intelligence/google-ingestor.ts: Google's /v1beta/models differs:
  Auth: ?key= query param (not Bearer)
  Response: {models: [{name: "models/gemini-2.5-flash", displayName, ...}]}
  Strips the models/ prefix from name before emitting modelId, so identifiers match the curated catalog
3 paired test files (16 tests total) covering: liveness shape, no-API-key skip, HTTP error, network error, no-pricing assertion, plus per-provider quirks (epoch conversion for x-AI, query-param auth for Google).
src/router/model-router-init.ts: registers all 3 ingestors on an hourly schedule when their respective API keys (DEEPSEEK_API_KEY, XAI_API_KEY, GOOGLE_API_KEY) are set.
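
The per-provider differences listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the ingestor files: the type and function names (LiveModel, fromOpenAiStyle, fromGoogle, bearerHeaders, googleModelsUrl) and the baseUrl parameter are hypothetical, and created is assumed to be in epoch seconds, as in OpenAI's API.

```typescript
// Hypothetical types and helpers for illustration; not the actual ingestor code.
interface OpenAiStyleModel {
  id: string;
  object: string;
  owned_by: string;
  created?: number; // unix epoch seconds (assumed); absent for DeepSeek
}

interface GoogleModel {
  name: string; // e.g. "models/gemini-2.5-flash"
  displayName?: string;
}

interface LiveModel {
  modelId: string;
  createdAt?: string; // ISO 8601, set only when the provider reports `created`
}

// OpenAI-compatible entry (DeepSeek, x-AI): convert epoch seconds to ISO when present.
function fromOpenAiStyle(m: OpenAiStyleModel): LiveModel {
  const out: LiveModel = { modelId: m.id };
  if (typeof m.created === "number") {
    out.createdAt = new Date(m.created * 1000).toISOString();
  }
  return out;
}

// Google entry: strip the leading "models/" so the id matches the curated catalog.
function fromGoogle(m: GoogleModel): LiveModel {
  return { modelId: m.name.replace(/^models\//, "") };
}

// Bearer auth travels in a header (DeepSeek, x-AI).
function bearerHeaders(apiKey: string): Record<string, string> {
  return { Authorization: `Bearer ${apiKey}` };
}

// Google's key travels in the query string instead.
function googleModelsUrl(baseUrl: string, apiKey: string): string {
  return `${baseUrl}/v1beta/models?key=${encodeURIComponent(apiKey)}`;
}
```

Keeping the normalization in small pure functions like these makes the quirks (epoch conversion, prefix stripping, query-param auth) testable without any network access. Note that none of these helpers touch pricing, which stays in provider-catalog-pricing.ts.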

Verification

pnpm test:fast    # 824 files / 7513 tests / 0 failed (was 7497; +16 from new tests)
pnpm tsgo         # 0 errors
pnpm check        # format/lint clean

Verified each provider's /models endpoint with real production keys before writing its ingestor; the response shapes are documented in code comments.

Post-deploy:

# Should appear in CW Logs at boot:
[router/model-router-init] DeepSeek catalog ingestor registered (hourly)
[router/model-router-init] x-AI catalog ingestor registered (hourly)
[router/model-router-init] Google catalog ingestor registered (hourly)

# After first hourly cycle:
[router/deepseek-ingestor] deepseek ingest: N models reported alive
[router/xai-ingestor] xai ingest: N models reported alive
[router/google-ingestor] google ingest: N models reported alive
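
The boot-time lines above are what key-gated registration would be expected to emit. A minimal sketch of that gating, assuming a hypothetical planRegistrations helper and IngestorSpec shape (the real model-router-init.ts may be structured differently); only the environment variable names and log line format come from this ship log:

```typescript
// Hypothetical shapes for illustration; the real wiring may differ.
type Env = Record<string, string | undefined>;

interface IngestorSpec {
  label: string; // provider name as it appears in the boot log line
  envVar: string; // API key variable that enables the ingestor
}

interface Registration {
  label: string;
  intervalMs: number;
}

const HOUR_MS = 60 * 60 * 1000;

const SPECS: IngestorSpec[] = [
  { label: "DeepSeek", envVar: "DEEPSEEK_API_KEY" },
  { label: "x-AI", envVar: "XAI_API_KEY" },
  { label: "Google", envVar: "GOOGLE_API_KEY" },
];

// Plan an hourly registration for each provider whose API key is set,
// logging one line per registered ingestor.
function planRegistrations(env: Env, log: (line: string) => void): Registration[] {
  const planned: Registration[] = [];
  for (const spec of SPECS) {
    if (!env[spec.envVar]) continue; // key missing or empty: skip this provider
    planned.push({ label: spec.label, intervalMs: HOUR_MS });
    log(`[router/model-router-init] ${spec.label} catalog ingestor registered (hourly)`);
  }
  return planned;
}
```

With this shape, a deployment that sets only XAI_API_KEY would log a single x-AI line at boot, so a missing line in CW Logs points at a missing key rather than a failed ingest.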

Lockstep checklist

[x] Source: 3 new *-ingestor.ts files, each ~95 lines, following the established pattern
[x] Tests: 3 paired test files, 16 tests total, covering the same core surface for each provider plus per-provider quirks
[x] Wiring: model-router-init.ts registers all 3 hourly when their API keys are set
[x] Ship log: this file
[x] R22 risk register: coverage advanced from 2/7 to 5/7
[ ] Perplexity + Moonshot ingestors: separate PR; verify endpoint shapes first

Trajectory

R20 (early today): 0/7 ingestors. "0 ingestors registered" was the longest-standing gap from R17/R18/R19.
R20 (mid-session): 1/7 (Anthropic, PR #218).
R21: 2/7 (OpenAI added, PR #225).
R22 (this PR): 5/7 (DeepSeek + x-AI + Google in one PR).

LiteLLM's "100+ auto-discovered" comparison point is now a 5/7 vs N/M comparison: the gap has narrowed to "long-tail providers we haven't validated yet."