Firewall

Tool call interception, masking, substitution, and downgrade controls.

Tool Call Firewall

The tool-call firewall inspects LLM tool calls before they execute. It runs a 9-step pipeline: deny list, substitution, RBAC scope check on the effective tool name, argument parsing, command injection detection, prompt injection detection, semantic policy, secret redaction, and mask patterns. Depending on the result, it can block, warn, mask, substitute, or downgrade a tool call. Note: RBAC authorization is evaluated on the post-substitution tool name, so a principal needs permission for the replacement tool, not the original.
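The ordering of substitution before the RBAC check can be illustrated with a minimal sketch. The `ToolCall` class, the `effective_call` and `is_authorized` helpers, and the flat `allowed_tools` set are hypothetical names used for illustration only, not the firewall's actual implementation. The sketch also assumes that substitution arguments override caller-supplied arguments on conflict, which the documentation does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    # Hypothetical in-memory representation of a tool call.
    name: str
    args: dict = field(default_factory=dict)


def effective_call(call: ToolCall, substitutions: list) -> ToolCall:
    """Apply the first matching substitution rule, or return the call unchanged."""
    for rule in substitutions:
        if rule["from"] == call.name:
            # Assumption: values from mergeArgs win over caller-supplied values.
            merged = {**call.args, **rule.get("mergeArgs", {})}
            return ToolCall(name=rule["to"], args=merged)
    return call


def is_authorized(call: ToolCall, substitutions: list, allowed_tools: set) -> bool:
    """Check permission against the post-substitution (effective) tool name."""
    return effective_call(call, substitutions).name in allowed_tools


subs = [{"from": "risky_exec", "to": "safe_exec", "mergeArgs": {"sandbox": True}}]
call = ToolCall("risky_exec", {"cmd": "ls"})

print(is_authorized(call, subs, {"safe_exec"}))   # True: permission on the replacement
print(is_authorized(call, subs, {"risky_exec"}))  # False: permission on the original only
```

In this sketch a principal that is only allowed to call `risky_exec` is rejected, because the check runs against `safe_exec`.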

Actions

| Action | Behavior |
| --- | --- |
| pass | Tool call proceeds unmodified |
| warn | Tool call proceeds, warning headers set |
| mask | Regex-matched content replaced with [REDACTED:{name}], call proceeds |
| substitute | Tool name replaced, args merged from substitution config, call proceeds |
| downgrade | Call proceeds with downgradeHint for the router to use a cheaper model |
| block | Tool call rejected, content_filter finish reason returned |

Priority: block > downgrade > substitute > mask > warn > pass
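One plausible reading of the priority order is that, when several actions apply to the same request, the highest-priority one becomes the overall action. The sketch below illustrates that reading. The `PRIORITY` list and `resolve_action` function are hypothetical names, and the fallback to `pass` for an empty list is an assumption.

```python
# Lowest to highest priority, mirroring the documented order
# block > downgrade > substitute > mask > warn > pass.
PRIORITY = ["pass", "warn", "mask", "substitute", "downgrade", "block"]


def resolve_action(actions):
    """Return the highest-priority action; assume 'pass' when nothing fired."""
    return max(actions, key=PRIORITY.index, default="pass")


print(resolve_action(["warn", "mask", "downgrade"]))    # downgrade
print(resolve_action(["mask", "block", "substitute"]))  # block
print(resolve_action([]))                               # pass
```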

Endpoints

List Interceptions

GET /v1/firewall/interceptions?limit=50&offset=0

Returns recent tool-call firewall violation events from the security event log. Use limit and offset to page through the results.

Response:

{
  "interceptions": [
    {
      "id": "evt_abc123",
      "severity": "high",
      "actorId": "key-1",
      "actorType": "api_key",
      "resource": "/v1/chat/completions",
      "details": {
        "action": "block",
        "verdicts": [{ "toolName": "exec", "action": "block", "reasons": ["denied_tool: exec"] }]
      },
      "createdAt": "2026-03-11T10:00:00Z"
    }
  ],
  "total": 42,
  "limit": 50,
  "offset": 0
}
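The `total`, `limit`, and `offset` fields are enough to walk the full result set. The following sketch shows one way to do that. `iter_interceptions` and the `fetch_page` callable are hypothetical: `fetch_page` stands in for whatever HTTP or SDK call returns a page in the response shape shown above, and `fake_fetch` is a stub so the example runs without a server.

```python
from typing import Callable, Iterator


def iter_interceptions(fetch_page: Callable[[int, int], dict], limit: int = 50) -> Iterator[dict]:
    """Yield every interception by advancing offset until total is reached."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["interceptions"]
        yield from items
        offset += len(items)
        # Stop on an empty page too, in case total changes between requests.
        if not items or offset >= page["total"]:
            break


def fake_fetch(limit: int, offset: int) -> dict:
    """Stub standing in for GET /v1/firewall/interceptions."""
    events = [{"id": f"evt_{i}"} for i in range(5)]
    return {
        "interceptions": events[offset:offset + limit],
        "total": len(events),
        "limit": limit,
        "offset": offset,
    }


ids = [event["id"] for event in iter_interceptions(fake_fetch, limit=2)]
print(ids)  # ['evt_0', 'evt_1', 'evt_2', 'evt_3', 'evt_4']
```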

Dashboard Bridge

GET /auth/firewall/interceptions?limit=50&offset=0

Returns the same data as List Interceptions, authenticated with a Supabase JWT for use by the dashboard.

Configuration

Firewall config is set per request via the x_firewall_mode body parameter, or per tenant via settings:

{
  "mode": "block",
  "denyList": ["dangerous_tool"],
  "maskPatterns": [{ "name": "ssn", "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b" }],
  "substitutions": [{ "from": "risky_exec", "to": "safe_exec", "mergeArgs": { "sandbox": true } }],
  "downgradeOnTools": ["expensive_analysis"]
}
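To show how a `maskPatterns` entry could behave, here is a minimal sketch that applies each pattern to a string and replaces matches with the `[REDACTED:{name}]` placeholder. `apply_masks` is a hypothetical helper, and applying the patterns to the serialized arguments is an assumption. The firewall may mask other content as well.

```python
import re


def apply_masks(text: str, mask_patterns: list) -> str:
    """Replace every match of each pattern with [REDACTED:<name>]."""
    for entry in mask_patterns:
        label = f"[REDACTED:{entry['name']}]"
        # A function replacement avoids backslash processing in the label.
        text = re.sub(entry["pattern"], lambda _match, label=label: label, text)
    return text


# Same pattern as in the config above, after JSON unescaping.
mask_patterns = [{"name": "ssn", "pattern": r"\b\d{3}-\d{2}-\d{4}\b"}]

masked = apply_masks('{"note": "SSN 123-45-6789 on file"}', mask_patterns)
print(masked)  # {"note": "SSN [REDACTED:ssn] on file"}
```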

SDK

TypeScript

const interceptions = await client.security.listInterceptions({ limit: 20 });

Python

interceptions = client.security.list_interceptions(limit=20)