Agent Memory (RMM)
Persistent, tenant-wide memory with 4-block core memory, archival storage, and keyword search.
Overview
BrainstormRouter provides tenant-wide persistent memory using a Relational Memory Model (RMM) inspired by the Letta architecture. Every agentic request shares the same memory — no conversation IDs or history arrays needed.
Memory is split across two systems:
- Core memory: A small, always-loaded scratchpad of key facts organized into 4 blocks, stored in per-tenant SQLite databases
- Archival memory: A vector-indexed long-term store backed by Postgres pgvector
Core memory blocks
Core memory is divided into four blocks, each with a default entry limit:
| Block | Default Limit | Purpose |
|---|---|---|
| human | 15 | User preferences, identity, communication style |
| system | 10 | Infrastructure, deployment, environment details |
| project | 15 | Project name, tech stack, architecture decisions |
| general | 15 | Anything else worth remembering |
Entries can be pinned to prevent eviction during memory compaction.
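The block limits and pinning rule above can be illustrated with a small in-memory model. This is a sketch, not the gateway's implementation: the `CoreMemory` and `Entry` names are hypothetical, and the eviction policy (drop the oldest unpinned entry once a block exceeds its limit) is an assumption, since the compaction strategy is not described here.

```python
from dataclasses import dataclass, field
from itertools import count

# Default per-block entry limits, taken from the table above.
DEFAULT_LIMITS = {"human": 15, "system": 10, "project": 15, "general": 15}


@dataclass
class Entry:
    id: int
    fact: str
    pinned: bool = False


@dataclass
class CoreMemory:
    """Hypothetical in-memory model of the four core memory blocks."""

    limits: dict = field(default_factory=lambda: dict(DEFAULT_LIMITS))
    blocks: dict = field(default_factory=lambda: {name: [] for name in DEFAULT_LIMITS})
    _ids: count = field(default_factory=lambda: count(1))

    def append(self, block: str, fact: str, pinned: bool = False) -> Entry:
        entry = Entry(next(self._ids), fact, pinned)
        self.blocks[block].append(entry)
        self._compact(block)
        return entry

    def _compact(self, block: str) -> None:
        # Assumed policy: evict the oldest unpinned entry until the block fits.
        # If every entry is pinned, this sketch lets the block exceed its limit.
        entries = self.blocks[block]
        while len(entries) > self.limits[block]:
            victim = next((e for e in entries if not e.pinned), None)
            if victim is None:
                break
            entries.remove(victim)


mem = CoreMemory()
mem.append("system", "fact 0", pinned=True)  # pinned, so it survives compaction
for i in range(1, 12):
    mem.append("system", f"fact {i}")
```

After the loop, the `system` block holds its limit of 10 entries: the pinned entry plus the nine most recent unpinned ones.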
How agents use memory
When a request runs in agentic mode, The Soul has access to 7 built-in memory tools:
| Tool | Description |
|---|---|
| core_memory_append | Add a fact to a specific block |
| core_memory_replace | Update an existing fact by ID |
| core_memory_delete | Remove a fact by ID |
| core_memory_pin | Pin a fact (prevents eviction) |
| core_memory_unpin | Unpin a previously pinned fact |
| archival_memory_insert | Store text in the vector archive |
| archival_memory_search | Semantic search across archived entries |
The agent decides when to read and write memory autonomously — these tools are called as part of the agentic response, not by your application code.
Tenant-wide scoping
Memory is scoped to your tenant, not to individual conversations or API keys:
```python
# Request 1: teach it something
client.chat.completions.create(
    model="anthropic/claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Our deploy target is us-east-1 on ECS Fargate."}],
    extra_body={"mode": "agentic"},
)

# Request 2 (hours later): it remembers
client.chat.completions.create(
    model="anthropic/claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Where do we deploy?"}],
    extra_body={"mode": "agentic"},
)
# → "You deploy to us-east-1 on ECS Fargate."
```
Session isolation
When BR_MEMORY_SESSION_ISOLATION=1 is set, agent-JWT callers must include a sid (session ID) claim. Memory operations are then scoped to that session within the tenant. API-key callers are unaffected and always see the full tenant memory.
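The scoping rule above can be written as a small decision function. This is a sketch under stated assumptions: the function name, the `caller_kind` values (`"api_key"`, `"agent_jwt"`), and the error raised for a missing `sid` are hypothetical. Only the flag name, the `sid` claim, and the API-key exemption come from the description above.

```python
import os


def resolve_memory_scope(tenant_id, caller_kind, jwt_claims=None, env=None):
    """Return (tenant_id, session_id); session_id is None for tenant-wide scope."""
    env = os.environ if env is None else env
    isolation_on = env.get("BR_MEMORY_SESSION_ISOLATION") == "1"

    # API-key callers always see the full tenant memory.
    if caller_kind == "api_key" or not isolation_on:
        return (tenant_id, None)

    # With isolation on, agent-JWT callers must carry a sid claim.
    sid = (jwt_claims or {}).get("sid")
    if not sid:
        raise PermissionError("agent JWT is missing the required sid claim")
    return (tenant_id, sid)
```

Passing `env` explicitly keeps the function easy to test; in a running service it would fall back to the process environment.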
REST API
All endpoints require API key authentication (Authorization: Bearer br_live_...).
Blocks
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/memory/blocks | List all blocks with entry counts |
| GET | /v1/memory/blocks/:block | List entries in a specific block |
Entries
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/memory/entries | List all core memory entries |
| POST | /v1/memory/entries | Append a new entry (fact, block, pinned) |
| PUT | /v1/memory/entries/:id | Update entry text and/or pin status |
| DELETE | /v1/memory/entries/:id | Delete an entry |
Query
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/memory/query | Keyword search across entries |
Init
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/memory/init | Bootstrap memory from context docs |
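For callers not using an SDK, the endpoints above can be reached with plain HTTP. The sketch below only builds the requests, using Python's standard library. `BASE_URL` is a hypothetical placeholder, not the real API host, and the JSON body shape is an assumption based on the field names listed for `POST /v1/memory/entries` (fact, block, pinned).

```python
import json
from urllib.request import Request

# Hypothetical placeholder; substitute the real API host.
BASE_URL = "https://brainstormrouter.example"


def memory_request(method, path, api_key, body=None):
    """Build an authenticated request for a /v1/memory endpoint."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(f"{BASE_URL}{path}", data=data, headers=headers, method=method)


# List all blocks with entry counts.
list_req = memory_request("GET", "/v1/memory/blocks", "br_live_...")

# Append a new entry to the system block.
append_req = memory_request(
    "POST",
    "/v1/memory/entries",
    "br_live_...",
    {"fact": "Deploy target is us-east-1 on ECS Fargate", "block": "system", "pinned": False},
)

# Delete an entry by ID.
delete_req = memory_request("DELETE", "/v1/memory/entries/entry-id", "br_live_...")
```

Each request can then be sent with `urllib.request.urlopen(req)` once `BASE_URL` points at a real host.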
SDK usage
TypeScript
```typescript
import BrainstormRouter from "brainstormrouter";

const client = new BrainstormRouter({ apiKey: "br_live_..." });

// List blocks
const { blocks } = await client.memory.blocks();

// List all entries
const { entries } = await client.memory.entries();

// Append a fact
await client.memory.append("Deploy target is us-east-1 on ECS Fargate", {
  block: "system",
});

// Update a fact
await client.memory.update("entry-id", {
  fact: "Deploy target is us-west-2 on ECS Fargate",
});

// Pin an entry
await client.memory.update("entry-id", { pinned: true });

// Search by keyword
const { results } = await client.memory.query("deploy");
// results: [{ entry: { id, fact, block, pinned }, score: 0.85 }]

// Delete
await client.memory.remove("entry-id");
```
Python
```python
from brainstormrouter import BrainstormRouter

client = BrainstormRouter(api_key="br_live_...")

# List blocks
blocks = client.memory.blocks()

# List all entries
entries = client.memory.entries()

# Append a fact
client.memory.append(fact="Deploy target is ECS Fargate", block="system")

# Update a fact
client.memory.update("entry-id", fact="Deploy target is us-west-2")

# Pin an entry
client.memory.update("entry-id", pinned=True)

# Search by keyword
results = client.memory.query("deploy")

# Delete
client.memory.remove("entry-id")
```
MCP tools
The MCP server exposes 5 memory tools for AI agents connecting via the Model Context Protocol:
| Tool | Permission | Description |
|---|---|---|
| br_memory_list | memory.read | List all core memory entries |
| br_memory_store | memory.write | Store a new memory entry |
| br_memory_query | memory.read | Search entries by keyword relevance |
| br_memory_delete | memory.write | Delete an entry by ID |
| br_memory_update | memory.write | Update entry text and/or pin status |
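The permission column above amounts to a lookup from tool name to required permission. A minimal sketch of such a check follows. The `authorize_tool` helper and the idea of a caller holding a set of granted permissions are assumptions for illustration; only the tool names and permission strings come from the table.

```python
# Tool-to-permission mapping, copied from the table above.
TOOL_PERMISSIONS = {
    "br_memory_list": "memory.read",
    "br_memory_store": "memory.write",
    "br_memory_query": "memory.read",
    "br_memory_delete": "memory.write",
    "br_memory_update": "memory.write",
}


def authorize_tool(tool, granted):
    """Return True if the caller's granted permissions cover the tool."""
    required = TOOL_PERMISSIONS.get(tool)
    if required is None:
        raise KeyError(f"unknown memory tool: {tool}")
    return required in granted


# A read-only agent can list and query, but not store, update, or delete.
read_only = {"memory.read"}
can_query = authorize_tool("br_memory_query", read_only)
can_store = authorize_tool("br_memory_store", read_only)
```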
Asynchronous extraction
Memory extraction doesn't happen inline — it would add too much latency. After a completion returns, the gateway enqueues an extraction job using a Postgres-backed durable job queue with FOR UPDATE SKIP LOCKED for concurrent worker processing. The response is returned immediately; background workers claim and process jobs asynchronously.
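The claim step of such a queue is commonly written as a single statement that selects one pending row with `FOR UPDATE SKIP LOCKED` and marks it as running, so concurrent workers never pick the same job. The sketch below shows that pattern. The `extraction_jobs` table, its columns, and the `claim_next_job` helper are hypothetical, and `conn` is assumed to be any DB-API connection to Postgres.

```python
# Hypothetical schema: extraction_jobs(id, payload, status, created_at, claimed_at).
CLAIM_SQL = """
UPDATE extraction_jobs
SET status = 'running', claimed_at = now()
WHERE id = (
    SELECT id
    FROM extraction_jobs
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload
"""


def claim_next_job(conn):
    """Claim one pending job; return (id, payload), or None if the queue is empty."""
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
        conn.commit()  # committing releases the row lock and publishes the new status
        return row
    finally:
        cur.close()
```

Rows locked by another worker's in-flight claim are skipped rather than waited on, which is what lets several workers poll the same table without blocking each other.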
Storage and isolation
Core memory is stored in per-tenant SQLite databases at . Archival memory uses Postgres with pgvector for vector indexing.
Tenant isolation is enforced at the filesystem level (separate SQLite files per tenant) for core memory, and by Row-Level Security policies for archival memory in Postgres. There is no cross-tenant data access.
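Filesystem-level isolation of this kind can be sketched as one SQLite file per tenant, with the tenant ID validated before it is used in a path. Everything concrete below is an assumption for illustration: the caller-supplied `base_dir`, the `<tenant_id>.sqlite` naming, the allowed tenant ID characters, and the helper names.

```python
import re
import sqlite3
from pathlib import Path

# Assumed tenant ID format; rejecting anything else blocks path traversal.
_TENANT_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def tenant_db_path(base_dir, tenant_id):
    """Map a tenant ID to its own database file under base_dir."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return Path(base_dir) / f"{tenant_id}.sqlite"


def open_tenant_db(base_dir, tenant_id):
    """Open (creating if needed) the SQLite database for one tenant."""
    path = tenant_db_path(base_dir, tenant_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(path)
```

Because each connection is bound to a single file, a query issued for one tenant has no way to reach another tenant's rows.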
What's shipped vs. planned
Shipped:
- Core memory CRUD (append, replace, delete, pin/unpin) via API, SDKs, MCP, and dashboard
- Keyword search/query via API, SDKs, MCP, and dashboard
- 7 agent tools for autonomous memory management
- Memory init (bootstrap from context documents)
- Per-tenant SQLite isolation
- Session isolation flag (BR_MEMORY_SESSION_ISOLATION)
- Async extraction queue
Planned:
- Archival memory search as a REST endpoint (currently agent-only via tool)
- Cross-system unification (core RMM + Postgres chunks into a single query surface)
- Memory provenance and mutation audit trail