Memory Layer

The project brain for AI coding agents

Sprintra captures every Claude Code session on your machine and links it to the project work it produced — the sprints, decisions, stories, and team trail already in your workspace. Local-first capture, on-device semantic recall, no extra services. Zero added LLM cost.

That single design choice — memory that's wired into your project management substrate — is what separates Sprintra from a standalone memory plugin. The next session's briefing doesn't just remind you what you said; it shows you what you decided, what shipped, and what's still open.

What the memory layer does for you

Capability	How it shows up
Every prompt and tool call captured	Hooks Claude Code's UserPromptSubmit and PostToolUse events
Session summary written by the agent at Stop	No background compression service — the agent writes its own summary as the final tool call
Relevant past summaries surfaced on session start	On-device similarity search picks the top-N matches from your local cache
Linked to the work itself	Each session ties to the sprint, decisions, and stories it touched
Cost per tool call	$0 — capture and recall are free
Cloud storage per heavy user per year	~6 MB; tiny digests only, raw transcripts stay local
Multi-user privacy	Per-user scope by default; admin or owner role required for cross-user audit
Offline recall	No network needed once the local cache is warm
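
To make the first row concrete: Claude Code hooks run a command and hand it a JSON payload on stdin. The sketch below shows one way a handler could turn that payload into a small capture record. The field names (`hook_event_name`, `session_id`, `prompt`, `tool_name`) are assumptions to check against the Claude Code hooks reference, and `to_capture_record` and its record shape are hypothetical, not Sprintra's actual code.

```python
import time
from typing import Optional


def to_capture_record(payload: dict) -> Optional[dict]:
    """Turn a hook payload into a small capture record (hypothetical shape)."""
    event = payload.get("hook_event_name")
    base = {
        "session_id": payload.get("session_id"),
        "ts": time.time(),
        "event": event,
    }
    if event == "UserPromptSubmit":
        return {**base, "prompt": payload.get("prompt", "")}
    if event == "PostToolUse":
        return {**base, "tool": payload.get("tool_name", "")}
    return None  # this sketch ignores every other event
```

A real handler would read the payload with `json.load(sys.stdin)`, append the record to a local buffer, and exit quickly so the prompt or tool call is never held up.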

Local-first capture, on-device recall

When a Claude Code session ends, the agent writes a short structured summary — what was discussed, key decisions, open questions, pushback, pending asks — into a small SQLite buffer in your home directory. The same buffer caches a similarity vector for that summary so the next session can pull the most relevant past summaries without a cloud round-trip.

This means three useful things:

Same-machine recall is instant. Top-matched past summaries are appended to the next session's briefing in milliseconds. No background worker, no separate index.
Recall works offline. Once the cache is warm, similarity search runs purely on-device.
Cross-machine handoff still works. A tiny digest syncs to Sprintra Cloud (default 30-day retention) so when you log in from a new laptop the briefing still has the prior context.
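
As a rough illustration of the buffer and recall described above, here is a minimal sketch that stores a summary next to its similarity vector in SQLite and ranks cached summaries by cosine similarity. The table schema, the function names, and the choice of cosine similarity are all assumptions made for illustration; the actual buffer layout is not documented here.

```python
import json
import math
import sqlite3
import struct
import time


def open_buffer(path=":memory:"):
    """Open (or create) the summary buffer. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS summaries (
               session_id TEXT PRIMARY KEY,
               created_at REAL NOT NULL,
               summary    TEXT NOT NULL,  -- JSON: discussed, decisions, open questions
               vector     BLOB NOT NULL   -- packed float32 similarity vector
           )"""
    )
    return conn


def save_summary(conn, session_id, summary, vector):
    """Store the structured summary together with its cached vector."""
    blob = struct.pack(f"{len(vector)}f", *vector)
    conn.execute(
        "INSERT OR REPLACE INTO summaries VALUES (?, ?, ?, ?)",
        (session_id, time.time(), json.dumps(summary), blob),
    )
    conn.commit()


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_matches(conn, query_vec, n=3):
    """Return the n best (score, session_id) pairs, highest score first."""
    scored = []
    for session_id, blob in conn.execute("SELECT session_id, vector FROM summaries"):
        vec = struct.unpack(f"{len(blob) // 4}f", blob)
        scored.append((cosine(query_vec, vec), session_id))
    scored.sort(reverse=True)
    return scored[:n]
```

Everything here runs against a single local file, which is why recall needs neither a background worker nor a network connection once the cache holds a few summaries.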

You can verify what's cached on this machine from the dashboard under Settings → Local memory cache: count of cached summaries, last cache update, and pinned items.

Memory linked to the work itself

Capture-only memory tools answer one question well: “what did we just discuss?” That's useful, but a coding session almost never lives by itself. It produces decisions, opens stories, lands commits, ships releases.

Sprintra stores those artifacts already, so each session digest gets cross-linked to them automatically. When the next session loads, the briefing surfaces:

Active sprint and its current goal
Decisions made in the last working session, with their rationale
Stories you completed and the ones still open
Pinned summaries that explain why things are the way they are
The top relevant past summaries pulled by on-device similarity

That is what “project brain” means in practice. Memory by itself drifts; memory wired into the project state stays anchored to what actually shipped.

3-tier retention model

What gets captured, where it's stored, and how long we keep it.

Tier	Stores	Retention	Sync
Cloud	Prompts + session summaries (digests)	30 days · pinned: forever	Yes (per-user scope)
Local	Cached embeddings + index over Claude Code's own session files	90 days (configurable)	Never
Project artifacts	Stories, decisions, features, notes, comments	Forever	Yes (project-scoped, shared)
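
The retention rules in the table can be read as a small decision function. The sketch below encodes the documented defaults (30 days for cloud, 90 days for local, forever for project artifacts and pinned summaries); the tier keys and the `is_expired` helper are hypothetical names, not part of any Sprintra API.

```python
from datetime import datetime, timedelta, timezone

# Default retention per tier, in days. None means "kept forever".
RETENTION_DAYS = {"cloud": 30, "local": 90, "project": None}


def is_expired(tier, created_at, now, pinned=False):
    """Return True when an item has outlived its tier's retention window."""
    days = RETENTION_DAYS[tier]
    if days is None:
        return False  # project artifacts are never rotated
    if tier == "cloud" and pinned:
        return False  # pinned summaries are kept indefinitely
    return now - created_at > timedelta(days=days)
```

For example, a 31-day-old cloud digest is expired unless it is pinned, while a 31-day-old local index entry is still inside its 90-day window.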

Past prompts

Every message you send to the agent is captured to your private timeline, fire-and-forget — never blocks the prompt from reaching Claude. Per-user attribution by default. The next session's briefing surfaces a “Recent asks” line for the last 3 prompts in the past 24 hours; older prompts are searchable from the dashboard and through the agent's search tools.
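
The “Recent asks” rule (last 3 prompts within the past 24 hours) is simple enough to sketch. The `recent_asks` function and the `(timestamp, text)` tuple shape below are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def recent_asks(prompts, now, limit=3, window=timedelta(hours=24)):
    """prompts: iterable of (timestamp, text). Returns newest-first texts."""
    fresh = [(ts, text) for ts, text in prompts if now - ts <= window]
    fresh.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in fresh[:limit]]
```

Prompts older than the window are simply left out of the briefing line; they remain available through search.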

Pinned summaries

By default, session summaries rotate after 30 days. Pin a summary to keep it indefinitely — this is for the strategic conversations you want to re-read three months from now (architectural debates, scope decisions, anything load-bearing for future work). Pinned summaries also rank higher in the next session's “Last conversation” section.

Relevant past observations & the <private> tag

On every session start, Sprintra builds a query from your project, repo path, and the latest commit subjects, then runs an on-device similarity search over your local cache. The top matches are appended to the briefing under “Relevant past observations.” If a session is irrelevant you'll never see it; if it's relevant you'll see it before you have to look for it.
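
One plausible way to build that query is to join the three inputs into a short text that is then embedded and compared against the cache. The format below is a guess for illustration only; `build_recall_query` and its labels are hypothetical.

```python
def build_recall_query(project, repo_path, commit_subjects, max_commits=5):
    """Combine project, repo path, and recent commit subjects into one query text."""
    parts = [f"project: {project}", f"repo: {repo_path}"]
    for subject in commit_subjects[:max_commits]:
        if subject.strip():  # skip blank lines from the commit log
            parts.append(f"commit: {subject.strip()}")
    return "\n".join(parts)
```

The commit subjects could come from something like `git log -5 --pretty=%s`, one subject per line.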

Need to ask the agent something genuinely sensitive? Wrap your message in a <private>…</private> tag. Sprintra strips the tag from the prompt the agent sees and skips capture entirely — no cloud digest, no local embedding, no recall in any future session. Use this for one-off prompts that shouldn't leave any trail.
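
A minimal sketch of that behavior, assuming the tag is matched with a regular expression and that the text inside the tag is still passed to the agent. `process_prompt` is a hypothetical helper, not Sprintra's implementation.

```python
import re

# Matches <private>...</private>, across newlines, in any letter case.
PRIVATE_TAG = re.compile(r"<private>(.*?)</private>", re.DOTALL | re.IGNORECASE)


def process_prompt(raw):
    """Return (prompt_for_agent, should_capture)."""
    if PRIVATE_TAG.search(raw) is None:
        return raw, True
    # Drop the tag itself but keep the wrapped text for the agent.
    cleaned = PRIVATE_TAG.sub(lambda m: m.group(1), raw).strip()
    return cleaned, False
```

When `should_capture` is false, nothing downstream runs: no digest, no embedding, no recall entry.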

For a stronger guarantee, set strict_local_memory in your config. That mode disables every cloud round-trip the memory layer makes — no digest upload, no embed call — while keeping the on-device cache and your project workspace fully working. (Project sync for sprints, decisions, stories etc. continues normally; only the memory-layer cloud calls go quiet.)
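
Conceptually, strict mode is a gate in front of the memory layer's cloud calls. The sketch below uses hypothetical operation names (`digest_upload`, `embed`, `project_sync`) to show the idea; only the `strict_local_memory` key comes from the config.

```python
# Operations the memory layer would send to the cloud (hypothetical names).
MEMORY_CLOUD_CALLS = {"digest_upload", "embed"}


def is_allowed(operation, config):
    """Block memory-layer cloud calls in strict mode; allow everything else."""
    if config.get("strict_local_memory", False) and operation in MEMORY_CLOUD_CALLS:
        return False
    return True
```

Project sync is deliberately outside the blocked set, which matches the note above that sprints, decisions, and stories keep syncing.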

Search every transcript on this machine

Claude Code already writes the raw transcript for every session to disk. Sprintra indexes those files in place, without copying them, and gives you full-text search across every indexed session on this machine, subject to the local retention window (90 days by default). Searches return in under 100 ms and need no network.

# One-time: scan your session history and build the index
sprintra transcript reindex

# List recent transcripts
sprintra transcript list --since 2026-04-01

# Keyword search across every session
sprintra transcript search "auth bug fix"

# Open the actual transcript file (no copy made)
sprintra transcript view <session-id>

# Disk usage / oldest / newest / retention
sprintra transcript status

# Manual prune (default cutoff: 90 days)
sprintra transcript prune --before 2026-01-01 --dry-run
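
To show what “indexing in place” can mean, here is a minimal sketch of a keyword index that stores only file paths, never transcript contents. It assumes transcripts are `.jsonl` files under some root directory; the tokenizer, the AND-style matching, and both function names are illustrative choices, not the CLI's actual internals.

```python
import re
from collections import defaultdict
from pathlib import Path

TOKEN = re.compile(r"[a-z0-9]+")


def build_index(root):
    """Map each token to the set of transcript paths that contain it."""
    index = defaultdict(set)
    for path in sorted(Path(root).rglob("*.jsonl")):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        for token in set(TOKEN.findall(text)):
            index[token].add(path)  # only the path is stored, not the text
    return index


def search(index, query):
    """Return the paths that contain every token in the query."""
    tokens = TOKEN.findall(query.lower())
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)
```

Because the index holds paths only, viewing a hit means opening the original file, which is the same “no copy made” property the commands above rely on.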

Privacy and multi-user behavior

Consider three developers working on the same repo and the same project. Each user's prompts and session summaries are private to them; the others cannot read them by default. Cross-user reads require the admin or owner role (the audit use case). The shared artifacts (features, decisions, notes, comments, work sessions) remain visible at the project level, as before.
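
The read rule for private data can be summarized in a few lines. This is a sketch of the policy as described, with hypothetical names; it is not the server-side check itself.

```python
AUDIT_ROLES = {"admin", "owner"}


def can_read_private(requester_id, owner_id, requester_role):
    """Owners of the data always read it; others need an audit-capable role."""
    if requester_id == owner_id:
        return True
    return requester_role in AUDIT_ROLES
```

Shared project artifacts sit outside this check entirely, since they are visible to everyone on the project.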

Raw transcripts never sync to the cloud on their own. They stay on the machine that produced them. The cross-machine “what was I doing yesterday” flow works through the digest (a few KB), which does live in the cloud. Need a specific transcript on another device? Run sprintra transcript share <id> --to-cloud for an explicit, single-transcript opt-in upload.

Three opt-out flags in your Sprintra config, plus the strict_local_memory switch described earlier:

# ~/.sprintra/config.json

{
  "capture_user_prompts":     false,   // disables prompt capture
  "capture_session_digests":  false,   // disables session summaries
  "capture_transcript_index": false,   // disables local indexing
  "strict_local_memory":      true     // blocks all memory-layer cloud calls
}
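
If you script against this file, note that the `//` comments in the example are not valid strict JSON. The sketch below strips such comments before parsing and fills in the documented defaults; whether the real config loader tolerates comments is not stated here, and `parse_config` and `load_config` are hypothetical helpers.

```python
import json
import re
from pathlib import Path

# Defaults taken from the configuration reference on this page.
DEFAULTS = {
    "capture_user_prompts": True,
    "capture_session_digests": True,
    "capture_transcript_index": True,
    "strict_local_memory": False,
}

# Matches either a JSON string (kept as-is) or a // comment (dropped).
_STRING_OR_COMMENT = re.compile(r'"(?:\\.|[^"\\])*"|//[^\n]*')


def parse_config(text):
    """Parse config text, ignoring // comments that sit outside strings."""
    stripped = _STRING_OR_COMMENT.sub(
        lambda m: m.group(0) if m.group(0).startswith('"') else "", text
    )
    return {**DEFAULTS, **json.loads(stripped)}


def load_config(path=None):
    """Load ~/.sprintra/config.json, falling back to defaults if it is absent."""
    path = Path(path) if path else Path.home() / ".sprintra" / "config.json"
    return parse_config(path.read_text(encoding="utf-8")) if path.exists() else dict(DEFAULTS)
```

Matching strings first keeps a value such as a URL containing `//` from being mistaken for a comment.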

To delete past data, purge cloud entries from the dashboard, or run sprintra transcript prune --before DATE for local data.

How this compares

If you arrived here looking at memory plugins like claude-mem, the differences worth knowing about are below. The comparison aims to be fair to both products.

Dimension	Sprintra Memory Layer	claude-mem
Cost per tool call	$0 — agent self-summary at Stop	LLM-priced compression service
Local cache	SQLite buffer + cached vectors	ChromaDB / on-disk vector store
Recall	On-device similarity search at session start	On-demand vector lookup
Linked to project state	Yes — sprints, decisions, stories, releases	No — standalone memory tool
Cloud retention	30 days, pinned forever, ~6 MB / heavy user / year	N/A (local-only by default)
Multi-user / per-user privacy	Built in; admin audit role	Single-machine
IDE coverage	Claude Code today; any MCP-capable agent via the same API	Claude Code only

Picking between them depends on your goal. If you want a self-contained memory cache for one developer on one machine, claude-mem is a clean choice. If you want memory that lives next to the rest of your project work and follows a team across machines, that's what Sprintra is designed for.

How this differs from Claude's Auto Memory

Claude's Auto Memory captures curated facts about you — your name, preferences, ongoing projects — and feeds them back across conversations. It's great for personalisation, and the two systems work fine alongside each other.

Sprintra is a different shape:

Dimension	Sprintra Memory Layer	Claude's Auto Memory
What gets stored	Episodic session summaries + project artifacts	Curated personal facts
Recall mechanism	Similarity search over past sessions, on-device	Fact lookup on conversation context
Project linkage	Direct — tied to sprints, decisions, stories	None — conversation-scoped
Where it lives	Your machine + your Sprintra workspace	Anthropic-managed
Best for	Engineering teams: “what did we ship and why?”	Individual users: “remember my preferences”

Use both. Auto Memory keeps your conversation context personal; Sprintra keeps the project context shared and queryable.

Quickstart

# 1. Install the latest plugin
# Inside Claude Code:
/plugin marketplace add Sprintra-io/sprintra-mcp
/plugin install sprintra@sprintra

# 2. Install the CLI (with transcript subcommand)
npm install -g @sprintra/cli@latest

# 3. One-time backfill: index your existing transcripts
sprintra transcript reindex

# 4. Open a fresh Claude Code session — verify a "Relevant past
#    observations" section appears on session start once you have
#    a few sessions cached. The dashboard's Local memory cache tab
#    shows your cached count + last update.

Configuration reference

Setting	Where	Default
Cloud retention (days)	Org settings (admin)	30
Local retention (days)	`SPRINTRA_LOCAL_RETENTION_DAYS`	90
Capture user prompts	~/.sprintra/config.json	true
Capture session summaries	~/.sprintra/config.json	true
Capture transcript index	~/.sprintra/config.json	true
Strict local memory	~/.sprintra/config.json	false
Pin a summary (preserve indefinitely)	Dashboard or agent action	off (unpinned)
`<private>` prompt tag	Inline in your message	opt-in per prompt
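
For scripts that need the effective local retention, the environment variable can be read with a fallback to the documented default of 90 days. How Sprintra itself validates the value is not specified; the handling of bad input below is an assumption.

```python
import os


def local_retention_days(env=None, default=90):
    """Read SPRINTRA_LOCAL_RETENTION_DAYS, falling back to the default on bad input."""
    env = os.environ if env is None else env
    raw = env.get("SPRINTRA_LOCAL_RETENTION_DAYS", "").strip()
    try:
        days = int(raw)
    except ValueError:
        return default  # unset or not a number
    return days if days > 0 else default
```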

Want the full story?

The architecture write-up: The Memory Layer architecture — zero LLM cost session capture. Multi-user behavior: /docs/teams.