
Memory Layer

The project brain for AI coding agents

Sprintra captures every Claude Code session on your machine and links it to the project work it produced — the sprints, decisions, stories, and team trail already in your workspace. Local-first capture, on-device semantic recall, no extra services. Zero added LLM cost.

That single design choice — memory that's wired into your project management substrate — is what separates Sprintra from a standalone memory plugin. The next session's briefing doesn't just remind you what you said; it shows you what you decided, what shipped, and what's still open.

What the memory layer does for you

| Capability | How it shows up |
| --- | --- |
| Every prompt and tool call captured | Hooks Claude Code's UserPromptSubmit and PostToolUse events |
| Session summary written by the agent at Stop | No background compression service; the agent writes its own summary as the final tool call |
| Relevant past summaries surfaced on session start | On-device similarity search picks the top-N matches from your local cache |
| Linked to the work itself | Each session ties to the sprint, decisions, and stories it touched |
| Cost per tool call | $0; capture and recall are free |
| Cloud storage / heavy user / year | ~6 MB; tiny digests only, raw transcripts stay local |
| Multi-user privacy | Per-user scope by default; admin role required for cross-user audit |
| Offline recall | No network needed once the local cache is warm |

Local-first capture, on-device recall

When a Claude Code session ends, the agent writes a short structured summary — what was discussed, key decisions, open questions, pushback, pending asks — into a small SQLite buffer in your home directory. The same buffer caches a similarity vector for that summary so the next session can pull the most relevant past summaries without a cloud round-trip.

This means three useful things:

  • Same-machine recall is instant. Top-matched past summaries are appended to the next session's briefing in milliseconds. No background worker, no separate index.
  • Recall works offline. Once the cache is warm, similarity search runs purely on-device.
  • Cross-machine handoff still works. A tiny digest syncs to Sprintra Cloud (default 30-day retention) so when you log in from a new laptop the briefing still has the prior context.

You can verify what's cached on this machine from the dashboard under Settings → Local memory cache: count of cached summaries, last cache update, and pinned items.
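As a rough illustration of the buffer described above, the sketch below stores one structured summary and its similarity vector per session in SQLite. The table name, column names, and function names are assumptions made for this example; Sprintra's actual schema is not documented here.

```python
import json
import sqlite3
import time

# Hypothetical schema: one row per session, vector kept as JSON text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS summaries (
    session_id TEXT PRIMARY KEY,
    created_at REAL NOT NULL,
    pinned     INTEGER NOT NULL DEFAULT 0,
    summary    TEXT NOT NULL,
    vector     TEXT NOT NULL
)
"""

def open_buffer(path=":memory:"):
    """Open (or create) the local buffer; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def save_summary(conn, session_id, summary, vector, now=None):
    """Store the summary dict and its vector for one session (unpinned)."""
    conn.execute(
        "INSERT OR REPLACE INTO summaries "
        "(session_id, created_at, pinned, summary, vector) VALUES (?, ?, 0, ?, ?)",
        (session_id, time.time() if now is None else now,
         json.dumps(summary), json.dumps(vector)),
    )
    conn.commit()

def load_all(conn):
    """Return (session_id, summary, vector) tuples, oldest first."""
    rows = conn.execute(
        "SELECT session_id, summary, vector FROM summaries ORDER BY created_at"
    ).fetchall()
    return [(sid, json.loads(s), json.loads(v)) for sid, s, v in rows]
```

Keeping the vector next to the summary in the same row is what lets the next session rank past summaries without a separate index or a network call.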

Memory linked to the work itself

Capture-only memory tools answer one question well: “what did we just discuss?” That's useful, but a coding session almost never lives by itself. It produces decisions, opens stories, lands commits, ships releases.

Sprintra stores those artifacts already, so each session digest gets cross-linked to them automatically. When the next session loads, the briefing surfaces:

  • Active sprint and its current goal
  • Decisions made in the last working session, with their rationale
  • Stories you completed and the ones still open
  • Pinned summaries that explain why things are the way they are
  • The top relevant past summaries pulled by on-device similarity

That's the project brain part of the positioning. Memory by itself drifts; memory wired into the project state stays anchored to what actually shipped.

3-tier retention model

What gets captured, where it's stored, and how long we keep it.

| Tier | Stores | Retention | Sync |
| --- | --- | --- | --- |
| Cloud | Prompts + session summaries (digests) | 30 days · pinned: forever | Yes (per-user scope) |
| Local | Cached embeddings + index over Claude Code's own session files | 90 days (configurable) | Never |
| Project artifacts | Stories, decisions, features, notes, comments | Forever | Yes (project-scoped, shared) |

Past prompts

Every message you send to the agent is captured to your private timeline, fire-and-forget — never blocks the prompt from reaching Claude. Per-user attribution by default. The next session's briefing surfaces a “Recent asks” line for the last 3 prompts in the past 24 hours; older prompts are searchable from the dashboard and through the agent's search tools.
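The "Recent asks" rule (last 3 prompts within the past 24 hours) can be sketched as a small filter. The function name and the (timestamp, text) tuple shape are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

def recent_asks(prompts, now, limit=3, window=timedelta(hours=24)):
    """prompts: iterable of (timestamp, text). Newest-first texts inside the window."""
    cutoff = now - window
    fresh = [(ts, text) for ts, text in prompts if cutoff <= ts <= now]
    fresh.sort(key=lambda p: p[0], reverse=True)
    return [text for _, text in fresh[:limit]]
```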

Pinned summaries

By default, session summaries rotate after 30 days. Pin a summary to keep it indefinitely — this is for the strategic conversations you want to re-read three months from now (architectural debates, scope decisions, anything load-bearing for future work). Pinned summaries also rank higher in the next session's “Last conversation” section.
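A minimal sketch of the rotation rule, assuming each summary is a dict with hypothetical `created_at` and `pinned` fields: anything older than the retention window is dropped unless it is pinned.

```python
from datetime import datetime, timedelta, timezone

def rotate(summaries, now, retention_days=30):
    """Keep summaries that are pinned or newer than the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return [s for s in summaries if s["pinned"] or s["created_at"] >= cutoff]
```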

Relevant past observations & the <private> tag

On every session start, Sprintra builds a query from your project, repo path, and the latest commit subjects, then runs an on-device similarity search over your local cache. The top matches are appended to the briefing under “Relevant past observations.” If a session is irrelevant you'll never see it; if it's relevant you'll see it before you have to look for it.
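To make the query-then-rank step concrete, here is a toy sketch. It builds the query from the three inputs named above and ranks cached summaries by cosine similarity. The bag-of-words `embed` function is a stand-in chosen so the example runs without a model; it is not the embedding Sprintra uses, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

def build_query(project, repo_path, commit_subjects):
    """Join project name, repo path, and recent commit subjects into one query string."""
    return " ".join([project, repo_path, *commit_subjects])

def embed(text):
    """Toy embedding: lowercase token counts (assumption, for illustration only)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_matches(query, cached, n=3):
    """cached: {session_id: summary_text}. Return up to n ids with nonzero similarity."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), sid) for sid, text in cached.items()]
    scored = [(score, sid) for score, sid in scored if score > 0.0]
    scored.sort(key=lambda p: (-p[0], p[1]))  # best score first, id as tie-breaker
    return [sid for _, sid in scored[:n]]
```

Dropping zero-score entries mirrors the behavior described above: sessions with nothing in common with the current work never reach the briefing.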

Need to ask the agent something genuinely sensitive? Wrap your message in a <private>…</private> tag. Sprintra strips the tag from the prompt the agent sees and skips capture entirely — no cloud digest, no local embedding, no recall in any future session. Use this for one-off prompts that shouldn't leave any trail.
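The tag handling can be pictured as a small preprocessing step. This is a sketch under the assumption that the tag markers are removed while the wrapped text is forwarded to the agent; `prepare_prompt` is a hypothetical name, not part of Sprintra's API.

```python
import re

PRIVATE = re.compile(r"<private>(.*?)</private>", re.DOTALL)

def prepare_prompt(message):
    """Return (text_for_agent, should_capture) for one user message."""
    if PRIVATE.search(message):
        # Strip the tag markers, keep the content, and skip capture entirely.
        return PRIVATE.sub(r"\1", message).strip(), False
    return message, True
```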

For a stronger guarantee, set strict_local_memory in your config. That mode disables every cloud round-trip the memory layer makes — no digest upload, no embed call — while keeping the on-device cache and your project workspace fully working. (Project sync for sprints, decisions, stories etc. continues normally; only the memory-layer cloud calls go quiet.)

Search every transcript on this machine

Claude Code already writes the raw transcript for every session to disk. Sprintra indexes those files in place — no copy made — and gives you full-text search across every session you've ever had on this machine. Sub-100 ms, no network required.

# One-time: scan your session history and build the index
sprintra transcript reindex

# List recent transcripts
sprintra transcript list --since 2026-04-01

# Keyword search across every session
sprintra transcript search "auth bug fix"

# Open the actual transcript file (no copy made)
sprintra transcript view <session-id>

# Disk usage / oldest / newest / retention
sprintra transcript status

# Manual prune (default cutoff: 90 days)
sprintra transcript prune --before 2026-01-01 --dry-run
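For intuition about what "indexing in place" means, the sketch below builds a token-to-file index that stores only paths, never transcript contents, and answers keyword queries by intersecting the matching file sets. The `.jsonl` extension, the directory layout, and the function names are assumptions; this is not how the `sprintra transcript` commands are implemented.

```python
import re
from collections import defaultdict
from pathlib import Path

def build_index(transcript_dir):
    """Map each lowercase token to the set of transcript paths that contain it."""
    index = defaultdict(set)
    for path in sorted(Path(transcript_dir).glob("*.jsonl")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for token in set(re.findall(r"[a-z0-9]+", text.lower())):
            index[token].add(path)  # store the path only; the file is never copied
    return index

def search(index, query):
    """Return sorted paths of transcripts containing every token in the query."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)
```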

Privacy and multi-user behavior

Consider three developers working on the same repo and the same project. Each user's prompts and session summaries are private to them; the others cannot read them by default. Cross-user reads require the admin or owner role (the audit use case). The shared artifacts (features, decisions, notes, comments, work sessions) remain visible at the project level, exactly as they are today.
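The visibility rule above reduces to a short check. The sketch below is illustrative only: the kind labels and the `can_read` function are hypothetical, and it assumes the requester is already a member of the project.

```python
PRIVATE_KINDS = {"prompt", "session_summary"}
SHARED_KINDS = {"feature", "decision", "note", "comment", "work_session"}

def can_read(requester_id, owner_id, role, kind):
    """Shared artifacts are visible to any project member; private memory is not."""
    if kind in SHARED_KINDS:
        return True
    if kind in PRIVATE_KINDS:
        # The owner always sees their own data; others need an audit-capable role.
        return requester_id == owner_id or role in {"admin", "owner"}
    raise ValueError(f"unknown kind: {kind}")
```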

Raw transcripts never sync to the cloud. They stay on the machine that produced them. The cross-machine “what was I doing yesterday” flow works through the digest (a few KB), which does sync to the cloud. Need a specific transcript on another device? Run sprintra transcript share <id> --to-cloud for an explicit, single-transcript opt-in upload.

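Three opt-out flags, plus the strict_local_memory switch, live in your Sprintra config. The // comments below are annotations only; standard JSON does not allow comments, so leave them out if the file is parsed strictly.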

# ~/.sprintra/config.json

{
  "capture_user_prompts":     false,   // disables prompt capture
  "capture_session_digests":  false,   // disables session summaries
  "capture_transcript_index": false,   // disables local indexing
  "strict_local_memory":      true     // blocks all memory-layer cloud calls
}
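A hedged sketch of how these flags could combine at runtime: load the file over the documented defaults, then gate the digest upload on both the capture flag and strict mode. The loader and the `may_upload_digest` helper are hypothetical, and the sketch assumes the file on disk is plain JSON without comments.

```python
import json
from pathlib import Path

# Defaults as listed in the configuration reference.
DEFAULTS = {
    "capture_user_prompts": True,
    "capture_session_digests": True,
    "capture_transcript_index": True,
    "strict_local_memory": False,
}

def load_config(path=None):
    """Merge ~/.sprintra/config.json (if present) over the defaults."""
    path = Path(path) if path else Path.home() / ".sprintra" / "config.json"
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config

def may_upload_digest(config):
    """A digest leaves the machine only if capture is on and strict mode is off."""
    return config["capture_session_digests"] and not config["strict_local_memory"]
```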

To delete past data: purge cloud entries from the dashboard, or run sprintra transcript prune --before DATE to remove local data.

How this compares

If you arrived here looking at memory plugins like claude-mem, the differences worth knowing about are below.

| Dimension | Sprintra Memory Layer | claude-mem |
| --- | --- | --- |
| Cost per tool call | $0; agent self-summary at Stop | LLM-priced compression service |
| Local cache | SQLite buffer + cached vectors | ChromaDB / on-disk vector store |
| Recall | On-device similarity search at session start | On-demand vector lookup |
| Linked to project state | Yes: sprints, decisions, stories, releases | No: standalone memory tool |
| Cloud retention | 30 days, pinned forever, ~6 MB / heavy user / year | N/A (local-only by default) |
| Multi-user / per-user privacy | Built in; admin audit role | Single-machine |
| IDE coverage | Claude Code today; any MCP-capable agent via the same API | Claude Code only |

Picking between them depends on your goal. If you want a self-contained memory cache for one developer on one machine, claude-mem is a clean choice. If you want memory that lives next to the rest of your project work and follows a team across machines, that's what Sprintra is designed for.

How this differs from Claude's Auto Memory

Claude's Auto Memory captures curated facts about you — your name, preferences, ongoing projects — and feeds them back across conversations. It's great for personalisation, and the two systems work fine alongside each other.

Sprintra is a different shape:

| Dimension | Sprintra Memory Layer | Anthropic Auto Memory |
| --- | --- | --- |
| What gets stored | Episodic session summaries + project artifacts | Curated personal facts |
| Recall mechanism | Similarity search over past sessions, on-device | Fact lookup on conversation context |
| Project linkage | Direct: tied to sprints, decisions, stories | None: conversation-scoped |
| Where it lives | Your machine + your Sprintra workspace | Anthropic-managed |
| Best for | Engineering teams: “what did we ship and why?” | Individual users: “remember my preferences” |

Use both. Auto Memory keeps your conversation context personal; Sprintra keeps the project context shared and queryable.

Quickstart

# 1. Install the latest plugin
# Inside Claude Code:
/plugin marketplace add Sprintra-io/sprintra-mcp
/plugin install sprintra@sprintra

# 2. Install the CLI (with transcript subcommand)
npm install -g @sprintra/cli@latest

# 3. One-time backfill: index your existing transcripts
sprintra transcript reindex

# 4. Open a fresh Claude Code session — verify a "Relevant past
#    observations" section appears on session start once you have
#    a few sessions cached. The dashboard's Local memory cache tab
#    shows your cached count + last update.

Configuration reference

| Setting | Where | Default |
| --- | --- | --- |
| Cloud retention (days) | Org settings (admin) | 30 |
| Local retention (days) | SPRINTRA_LOCAL_RETENTION_DAYS | 90 |
| Capture user prompts | ~/.sprintra/config.json | true |
| Capture session summaries | ~/.sprintra/config.json | true |
| Capture transcript index | ~/.sprintra/config.json | true |
| Strict local memory | ~/.sprintra/config.json | false |
| Pin a summary (preserve indefinitely) | Dashboard or agent action | false |
| <private> prompt tag | Inline in your message | opt-in per prompt |

Want the full story?

The architecture write-up: The Memory Layer architecture — zero LLM cost session capture. Multi-user behavior: /docs/teams.