Memory Layer
Sprintra captures every Claude Code session on your machine and links it to the project work it produced — the sprints, decisions, stories, and team trail already in your workspace. Local-first capture, on-device semantic recall, no extra services. Zero added LLM cost.
That single design choice — memory that's wired into your project management substrate — is what separates Sprintra from a standalone memory plugin. The next session's briefing doesn't just remind you what you said; it shows you what you decided, what shipped, and what's still open.
| Capability | How it shows up |
|---|---|
| Every prompt and tool call captured | Hooks Claude Code's UserPromptSubmit and PostToolUse events |
| Session summary written by the agent at Stop | No background compression service — the agent writes its own summary as the final tool call |
| Relevant past summaries surfaced on session start | On-device similarity search picks the top-N matches from your local cache |
| Linked to the work itself | Each session ties to the sprint, decisions, and stories it touched |
| Cost per tool call | $0 — capture and recall are free |
| Cloud storage / heavy user / year | ~6 MB — tiny digests only; raw transcripts stay local |
| Multi-user privacy | Per-user scope by default; admin or owner role required for cross-user audit |
| Offline recall | No network needed once the local cache is warm |
When a Claude Code session ends, the agent writes a short structured summary — what was discussed, key decisions, open questions, pushback, pending asks — into a small SQLite buffer in your home directory. The same buffer caches a similarity vector for that summary so the next session can pull the most relevant past summaries without a cloud round-trip.
This means three useful things:
You can verify what's cached on this machine from the dashboard under Settings → Local memory cache: count of cached summaries, last cache update, and pinned items.
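To make the local buffer more concrete, here is a minimal sketch of how a structured summary and its similarity vector could sit in one SQLite table. The table name, column names, and the JSON encoding of the vector are assumptions made for illustration; Sprintra's actual schema is not documented on this page.

```python
import json
import sqlite3

# Hypothetical schema: the fields mirror the summary sections named above,
# but the real table layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS session_summaries (
    session_id     TEXT PRIMARY KEY,
    discussed      TEXT NOT NULL,
    key_decisions  TEXT NOT NULL,
    open_questions TEXT NOT NULL,
    pushback       TEXT NOT NULL,
    pending_asks   TEXT NOT NULL,
    vector         TEXT NOT NULL,
    pinned         INTEGER NOT NULL DEFAULT 0
)
"""


def open_buffer(path=":memory:"):
    """Open (or create) the buffer; ':memory:' keeps the demo off disk."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_summary(conn, session_id, summary, vector, pinned=False):
    """Store one summary and its similarity vector in the same row."""
    conn.execute(
        "INSERT OR REPLACE INTO session_summaries VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (
            session_id,
            summary["discussed"],
            json.dumps(summary["key_decisions"]),
            json.dumps(summary["open_questions"]),
            json.dumps(summary["pushback"]),
            json.dumps(summary["pending_asks"]),
            json.dumps(vector),
            int(pinned),
        ),
    )
    conn.commit()


def cache_stats(conn):
    """Count cached and pinned summaries, similar to what the dashboard reports."""
    total, pinned = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(pinned), 0) FROM session_summaries"
    ).fetchone()
    return {"cached_summaries": total, "pinned": pinned}
```

Keeping the vector in the same row as the summary is one simple way to let recall run with a single local read and no network call.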
Capture-only memory tools answer one question well: “what did we just discuss?” That's useful, but a coding session almost never lives by itself. It produces decisions, opens stories, lands commits, ships releases.
Sprintra stores those artifacts already, so each session digest gets cross-linked to them automatically. When the next session loads, the briefing surfaces:
That's the “project brain” part of Sprintra. Memory by itself drifts; memory wired into the project state stays anchored to what actually shipped.
What gets captured, where it's stored, and how long we keep it.
| Tier | Stores | Retention | Sync |
|---|---|---|---|
| Cloud | Prompts + session summaries (digests) | 30 days · pinned: forever | Yes (per-user scope) |
| Local | Cached embeddings + index over Claude Code's own session files | 90 days (configurable) | Never |
| Project artifacts | Stories, decisions, features, notes, comments | Forever | Yes (project-scoped, shared) |
Every message you send to the agent is captured to your private timeline. Capture is fire-and-forget, so it never blocks the prompt from reaching Claude. Prompts are attributed per user by default. The next session's briefing surfaces a “Recent asks” line for the last 3 prompts in the past 24 hours; older prompts are searchable from the dashboard and through the agent's search tools.
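The “Recent asks” rule (last 3 prompts within the past 24 hours) can be sketched as a small filter. The function name and the `(sent_at, text)` tuple shape are hypothetical; only the limit and the time window come from the description above.

```python
from datetime import datetime, timedelta, timezone


def recent_asks(prompts, now, limit=3, window=timedelta(hours=24)):
    """Return the newest `limit` prompt texts sent within `window` of `now`.

    `prompts` is a list of (sent_at, text) tuples (an assumed shape).
    """
    cutoff = now - window
    fresh = [(sent_at, text) for sent_at, text in prompts if sent_at >= cutoff]
    fresh.sort(key=lambda item: item[0], reverse=True)  # newest first
    return [text for _, text in fresh[:limit]]


# Example data: five prompts, one of them older than 24 hours.
now = datetime(2026, 4, 2, 12, 0, tzinfo=timezone.utc)
history = [(now - timedelta(hours=h), f"prompt from {h}h ago") for h in (30, 5, 3, 2, 1)]
latest = recent_asks(history, now)
```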
By default, session summaries rotate after 30 days. Pin a summary to keep it indefinitely — this is for the strategic conversations you want to re-read three months from now (architectural debates, scope decisions, anything load-bearing for future work). Pinned summaries also rank higher in the next session's “Last conversation” section.
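As a rough illustration of the rotation rule, the check below keeps a summary if it is pinned or younger than the retention window. The function name is hypothetical, and how the real system treats a summary at exactly 30 days is an assumption here.

```python
from datetime import datetime, timedelta, timezone

CLOUD_RETENTION_DAYS = 30  # default cloud retention stated in this page


def survives_rotation(created_at, pinned, now, retention_days=CLOUD_RETENTION_DAYS):
    """Return True if a summary should be kept when rotation runs."""
    if pinned:
        return True  # pinned summaries are kept indefinitely
    # Assumption: a summary rotates once its age reaches the full window.
    return now - created_at < timedelta(days=retention_days)


now = datetime(2026, 4, 1, tzinfo=timezone.utc)
old = now - timedelta(days=90)
recent = now - timedelta(days=5)
```

With these values, the 90-day-old summary survives only when pinned, while the 5-day-old one survives either way.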
On every session start, Sprintra builds a query from your project, repo path, and the latest commit subjects, then runs an on-device similarity search over your local cache. The top matches are appended to the briefing under “Relevant past observations.” If a session is irrelevant you'll never see it; if it's relevant you'll see it before you have to look for it.
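The recall step can be sketched as a top-N ranking over cached vectors. This sketch assumes cosine similarity and plain Python lists as vectors; the page does not say which similarity measure or embedding model Sprintra uses, so the embedding step is left out and every function name here is hypothetical.

```python
import math


def build_query(project, repo_path, commit_subjects):
    """Join the three query inputs into one text to embed (assumed format)."""
    return "\n".join([project, repo_path, *commit_subjects])


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vector, cached, n=3, min_score=0.0):
    """Rank cached summaries by similarity and return the best `n` session ids.

    `cached` maps session id -> vector. Scores at or below `min_score`
    are dropped, so irrelevant sessions never reach the briefing.
    """
    scored = [(cosine_similarity(query_vector, vec), sid) for sid, vec in cached.items()]
    relevant = [(score, sid) for score, sid in scored if score > min_score]
    relevant.sort(reverse=True)
    return [sid for _, sid in relevant[:n]]
```

The `min_score` cut-off is one way to get the "if a session is irrelevant you'll never see it" behavior; the actual threshold, if any, is not documented.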
Need to ask the agent something genuinely sensitive? Wrap your message in a <private>…</private> tag. Sprintra strips the tag from the prompt the agent sees and skips capture entirely — no cloud digest, no local embedding, no recall in any future session. Use this for one-off prompts that shouldn't leave any trail.
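A minimal sketch of the private-tag handling described above: strip the tag, pass the wrapped text on to the agent, and flag the message so nothing is captured. The function name is hypothetical, and treating the whole message as uncaptured when only part of it is wrapped is an assumption.

```python
import re

# Matches <private>...</private>, including content that spans several lines.
PRIVATE_TAG = re.compile(r"<private>(.*?)</private>", re.DOTALL)


def prepare_prompt(message):
    """Return (prompt_for_agent, should_capture) for one user message."""
    if PRIVATE_TAG.search(message) is None:
        return message, True
    # Drop the tags but keep the wrapped text, and skip capture entirely.
    without_tags = PRIVATE_TAG.sub(lambda match: match.group(1), message)
    return without_tags, False
```

Returning the capture decision alongside the cleaned prompt keeps the "no digest, no embedding, no recall" rule in one place for downstream steps.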
For a stronger guarantee, set strict_local_memory in your config. That mode disables every cloud round-trip the memory layer makes — no digest upload, no embed call — while keeping the on-device cache and your project workspace fully working. (Project sync for sprints, decisions, stories etc. continues normally; only the memory-layer cloud calls go quiet.)
Claude Code already writes the raw transcript for every session to disk. Sprintra indexes those files in place — no copy made — and gives you full-text search across every session you've ever had on this machine. Sub-100 ms, no network required.
    # One-time: scan your session history and build the index
    sprintra transcript reindex

    # List recent transcripts
    sprintra transcript list --since 2026-04-01

    # Keyword search across every session
    sprintra transcript search "auth bug fix"

    # Open the actual transcript file (no copy made)
    sprintra transcript view <session-id>

    # Disk usage / oldest / newest / retention
    sprintra transcript status

    # Manual prune (default cutoff: 90 days)
    sprintra transcript prune --before 2026-01-01 --dry-run
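To show why an in-place index needs no copy of the transcripts, here is a small inverted-index sketch: it stores only file paths per token, never the transcript text. This is an illustration under assumptions, not Sprintra's indexer; the tokenizer, the AND-matching of query words, and the example file names are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """Map each token to the transcript paths that contain it.

    `documents` maps path -> transcript text. Only paths are stored,
    so the index stays small and the files remain where they are.
    """
    index = defaultdict(set)
    for path, text in documents.items():
        for token in tokenize(text):
            index[token].add(path)
    return index


def search(index, query):
    """Return paths containing every token of the query, sorted by path."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(matches)
```

A real indexer would likely add ranking and stemming; this version only matches exact tokens.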
Picture three developers on the same repo and the same project. Each user's prompts and session summaries are private to them; the others cannot read them by default. Cross-user reads require the admin or owner role (the audit use case). The shared artifacts (features, decisions, notes, comments, work sessions) remain visible at the project level, just like today.
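The per-user privacy rule boils down to a short access check. The sketch below assumes roles are plain strings and that "member" is the name of a non-privileged role; both are hypothetical details, while the admin and owner roles come from the text above.

```python
# Roles allowed to read another user's prompts and summaries (audit use case).
CROSS_USER_ROLES = {"admin", "owner"}


def can_read_private(reader_id, reader_role, author_id):
    """Return True if `reader_id` may read private memory written by `author_id`."""
    if reader_id == author_id:
        return True  # you can always read your own prompts and summaries
    return reader_role in CROSS_USER_ROLES
```

Shared project artifacts would not pass through a check like this, since they stay visible to everyone on the project.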
Raw transcripts never sync to the cloud. They stay on the machine that produced them. The cross-machine “what was I doing yesterday” flow works through the digest (a few KB), which is stored in the cloud. Need a specific transcript on another device? Run sprintra transcript share <id> --to-cloud for an explicit, single-transcript opt-in upload.
Three opt-out flags in your Sprintra config, plus the strict_local_memory switch described above:
    # ~/.sprintra/config.json
    {
      "capture_user_prompts": false,      // disables prompt capture
      "capture_session_digests": false,   // disables session summaries
      "capture_transcript_index": false,  // disables local indexing
      "strict_local_memory": true         // blocks all memory-layer cloud calls
    }

To delete past data: purge cloud entries from the dashboard, or run sprintra transcript prune --before DATE for local data.
If you arrived here looking at memory plugins like claude-mem, the differences worth knowing about are below. The comparison sticks to factual differences and aims to be fair to both products.
| Dimension | Sprintra Memory Layer | claude-mem |
|---|---|---|
| Cost per tool call | $0 — agent self-summary at Stop | LLM-priced compression service |
| Local cache | SQLite buffer + cached vectors | ChromaDB / on-disk vector store |
| Recall | On-device similarity search at session start | On-demand vector lookup |
| Linked to project state | Yes — sprints, decisions, stories, releases | No — standalone memory tool |
| Cloud retention | 30 days, pinned forever, ~6 MB / heavy user / year | N/A (local-only by default) |
| Multi-user / per-user privacy | Built in; admin audit role | Single-machine |
| IDE coverage | Claude Code today; any MCP-capable agent via the same API | Claude Code only |
Picking between them depends on your goal. If you want a self-contained memory cache for one developer on one machine, claude-mem is a clean choice. If you want memory that lives next to the rest of your project work and follows a team across machines, that's what Sprintra is designed for.
Claude's Auto Memory captures curated facts about you — your name, preferences, ongoing projects — and feeds them back across conversations. It's great for personalisation, and the two systems work fine alongside each other.
Sprintra is a different shape:
| Dimension | Sprintra Memory Layer | Anthropic Auto Memory |
|---|---|---|
| What gets stored | Episodic session summaries + project artifacts | Curated personal facts |
| Recall mechanism | Similarity search over past sessions, on-device | Fact lookup on conversation context |
| Project linkage | Direct — tied to sprints, decisions, stories | None — conversation-scoped |
| Where it lives | Your machine + your Sprintra workspace | Anthropic-managed |
| Best for | Engineering teams: “what did we ship and why?” | Individual users: “remember my preferences” |
Use both. Auto Memory keeps your conversation context personal; Sprintra keeps the project context shared and queryable.
    # 1. Install the latest plugin
    # Inside Claude Code:
    /plugin marketplace add Sprintra-io/sprintra-mcp
    /plugin install sprintra@sprintra

    # 2. Install the CLI (with transcript subcommand)
    npm install -g @sprintra/cli@latest

    # 3. One-time backfill: index your existing transcripts
    sprintra transcript reindex

    # 4. Open a fresh Claude Code session and verify that a "Relevant past
    # observations" section appears on session start once you have
    # a few sessions cached. The dashboard's Local memory cache tab
    # shows your cached count + last update.
| Setting | Where | Default |
|---|---|---|
| Cloud retention (days) | Org settings (admin) | 30 |
| Local retention (days) | SPRINTRA_LOCAL_RETENTION_DAYS | 90 |
| Capture user prompts | ~/.sprintra/config.json | true |
| Capture session summaries | ~/.sprintra/config.json | true |
| Capture transcript index | ~/.sprintra/config.json | true |
| Strict local memory | ~/.sprintra/config.json | false |
| Pin a summary (preserve indefinitely) | Dashboard or agent action | false |
| <private> prompt tag | Inline in your message | opt-in per prompt |
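The defaults in the table above can be combined into one resolution step. This is a sketch under assumptions: it treats the config file as plain JSON without comments, ignores unknown keys, and reads local retention only from the SPRINTRA_LOCAL_RETENTION_DAYS environment variable. The function names are hypothetical and the real loader may behave differently.

```python
import json
import os
from pathlib import Path

# Defaults copied from the settings table.
DEFAULTS = {
    "capture_user_prompts": True,
    "capture_session_digests": True,
    "capture_transcript_index": True,
    "strict_local_memory": False,
}
DEFAULT_LOCAL_RETENTION_DAYS = 90


def resolve_settings(file_overrides, env):
    """Merge defaults, config-file overrides, and the retention env variable."""
    settings = dict(DEFAULTS)
    for key, value in file_overrides.items():
        if key in DEFAULTS:  # assumption: unknown keys are ignored
            settings[key] = bool(value)
    settings["local_retention_days"] = int(
        env.get("SPRINTRA_LOCAL_RETENTION_DAYS", DEFAULT_LOCAL_RETENTION_DAYS)
    )
    return settings


def load_settings(config_path=Path.home() / ".sprintra" / "config.json"):
    """Read the config file if it exists, then resolve against os.environ."""
    path = Path(config_path)
    overrides = json.loads(path.read_text()) if path.exists() else {}
    return resolve_settings(overrides, os.environ)
```

Splitting the pure merge from the file read keeps the precedence rules easy to check without touching your real config.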
Want the full story?
The architecture write-up: The Memory Layer architecture — zero LLM cost session capture. Multi-user behavior: /docs/teams.