How AI Agents Remember Context Across Interactions
Session memory, profile stores, semantic retrieval (RAG), and task state—governed with consent, TTLs, and audit logs.
Executive Summary
Direct answer: AI agents retain context with a layered memory stack: session memory (conversation history), long-term profile stores (CRM/CDP fields and preferences), semantic retrieval (vector/RAG over notes, emails, and docs), and task state (plans, timers, idempotency keys). They read and write through policy-checked APIs, log a reason code for each operation, and refresh or expire memories based on recency, consent, and accuracy, so each new interaction starts informed and remains auditable.
Memory Playbook
Step | What to do | Output | Owner | Timeframe |
---|---|---|---|---|
1 — Model | Define memory schema (facts, preferences, state) | Field dictionary + tags (PII, TTL) | MOPs / Data Ops | 1 week |
2 — Connect | Wire CRM/CDP, vector index, and file stores | Read/write APIs + policies | Platform Owner | 1–2 weeks |
3 — Capture | Ingest notes, outcomes, events with reason codes | Structured memories | AI Lead | Ongoing |
4 — Retrieve | Add RAG prompts, ranking, and citations | Relevant context per interaction | AI/Eng | Days |
5 — Govern | Apply decay, consent, redaction, and deletion | Fresh, compliant memory set | Governance Board | Ongoing |
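Step 1 of the playbook can be sketched as a small field dictionary. This is a minimal illustration, not a prescribed schema: the field names, TTL values, owners, and reason codes below are all hypothetical, chosen only to show how PII and TTL tags might drive a policy check before a write.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class MemoryField:
    """One entry in the field dictionary (all example values are hypothetical)."""
    name: str
    kind: str                  # "fact", "preference", or "state"
    pii: bool                  # tagged as personally identifiable information
    ttl: Optional[timedelta]   # None means no automatic expiry
    owner: str


FIELD_DICTIONARY = {
    f.name: f
    for f in [
        MemoryField("preferred_channel", "preference", False, timedelta(days=180), "MOPs"),
        MemoryField("email", "fact", True, timedelta(days=365), "Data Ops"),
        MemoryField("open_task_id", "state", False, timedelta(days=7), "AI Lead"),
    ]
}


def validate_write(field_name: str, has_consent: bool) -> str:
    """Return a reason code for a proposed write to the profile store."""
    field = FIELD_DICTIONARY.get(field_name)
    if field is None:
        return "DENY_UNKNOWN_FIELD"   # only approved fields may be written
    if field.pii and not has_consent:
        return "DENY_NO_CONSENT"      # PII requires a consent tag
    return "ALLOW"
```

Returning a reason code rather than a boolean keeps every allow or deny decision explainable in the audit trail.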
How Memory Works (Expanded)
“Memory” is more than chat history. Effective agents combine four layers. First, session memory holds the running conversation and recent tool outputs. Second, a profile store (CRM/CDP) captures durable facts—identity, preferences, roles, lifecycle stage—written only through approved fields and with consent tags. Third, semantic memory uses retrieval-augmented generation (RAG): vectors over transcripts, emails, tickets, and briefs so the agent can cite context without memorizing raw text. Fourth, task/state memory tracks plans, checkpoints, timers, and idempotency keys so multi-step work resumes safely after interruptions.
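The four layers can be sketched in a few lines of code. This is a toy model under stated assumptions: the `AgentMemory` class and its method names are hypothetical, and retrieval here ranks documents by simple word overlap as a stand-in for the vector similarity search a real RAG index would use.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentMemory:
    session: list = field(default_factory=list)        # layer 1: recent turns
    profile: dict = field(default_factory=dict)        # layer 2: durable facts
    documents: list = field(default_factory=list)      # layer 3: (source, text) pairs
    completed_keys: set = field(default_factory=set)   # layer 4: idempotency keys

    def retrieve(self, query: str, k: int = 2) -> list:
        """Rank documents by word overlap with the query (toy stand-in for vectors)."""
        query_words = set(query.lower().split())
        scored = []
        for source, text in self.documents:
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                scored.append((overlap, source, text))
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [(source, text) for _, source, text in scored[:k]]

    def build_context(self, query: str, max_turns: int = 3) -> dict:
        """Assemble the context an agent would see for one interaction."""
        return {
            "session": self.session[-max_turns:],
            "profile": dict(self.profile),
            "citations": self.retrieve(query),  # snippets carry their source
        }

    def run_once(self, key: str, action: Callable[[], None]) -> bool:
        """Run a step unless its idempotency key is already recorded."""
        if key in self.completed_keys:
            return False
        action()
        self.completed_keys.add(key)
        return True
```

Because each retrieved snippet keeps its source label, the agent can cite where a fact came from, and `run_once` lets a resumed workflow skip steps that already completed.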
Governance is essential. Mark fields with provenance, last-verified date, and TTL/decay so stale data expires or is re-validated. Restrict access by role and region; encrypt sensitive fields; and log every read/write with correlation IDs. Retrieval should favor fresh, high-confidence sources and include snippets as citations in outputs. Weekly reviews examine memory errors (stale, missing, or hallucinated), adjust schemas and prompts, and tune decay windows.
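One way to picture TTL expiry and read logging is the sketch below. It assumes a hypothetical in-memory store; the `MemoryRecord` fields, outcome codes, and log shape are illustrative rather than taken from any specific platform.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class MemoryRecord:
    key: str
    value: str
    source: str               # provenance, e.g. "crm" or "call-transcript"
    last_verified: datetime
    ttl: timedelta

    def is_fresh(self, now: datetime) -> bool:
        """A record is fresh while its age since verification is within the TTL."""
        return now - self.last_verified <= self.ttl


def read_memory(records: dict, key: str, now: datetime, audit_log: list,
                correlation_id: Optional[str] = None) -> Optional[str]:
    """Return a fresh value or None, and log the read either way."""
    correlation_id = correlation_id or str(uuid.uuid4())
    record = records.get(key)
    if record is None:
        outcome, value = "MISS", None
    elif not record.is_fresh(now):
        outcome, value = "STALE_NEEDS_REVALIDATION", None  # do not serve stale data
    else:
        outcome, value = "HIT", record.value
    audit_log.append({
        "correlation_id": correlation_id,
        "op": "read",
        "key": key,
        "outcome": outcome,
        "at": now.isoformat(),
    })
    return value
```

Stale records are withheld rather than silently served, which gives the weekly review a concrete count of reads that needed re-validation.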
At TPG, we treat memory as a product—schema, owner, SLOs, and auditability—so agents stay helpful without accumulating risk. Why TPG? Our consultants implement governed RAG and CRM/CDP integrations across enterprise stacks with policy validators and tracing.
Metrics & Benchmarks
Metric | Formula | Target/Range | Stage | Notes |
---|---|---|---|---|
Memory hit rate | Interactions with useful retrieval ÷ total interactions | Trending up | Retrieve | Guard for quality, not spam |
Recall precision | Correct facts cited ÷ facts cited | ≥ 95% | Generate | Sample via audits |
Stale-memory incidents | Corrections triggered ÷ total interactions | Trending down | Govern | Use TTL/decay |
P95 retrieval latency | 95th percentile time to fetch context | Within SLA | Execute | Balance depth vs speed |
Consent-safe access | Allowed reads/writes ÷ attempted | 100% | All | Non-compliant attempts are blocked by policy |
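The ratio metrics and the P95 latency in the table can be computed with a few helper functions. This is a sketch under assumptions: the function names are hypothetical, and the percentile uses the nearest-rank method, which is one of several common definitions.

```python
from typing import Optional, Sequence


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Generic rate, e.g. correct facts cited / facts cited; None if undefined."""
    if denominator == 0:
        return None
    return numerator / denominator


def p95(latencies_ms: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]
```

For example, `ratio(19, 20)` gives a recall precision of 0.95, which sits exactly at the ≥ 95% target in the table.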
Frequently Asked Questions
Durable facts belong in CRM/CDP fields; transient details stay in vector stores or session state, each with a TTL and provenance.
To keep recalled facts reliable, require retrieval citations, prefer high-confidence sources, and block unsupported claims via validators.
Memory can be shared across channels: use a unified ID and event stream so all channels write to the same governed memory schema.
Agents should not store sensitive PII beyond necessity, secrets, or regulated content without consent; always minimize, encrypt, and set strict retention.
To honor deletion requests, provide user-level deletion tools, apply TTL/decay, and propagate erasures to downstream indexes with audit logs.
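Propagating an erasure to every downstream store could look like the sketch below. It assumes each store is a plain dictionary keyed by user ID; the store names and the `erase_user` function are hypothetical, and a production system would call each store's own deletion API instead.

```python
def erase_user(user_id: str, stores: dict, audit_log: list) -> list:
    """Delete a user's data from every store and record what was removed."""
    removed_from = []
    for store_name, store in stores.items():
        # pop() removes the entry if present and is a no-op otherwise
        if store.pop(user_id, None) is not None:
            removed_from.append(store_name)
    audit_log.append({"op": "erase", "user_id": user_id, "stores": removed_from})
    return removed_from
```

Logging the list of affected stores, even when it is empty, gives auditors evidence that the request reached every index.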