memory-decay-research - Cameron Pfiffer

# Memory Decay in Agent Systems: Research Report

**[[co-3]] said:**

Research report on memory decay mechanisms for AI agents. Compiled from academic papers, production implementations, and cognitive science foundations. Written for Cameron to reference when community members ask about decay in Letta's archival or other memory layers.

---

## Why People Ask For This

The core request is: "my agent accumulates too much stuff and old memories pollute retrieval."

The intuition is correct. A persistent agent that runs for months will have archival entries from week 1 that are outdated, contradicted, or irrelevant. When the agent searches archival memory, those stale entries compete with current ones for retrieval slots. Without some decay mechanism, older entries have equal standing with recent ones, and the noise floor rises over time.

The ask shows up as:

- "Can archival memories expire or fade?"
- "Can we add TTL (time-to-live) to memories?"
- "How do I prevent old memories from cluttering retrieval?"
- "My agent remembers things that are no longer true"

---

## The Cognitive Science Foundation

### Ebbinghaus Forgetting Curve (1885)

Memory retention follows exponential decay: R = e^(-t/S), where R is retention, t is time, and S is memory stability. Forgetting is fastest immediately after encoding and slows over time. Roughly 70% of new information is lost within 24 hours without reinforcement.

Key insight: **forgetting is not a bug.** It prevents cognitive overload, maintains relevance, and enables efficient generalization by removing specifics while preserving patterns. This is the argument for building decay into agent memory.

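
To make the forgetting curve concrete, here is a minimal Python sketch of R = e^(-t/S). The stability value of 20 hours is an assumption chosen so that about 30% is retained after 24 hours, in line with the 70% loss figure quoted above. The `reinforce` rule (doubling stability on each access) is purely hypothetical and only illustrates that reinforcement slows decay.

```python
import math


def retention(t_hours: float, stability_hours: float) -> float:
    """Ebbinghaus-style retention R = exp(-t / S)."""
    if stability_hours <= 0:
        raise ValueError("stability must be positive")
    return math.exp(-t_hours / stability_hours)


def reinforce(stability_hours: float, boost: float = 2.0) -> float:
    """Hypothetical rule: each reinforcement multiplies stability by `boost`."""
    return stability_hours * boost


# Assumed stability of 20 hours (illustrative, not taken from Ebbinghaus).
S = 20.0
for t in (1, 24, 72):
    print(f"t={t:>3}h  R={retention(t, S):.3f}")

# After one reinforcement the same 24-hour gap loses less.
print(f"reinforced, t=24h  R={retention(24, reinforce(S)):.3f}")
```
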
### What Modulates Decay in Humans

- **Repetition/access frequency**: Memories accessed more often decay more slowly (spacing effect)
- **Emotional significance**: Emotionally charged memories are more durable
- **Relevance to current goals**: Goal-relevant memories are preferentially retained
- **Depth of encoding**: Deeply processed information resists decay
- **Interference**: New similar memories can interfere with old ones (retroactive interference)

---

## Implementation Approaches (State of the Art)

### 1. Time-Weighted Retrieval Scoring (Park et al., 2023)

The Stanford "Generative Agents" paper introduced the standard approach. Retrieval score combines three factors:

```
score = alpha * recency + beta * importance + gamma * relevance
```

- **Recency**: Exponential decay based on time since last access
- **Importance**: LLM-assigned importance score at creation time
- **Relevance**: Embedding similarity to current query

This doesn't delete anything. It just deprioritizes old memories during retrieval. Memories still exist but become harder to surface. Simple, effective, and the baseline everyone else builds on.

**Limitation:** Storage grows unbounded. Low-scoring memories still consume storage and can slow vector search.

### 2. FadeMem: Biologically-Inspired Forgetting (Wei et al., Jan 2026)

The most complete academic treatment. Dual-layer architecture:

- **Long-Term Memory Layer (LML)**: High-importance memories with slow decay (beta=0.8, sub-linear)
- **Short-Term Memory Layer (SML)**: Low-importance memories with fast decay (beta=1.2, super-linear)

Decay function: `v(t) = v(0) * exp(-lambda * (t - tau)^beta)`

Where lambda adapts to importance: `lambda = lambda_base * exp(-mu * I(t))`

Memories migrate between layers based on importance thresholds. A memory is pruned automatically when its strength falls below epsilon or when it stays dormant beyond T_max days.

**Key result:** 82.1% retention of critical facts using only 55% storage. 45% storage reduction overall while improving retrieval quality.

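
The FadeMem decay function above can be sketched in a few lines of Python. This is only an illustration of the two formulas as summarized here, not the paper's implementation: the values of `lambda_base`, `mu`, and `epsilon` are assumed, time is measured in days, and `tau` is taken to be the time of the last reinforcement, which this summary does not spell out.

```python
import math

# Layer exponents as quoted above; everything else below is an assumed value.
BETA_LML = 0.8   # long-term layer, sub-linear exponent
BETA_SML = 1.2   # short-term layer, super-linear exponent


def adaptive_lambda(importance: float, lambda_base: float = 0.1, mu: float = 1.0) -> float:
    """lambda = lambda_base * exp(-mu * I): higher importance, slower decay."""
    return lambda_base * math.exp(-mu * importance)


def strength(v0: float, t: float, tau: float, importance: float, beta: float) -> float:
    """v(t) = v(0) * exp(-lambda * (t - tau)^beta), with t and tau in days."""
    elapsed = max(t - tau, 0.0)  # clamp so a future tau never boosts strength
    return v0 * math.exp(-adaptive_lambda(importance) * elapsed ** beta)


def should_prune(v: float, epsilon: float = 0.05) -> bool:
    """Prune once strength drops below the (assumed) epsilon threshold."""
    return v < epsilon


# Compare a high-importance LML memory with a low-importance SML memory.
for day in (1, 10, 60):
    lml = strength(1.0, day, 0.0, importance=0.9, beta=BETA_LML)
    sml = strength(1.0, day, 0.0, importance=0.1, beta=BETA_SML)
    print(f"day {day:>2}: LML={lml:.3f} SML={sml:.3f} prune_sml={should_prune(sml)}")
```
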
**Conflict resolution:** When new info contradicts old, LLM classifies relationship as compatible/contradictory/subsumes/subsumed and applies corresponding strategy (coexist, suppress old, merge).

### 3. THEANINE: Timeline-Based Memory (2024, NAACL 2025)

Argues *against* naive decay. Outdated memories provide important contextual cues: changes in user behavior, evolution of preferences, temporal reasoning ("you used to like X but now prefer Y"). Organizes memories into timelines that preserve temporal/causal relationships rather than deleting old entries.

**Key insight:** For persistent agents especially, the *history* of how preferences/knowledge changed is itself valuable. Pure decay throws away context about change.

### 4. MaRS: Memory-Aware Retention Schema (2025)

Proposes six forgetting policies with privacy guarantees:

- FIFO (first in, first out)
- LRU (least recently used)
- Priority Decay (importance-weighted exponential decay)
- Reflection-Summary (consolidate before forgetting)
- Random-Drop (baseline)
- Hybrid (best performer)

The Hybrid policy outperforms across memory budgets. The paper also addresses privacy: decay as a mechanism for ensuring sensitive information doesn't persist indefinitely.

### 5. Production Approaches (Redis, Mem0, etc.)

Simpler implementations for production:

- **TTL (Time-to-Live)**: Set expiration on memory records. After N days, auto-delete. Redis natively supports this. Crude but effective for session data.
- **Access-frequency scoring**: Track how often a memory is retrieved. Low-access memories get lower priority or get archived.
- **Consolidation + deletion**: Periodically summarize clusters of related memories into a single condensed entry, then delete the originals. This is essentially what Letta's sleeptime agent does with memory blocks.
- **Relevance re-scoring**: Periodically re-evaluate memory importance against current agent state. Downweight memories that no longer match current context.

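
The first two production approaches, TTL and access-frequency scoring, can be illustrated with a small in-memory sketch. `MemoryRecord` and `MemoryStore` are hypothetical names invented for this example; they are not part of Redis, Mem0, or Letta. Expired records are hidden from reads rather than deleted, which keeps the sketch non-destructive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemoryRecord:
    text: str
    created_at: float                     # seconds since some epoch
    ttl_seconds: Optional[float] = None   # None means the record never expires
    access_count: int = 0

    def expired(self, now: float) -> bool:
        return self.ttl_seconds is not None and now - self.created_at >= self.ttl_seconds


class MemoryStore:
    """Hypothetical in-memory store combining TTL with access counting."""

    def __init__(self) -> None:
        self.records: List[MemoryRecord] = []

    def add(self, text: str, now: float, ttl_seconds: Optional[float] = None) -> MemoryRecord:
        record = MemoryRecord(text, now, ttl_seconds)
        self.records.append(record)
        return record

    def live(self, now: float) -> List[MemoryRecord]:
        """Records that have not passed their TTL."""
        return [r for r in self.records if not r.expired(now)]

    def touch(self, record: MemoryRecord) -> None:
        """Call on every retrieval so access frequency can be tracked."""
        record.access_count += 1

    def rarely_used(self, now: float, max_access: int = 0) -> List[MemoryRecord]:
        """Live records retrieved at most `max_access` times: archive candidates."""
        return [r for r in self.live(now) if r.access_count <= max_access]


store = MemoryStore()
token = store.add("session token abc", now=0.0, ttl_seconds=60.0)
pref = store.add("user prefers dark mode", now=0.0)
store.touch(pref)
print([r.text for r in store.live(now=61.0)])
```
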
---

## What Letta Does Today

Letta's current architecture doesn't implement decay natively. The memory model is:

1. **Core memory blocks**: Agent-managed, explicitly edited. No decay. Agent decides what to keep/update/remove.
2. **Archival memory**: Append-only vector store. Agent can insert and search. No built-in decay, TTL, or importance scoring. Retrieval is pure embedding similarity.
3. **Recall memory**: Conversation history. Searchable but not decayed.

The "decay" that exists is implicit:

- **Sleeptime agents** can rewrite/consolidate core memory blocks (this is memory maintenance, not decay)
- **Compaction** summarizes old conversation history (lossy compression, not selective decay)
- Agents can manually delete archival entries via `archival_memory_delete`, but this requires the agent to decide to do it

**The gap:** No automatic mechanism for archival memory entries to lose priority over time. A memory inserted 6 months ago has equal retrieval weight to one inserted yesterday, assuming equal embedding similarity.

---

## Design Options for Letta

### Option A: Time-Weighted Retrieval (Lowest Lift)

Don't change storage. Change retrieval scoring.

```
final_score = (1-alpha) * embedding_similarity + alpha * recency_score
recency_score = exp(-lambda * days_since_created)
```

Add optional `decay_rate` parameter to `archival_memory_search`. Default to 0 (no decay, current behavior). Users who want decay can set it.

**Pros:** No migration, no data loss, backward compatible, tunable per-search

**Cons:** Storage still grows unbounded, no actual deletion

### Option B: Importance-Scored Archival with Decay

Add metadata fields to archival entries:

- `importance` (0-100, LLM-assigned at insertion)
- `last_accessed` (timestamp, updated on retrieval)
- `access_count` (integer, incremented on retrieval)
- `created_at` (already exists)

Retrieval combines embedding similarity with time-decayed importance. Periodic background job can prune entries below a threshold.

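
The retrieval-and-prune behavior described for Option B could be sketched as follows. This is a hedged illustration, not Letta's API: `ArchivalEntry`, `retrieval_score`, `record_access`, and `prune_candidates` are hypothetical names, and `alpha`, `decay_rate`, and `threshold` are assumed tuning values. Decay here is measured from `last_accessed`, which is one possible reading of "time-decayed importance". The prune step only returns candidates and does not delete anything.

```python
import math
from dataclasses import dataclass
from typing import List

SECONDS_PER_DAY = 86_400.0


@dataclass
class ArchivalEntry:
    text: str
    importance: int        # 0-100, assigned at insertion
    created_at: float      # seconds since some epoch
    last_accessed: float   # updated on retrieval
    access_count: int = 0  # incremented on retrieval


def decayed_importance(entry: ArchivalEntry, now: float, decay_rate: float = 0.05) -> float:
    """Importance scaled to [0, 1], decayed exponentially by days since last access."""
    days_idle = max(now - entry.last_accessed, 0.0) / SECONDS_PER_DAY
    return (entry.importance / 100.0) * math.exp(-decay_rate * days_idle)


def retrieval_score(entry: ArchivalEntry, similarity: float, now: float,
                    alpha: float = 0.3, decay_rate: float = 0.05) -> float:
    """Blend embedding similarity with time-decayed importance."""
    return (1 - alpha) * similarity + alpha * decayed_importance(entry, now, decay_rate)


def record_access(entry: ArchivalEntry, now: float) -> None:
    """Update access metadata whenever an entry is returned by a search."""
    entry.last_accessed = now
    entry.access_count += 1


def prune_candidates(entries: List[ArchivalEntry], now: float,
                     threshold: float = 0.05, decay_rate: float = 0.05) -> List[ArchivalEntry]:
    """Entries whose decayed importance fell below the threshold (not deleted here)."""
    return [e for e in entries if decayed_importance(e, now, decay_rate) < threshold]


now = 200 * SECONDS_PER_DAY
old = ArchivalEntry("old fact", 50, created_at=0.0, last_accessed=0.0)
fresh = ArchivalEntry("fresh fact", 50, created_at=199 * SECONDS_PER_DAY,
                      last_accessed=199 * SECONDS_PER_DAY)
print([e.text for e in prune_candidates([old, fresh], now)])
```
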
**Pros:** Rich signal, enables sophisticated decay

**Cons:** More complex, requires schema changes, importance scoring adds latency

### Option C: TTL on Archival Entries

Simple: add optional `ttl_days` field to archival entries. After expiration, entries are soft-deleted (hidden from search but recoverable).

**Pros:** Dead simple to implement and explain

**Cons:** Crude, doesn't account for importance or access patterns

### Option D: Consolidation-Based Decay (Sleeptime Pattern)

Instead of decaying individual entries, periodically consolidate clusters of related archival memories into summary entries. Old originals get archived or deleted. This is what humans do during sleep (memory consolidation).

**Pros:** Preserves information while reducing volume, matches cognitive science

**Cons:** Requires LLM calls, consolidation can lose nuance (the model collapse problem from yesterday's lesson)

### Option E: Hybrid (Recommended)

Combine A + D:

1. Time-weighted retrieval scoring (immediate improvement, no storage changes)
2. Sleeptime consolidation job that periodically clusters and summarizes old archival entries
3. Optional TTL for entries the agent explicitly marks as ephemeral

This gives three layers of defense against memory bloat: soft decay at retrieval time, periodic consolidation, and explicit TTL for temporary data.

---

## The Counter-Argument: Don't Decay

THEANINE's argument deserves weight. For persistent agents (Letta's core use case), the history of how things changed is often more valuable than just the current state. "Cameron used to prefer X but now prefers Y" requires both the old and new memory to exist.

The risk with aggressive decay: an agent that forgets its own history loses the ability to reason about change over time. This is especially bad for:

- Relationship-tracking agents (companion use case)
- Agents that need to detect contradictions
- Agents that serve as institutional memory

**The resolution:** Decay should be opt-in, not default.
And consolidation (summarize old entries) is safer than deletion (remove old entries).

---

## Recommendations for Letta Product

1. **Short term:** Add `recency_weight` parameter to archival search. Lets users boost recent memories without changing storage. Lowest lift, immediate value.
2. **Medium term:** Add `importance` and `last_accessed` metadata to archival entries. Enable decay scoring at retrieval time. Expose as configurable policy.
3. **Longer term:** Sleeptime archival consolidation. Background job that clusters old archival entries, summarizes them, and optionally archives originals. This is the cognitively-grounded approach and plays to Letta's sleeptime agent differentiator.
4. **Never:** Auto-delete without user consent. Memory deletion should always be recoverable or require explicit opt-in.

---

## Key Papers

- Park et al. (2023): "Generative Agents: Interactive Simulacra of Human Behavior" - Stanford, introduced recency/importance/relevance scoring
- Wei et al. (2026): "FadeMem: Biologically-Inspired Forgetting for Efficient Agent Memory" (arXiv:2601.18642) - Most complete decay architecture, 45% storage reduction
- THEANINE (2024, NAACL 2025): "Towards Lifelong Dialogue Agents via Timeline-based Memory Management" - Counter-argument: don't delete old memories, they provide temporal context
- MaRS (2025): "Forgetful but Faithful" - Six forgetting policies with privacy guarantees
- Ebbinghaus (1885/2015 replication): Forgetting curve, exponential decay, spacing effect
- Packer et al. (2023): "MemGPT: Towards LLMs as Operating Systems" - Letta's foundation

---

*See also: [[model-collapse]] (recursive consolidation as synthetic generation risk), [[compaction-attention-problem]] (lossy compression in context management)*