AI智能体记忆的七种类型:技术指南
阅读原文· marktechpost.com大语言模型默认无状态,构建智能体需借助记忆机制。七种记忆类型包括:工作记忆(上下文窗口内临时存储提示词、消息、工具输出)、语义记忆(长期存储用户偏好、事实)、情节记忆(记录过去事件与任务结果用于经验学习)、程序记忆(存储技能、工作流与行为规则)、外部/检索记忆(通过向量数据库在推理时拉取信息,即RAG)、参数记忆(嵌入模型权重中的世界知识与推理模式)、前瞻记忆(记忆未来意图与计划目标)。每种记忆对应不同时间尺度与实现方式,组合使用可构建更强的自主智能体系统。
Large language models are stateless by default. Each API call starts fresh. The model forgets your last message once the response returns. That is fine for a single question. It breaks the moment you build an agent.
Agents plan, call tools, and run across many steps. They need to remember. Memory is the infrastructure that fixes this. It turns a stateless model into a system that retains context. That system can learn from experience and act over time.
What is Agent Memory
Memory is any mechanism that carries information across a model’s reasoning. Some of it lives inside the context window. Some of it lives outside, in databases or model weights. Each type stores a different class of information for a different duration.
Memory varies by form and by time. Form means parametric, stored in weights, or non-parametric, stored as text. Time means short-term or long-term. The seven types below map onto those two axes.
The Seven Types of Agent Memory
1. In-Context / Working Memory (Short-Term): This is everything the model can currently see inside its context window. It includes the system prompt, recent messages, tool outputs, and reasoning steps. Think of it as RAM. It is fast and essential, but temporary and size-limited. Every other memory type competes for space here.
2. Semantic Memory (Long-Term): This is a persistent store of facts, preferences, and domain knowledge. It holds entries like “the user prefers Python over JavaScript.” The knowledge is decoupled from when it was learned. It is the agent’s organized encyclopedia about a user or topic.
3. Episodic Memory (Long-Term): This logs specific past events, full conversations, and task runs. It records what worked and what failed. The agent uses it to learn from experience. Systems like Reflexion and ExpeL write verbal post-mortems and store conclusions for future runs.
4. Procedural Memory (Long-Term): This is the agent’s knowledge of how to do things. It covers skills, tool usage patterns, workflows, and behavioral rules. A support agent handling its hundredth password reset does not re-reason the workflow. It executes a learned procedure instead.
5. External / Retrieval Memory (Short-Term + Long-Term): This is knowledge stored outside the model in a vector database. It is pulled into context at inference time using similarity search. This is RAG applied to agent history or documents. Retrieval quality becomes the bottleneck fast.
6. Parametric Memory (Long-Term): This is knowledge baked directly into the model’s weights during training. It holds language, reasoning patterns, and general world knowledge. The model does not look anything up. It generates from learned associations. The tradeoff is that this memory is frozen at training time.
7. Prospective Memory (Short-Term + Long-Term): This is the agent’s ability to remember future intentions and scheduled goals. It tracks things the agent planned but has not yet executed. It is critical for long-horizon and multi-step planning agents. Without it, an agent forgets its own commitments.
Side-by-Side: How the Seven Compare
The table below maps each type to its timescale, location, and typical implementation.
| Memory type | Timescale | Where it lives | What it stores | Common implementation |
|---|---|---|---|---|
| Working / In-context | Short-term | Context window | Prompt, messages, tool outputs | Native to the LLM |
| Semantic | Long-term | External store | Facts, preferences, domain knowledge | Vector DB or profile schema |
| Episodic | Long-term | External store | Past events, task runs, outcomes | Vector DB plus event logs |
| Procedural | Long-term | Prompt or weights | Skills, workflows, behavioral rules | System prompt or fine-tune |
| Retrieval / External | Both | Vector database | Documents, history chunks | RAG pipeline |
| Parametric | Long-term | Model weights | World knowledge, language, reasoning | Pre-training or fine-tuning |
| Prospective | Both | State store | Future intentions, scheduled goals | Task queue or scheduler |
Interactive Explainer
Use Cases: Which Memory Solves Which Problem
Each type answers a concrete product need. Map the need to the memory.
- A coding assistant inside one session uses working memory. It tracks the open files and recent edits in context. Close the session and that state is gone.
- A personal assistant that remembers you needs semantic memory. It stores “allergic to gluten” and recalls it next week. The fact survives across sessions.
- A research agent that improves over time needs episodic memory. It recalls that risk sections landed well last month. It repeats what worked and avoids what failed.
- A travel-booking agent needs procedural memory. It knows the flow: search flights, compare, reserve, confirm. The sequence is a learned skill, not a fresh plan.
- A documentation chatbot needs retrieval memory. It embeds the docs and pulls relevant chunks per query. The answer stays grounded in retrieved text.
- A long-horizon agent managing a week-long project needs prospective memory. It remembers to send the Friday report. The intention persists until execution.
A Combined Example: All Seven in One Agent
Consider an autonomous market-analysis agent. One task exercises every memory type at once.
Parametric memory supplies the base reasoning and language. Retrieval memory pulls current market data from a vector store. Semantic memory provides the user’s preferred report format. Episodic memory recalls which sources proved reliable before. Procedural memory drives the section order: sizing, then landscape, then risk. Prospective memory schedules the follow-up draft for next week. Working memory assembles all of it into the active context.
Remove any one layer and the agent gets weaker. Each handles a job the others cannot.
Implementation: A Minimal Memory Stack
Here is a stripped-down sketch in Python. It shows working, semantic, episodic, and procedural memory as separate stores.
from datetime import datetime
# Semantic memory: durable facts about the user
semantic_memory = {"diet": "vegetarian", "language_pref": "Python"}
# Episodic memory: a log of past events and outcomes
episodic_memory = [
{"timestamp": datetime.now(),
"event": "recipe_request",
"result": "user liked a 20-minute meal"},
]
# Procedural memory: skills the agent can execute
def suggest_recipe(diet):
return f"a quick {diet} recipe"
procedural_memory = {"suggest_recipe": suggest_recipe}
# Working memory: assembled fresh for each inference call
def build_context(query):
diet = semantic_memory["diet"]
last = episodic_memory[-1]["result"]
skill = procedural_memory["suggest_recipe"]
return (
f"Query: {query}\n"
f"Semantic: user is {diet}\n"
f"Episodic: last time, {last}\n"
f"Procedural: returning {skill(diet)}"
)
print(build_context("suggest dinner"))In production, the long-term stores move to a vector database. The pattern stays the same. Write to long-term memory, retrieve into working memory, then reason.
How to Layer Them: A Practical Build Order
Do not build all seven at once. Add memory only when a real need justifies the complexity.
- Start with working memory. It ships with the model. Most simple agents need nothing more.
- Add semantic memory when users expect the agent to remember them across sessions. This is the first long-term layer most products require.
- Layer in episodic, procedural, and prospective memory later. Add them only when your agent must plan ahead, learn from failure, and adapt over time.
- Parametric and retrieval memory are often already present. Parametric memory is the base model itself. Retrieval memory arrives the moment you add RAG.
Sources: CoALA framework (Princeton, arXiv:2309.02427); “Memory in the Age of AI Agents” survey (arXiv:2512.13564); “From Human Memory to AI Memory” survey (arXiv:2504.15965); LangChain LangMem, MongoDB, Redis, and Neo4j agent-memory documentation; original concept notes on the seven memory types.