May 27, 2026

More Than a Good Prompt: The 4 Memory Types of AI Agents

What makes an AI agent useful at enterprise level is not a long prompt or tool access. It is how the system handles working memory, durable knowledge, procedures, and past experience.

Almost everything gets called an agent now. If a chatbot can call a tool (for example, access your calendar and create a new event when you ask), answer in multiple steps, or run with a longer system prompt, the label is applied immediately.

That does not make it a real enterprise-grade AI agent. Simple AI agents are easy to create without coding; I wrote about that in a separate article.

The difference is not visible in the interface. It shows up in how the system behaves. A traditional chat is reactive: you ask, it answers, and in the next conversation you often start from scratch. A well-designed AI agent does more. It treats the current context, durable knowledge, execution patterns, and previous experience as separate things.

In short: the prompt does not make the AI agent. The memory architecture does.

The CoALA framework — Cognitive Architectures for Language Agents — is useful because it does not mystify this. It does not describe one giant memory, but several memory types with different roles. That is a much more useful engineering lens than the oversimplified idea that you can “put a vector database behind it” and call it done.

Where most teams get it wrong

When a team starts building an AI agent, the first focus is usually the model and the prompts. They refine the instructions, add a few tools, connect some kind of knowledge base, and expect an autonomous digital colleague to emerge.

But if you try to cram everything into the same context window, you do not get an agent. You get a fragile demo-level prototype.

For enterprise use, the better question is: what kind of memory do we give the agent, for what purpose, and under what rules?

In practice, it is useful to distinguish four types of agent memory and use them deliberately in different situations.

1. Working memory: what the agent is thinking about right now

Working memory is the agent’s current workspace. It contains the active conversation, the current task, fresh instructions, open files, and everything the agent needs at that moment.

This is closest to what most people know as the context window. It is fast and directly accessible, but temporary. When the session ends, this memory disappears, or at least it is no longer available in the same way.

This is where many teams confuse a larger context window with better memory. A larger window is just a larger workbench. It does not replace structured, durable knowledge. In fact, if you put too much into it, the agent does not become smarter. It becomes more scattered. It prioritizes worse, recalls worse, and loses focus more easily.
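To make the workbench idea concrete, here is a minimal sketch of keeping working memory inside a fixed budget. Everything in it is an assumption for illustration: the `Message` type, the `fit_to_budget` helper, and the rough "four characters per token" estimate are hypothetical, not any specific framework's API.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def estimate_tokens(text: str) -> int:
    """Very rough estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def fit_to_budget(messages: list[Message], budget: int) -> list[Message]:
    """Keep all system messages, then the newest other messages that still fit."""
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]
    used = sum(estimate_tokens(m.content) for m in system)

    kept: list[Message] = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message.content)
        if used + cost > budget:
            break  # older messages fall off the workbench
        kept.append(message)
        used += cost
    return system + list(reversed(kept))


history = [
    Message("system", "You are a release assistant."),
    Message("user", "x" * 400),
    Message("assistant", "y" * 400),
    Message("user", "Draft the release notes."),
]
trimmed = fit_to_budget(history, budget=120)
print([m.role for m in trimmed])  # ['system', 'assistant', 'user']
```

The oldest turn is dropped, not summarized. That is exactly why working memory alone is not enough: whatever mattered in that turn has to live in one of the other layers.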

Every chatbot has working memory. That alone does not make it an agent.

2. Semantic memory: what it knows about the world and the project

Semantic memory is the durable knowledge layer. It includes rules, facts, definitions, conventions, project knowledge, and documentation. This is the memory that tells the agent not what happened a moment ago, but what is generally true.

In theory, people often imagine this as knowledge graphs, vector databases, or sophisticated RAG (retrieval-augmented generation) pipelines. In practice, many effective systems are much more mundane. A few well-maintained Markdown files can be more valuable than an impressive but noisy memory layer.

At other times, the more structured solution is exactly what you need. If the system has to handle a lot of changing, searchable knowledge, then vector storage and retrieval, or knowledge graphs, can be fully justified.

The point is not to apply the fashionable technology of the moment. The point is whether the agent can reliably retrieve the relevant knowledge at the right time.

Without semantic memory, the agent feels like a new person every time. It may sound convincing, but it will repeat the same mistakes again and again.
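The "few well-maintained Markdown files" version of semantic memory can be sketched in a few lines. This is a deliberately naive illustration under stated assumptions: the note contents, the file names, and the `retrieve` helper are hypothetical, and plain keyword overlap stands in for whatever retrieval method a real system would use.

```python
import re

# Hypothetical project notes, kept in memory here so the sketch is self-contained.
NOTES = {
    "conventions.md": (
        "# Branching\nFeature branches are cut from main.\n\n"
        "# Testing\nEvery bug fix needs a regression test.\n"
    ),
    "glossary.md": "# Done\nA story is done when it is deployed and monitored.\n",
}


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split a Markdown document into (heading, body) pairs."""
    sections: list[tuple[str, str]] = []
    title, body = "", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            if title or body:
                sections.append((title, "\n".join(body).strip()))
            title, body = line.lstrip("# ").strip(), []
        else:
            body.append(line)
    if title or body:
        sections.append((title, "\n".join(body).strip()))
    return sections


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, notes: dict[str, str], top_k: int = 2) -> list[tuple[str, str, str]]:
    """Return the top_k (file, heading, body) sections sharing the most words with the query."""
    query_words = words(query)
    scored = []
    for filename, markdown in notes.items():
        for title, body in split_sections(markdown):
            overlap = len(query_words & words(title + " " + body))
            if overlap:
                scored.append((overlap, filename, title, body))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(filename, title, body) for _, filename, title, body in scored[:top_k]]


best = retrieve("What regression test does a bug fix need?", NOTES, top_k=1)
print(best[0][1])  # Testing
```

Keyword overlap breaks down quickly on larger or noisier knowledge bases, which is where embeddings or a graph may earn their place. The retrieval question stays the same either way.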

3. Procedural memory: how it works

Semantic memory tells the agent what it needs to know. Procedural memory tells it how to do the work.

This includes skills, workflow descriptions, checklists, and step-by-step procedures. A good agent does not only read documentation. It also has executable work patterns. For example:

how to reproduce a bug;
how to review a PR;
how to write release notes;
how to turn a raw research note into something publishable.

This layer is critical because many teams rely too much on the model’s general intelligence. They assume that if the LLM is strong enough, it will figure out the right workflow on its own.

Sometimes it does. Consistently, almost never.

Better systems give the agent skills. Not all at once, because that would overload working memory for no reason. First, the agent sees only a lightweight index of available skills. It loads the detailed instructions only when the task actually calls for them.
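The index-first pattern described above might look like this. It is a minimal sketch, not a reference implementation: the `Skill` type, the skill names, and the instruction texts are hypothetical, and in a real setup the full instructions would more likely live in separate files than in a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    summary: str        # one line, always visible to the agent
    instructions: str   # full procedure, loaded only on demand


# Hypothetical skills for illustration.
SKILLS = {
    "reproduce-bug": Skill(
        "reproduce-bug",
        "Turn a bug report into a failing test.",
        "1. Read the report.\n2. Find the smallest failing input.\n3. Write a failing test.",
    ),
    "review-pr": Skill(
        "review-pr",
        "Review a pull request against project conventions.",
        "1. Read the description.\n2. Check that tests cover the change.\n3. Leave comments.",
    ),
}


def skill_index(skills: dict[str, Skill]) -> str:
    """The lightweight index: names and one-line summaries only."""
    return "\n".join(f"- {s.name}: {s.summary}" for s in skills.values())


def load_skill(skills: dict[str, Skill], name: str) -> str:
    """Load the detailed instructions for one skill when the task calls for it."""
    return skills[name].instructions


index = skill_index(SKILLS)               # goes into working memory up front
details = load_skill(SKILLS, "review-pr")  # fetched only for a review task
print(index)
```

The index costs a handful of tokens per skill. The detailed procedure costs tokens only when it is actually used.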

4. Episodic memory: what it can actually learn from

Episodic memory is about past cases. It does not store general rules, but concrete experiences: what happened, what decision was made, what worked, what failed, and what should be done differently next time.

This is the layer that makes an agent feel less amnesiac.

The naive solution would be to save everything: full chats, full logs, every intermediate step. That becomes useless very quickly. Raw history is not the same as useful memory.

The better approach is distillation. You do not preserve the entire 45-minute debugging session. You preserve the recurring lesson. For example, that in a given project authentication bugs regularly appeared in the middleware layer. Or that a stakeholder consistently means something different by “done” than the delivery team does. Definition of done — sound familiar?

And this is where it gets hard: what matters is not only what the agent stores, but also when it retrieves it and when it forgets it. Without selection and decay, the agent will not learn. It will accumulate.
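Selection and decay can be sketched with a simple scoring rule. The assumptions here are mine, not the article's method: the `Lesson` type, the tag-overlap relevance measure, the 30-day half-life, and the thresholds are all hypothetical values picked for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesson:
    text: str
    tags: frozenset[str]
    age_days: float
    importance: float = 1.0


def decay(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: a lesson loses half its weight every half-life."""
    return 0.5 ** (age_days / half_life_days)


def recall(lessons: list[Lesson], task_tags: set[str], top_k: int = 2) -> list[Lesson]:
    """Selection: return lessons relevant to the task, strongest first."""
    scored = []
    for lesson in lessons:
        relevance = len(lesson.tags & task_tags)
        score = relevance * lesson.importance * decay(lesson.age_days)
        if score > 0:
            scored.append((score, lesson))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [lesson for _, lesson in scored[:top_k]]


def forget(lessons: list[Lesson], min_strength: float = 0.05) -> list[Lesson]:
    """Forgetting: drop lessons whose decayed strength fell below the threshold."""
    return [l for l in lessons if l.importance * decay(l.age_days) >= min_strength]


LESSONS = [
    Lesson("Auth bugs in this project usually start in the middleware layer.",
           frozenset({"auth", "middleware"}), age_days=10, importance=2.0),
    Lesson("The stakeholder means 'deployed to production' by done.",
           frozenset({"planning", "done"}), age_days=5),
    Lesson("The old build server was flaky on Mondays.",
           frozenset({"ci"}), age_days=400, importance=0.5),
]

relevant = recall(LESSONS, {"auth", "login"})
remaining = forget(LESSONS)
print(relevant[0].text)
print(len(remaining))  # 2
```

Note what is stored: one distilled sentence per case, not the session log. The scoring rule is the part worth arguing about in a real system.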

Memory is not a database. It is a decision system.

Many explanations turn memory into a technology question too quickly. SQL, vector databases, graphs, RAG. As if the only problem were where to store the data.

But storage is not the point. Good decisions are.

What is worth preserving? What counts as durable knowledge, and what is just noise from the current session? What should always be close at hand, and what should appear only when it is truly relevant? What should be forgotten after a while?

These are not database questions. They are product and systems-design questions, with hard financial consequences. For companies, the real question is whether the money spent on the agent shows up as profit or loss.
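One way to see memory as a decision system is to write the decisions down as an explicit policy. The sketch below is only that, a sketch: the `Observation` flags and the rule order are hypothetical, and in practice those judgments would come from a person, a model, or both.

```python
from dataclasses import dataclass
from enum import Enum


class Destination(Enum):
    DISCARD = "discard"
    WORKING = "working"
    SEMANTIC = "semantic"
    PROCEDURAL = "procedural"
    EPISODIC = "episodic"


@dataclass
class Observation:
    text: str
    is_repeatable_steps: bool = False  # describes how to do a recurring task
    is_general_rule: bool = False      # stays true beyond this session
    is_lesson_learned: bool = False    # outcome of one concrete case
    needed_now: bool = False           # relevant only to the current task


def route(obs: Observation) -> Destination:
    """Decide where an observation belongs. The rule order is an assumption."""
    if obs.is_repeatable_steps:
        return Destination.PROCEDURAL
    if obs.is_general_rule:
        return Destination.SEMANTIC
    if obs.is_lesson_learned:
        return Destination.EPISODIC
    if obs.needed_now:
        return Destination.WORKING
    return Destination.DISCARD  # session noise: not worth preserving


rule = Observation("All API responses use snake_case.", is_general_rule=True)
noise = Observation("The user said thanks.")
print(route(rule).value, route(noise).value)  # semantic discard
```

Nothing in this policy mentions a storage engine. The storage choice follows from the routing decision, not the other way around.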

Not every agent needs the same memory

Not every use case needs all four memory types at the same depth.

A simple customer support agent that runs through well-defined processes can often work very well with working memory and procedural memory. It does not need a rich episodic layer if there is no real need to learn across multiple sessions.

A coding agent or a complex internal operations agent is different. There, the agent needs to know:

the project rules;
which workflow to follow;
what it learned from previous mistakes;
what current context it has to act within.

In that case, all four memory types are a competitive advantage, not an extra.
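The two examples above can be written down as memory profiles. This is a hypothetical configuration shape, assuming a `MemoryProfile` type that does not come from any particular framework.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MemoryProfile:
    working: bool = True       # every agent has a current context
    semantic: bool = False
    procedural: bool = False
    episodic: bool = False

    def layers(self) -> list[str]:
        """Names of the enabled memory layers, in declaration order."""
        return [name for name, enabled in asdict(self).items() if enabled]


# A support agent running well-defined processes.
SUPPORT_AGENT = MemoryProfile(procedural=True)

# A coding agent that needs rules, workflows, and lessons from past mistakes.
CODING_AGENT = MemoryProfile(semantic=True, procedural=True, episodic=True)

print(SUPPORT_AGENT.layers())  # ['working', 'procedural']
print(CODING_AGENT.layers())   # ['working', 'semantic', 'procedural', 'episodic']
```

Making the profile explicit forces the conversation about which layers a use case actually needs before anyone picks a storage technology.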

The useful question is not whether you have a `skill.md`, but what your agent remembers

The noise around the term “AI agent” is partly there because too many people use it as a behavioral label. If the system appears autonomous, they call it an agent. That is convenient, but it does not tell system designers much.

If we can answer what the AI agent must preserve, when it should retrieve it, and how that changes its next decision, then we are taking the first steps in the right direction.

If we cannot answer those questions, we are probably not building an agent. We are building a more capable chat interface.

The most useful AI systems do not improve because someone adds one more line to the prompt. They improve because the system clearly separates the current context, durable knowledge, execution routines, and the experience worth keeping.

That is real systems design, not prompt magic, and prompt magic is becoming less relevant anyway.

Most teams are still choosing models. The best product teams are already designing memory architecture.
