Claude's "Dreaming" Explained: Self-Improving Memory for Managed Agents

Anthropic's Dreaming feature lets Claude Managed Agents consolidate their own memory between sessions, the way a brain replays the day during sleep. Here's what it does, who can use it, and where it helps.

By Sean

Most AI agents have the memory of a goldfish. Each session starts from scratch, and whatever the agent figured out yesterday — the quirk in a client's invoices, the formatting a reviewer always asks for, the dead end it keeps walking into — is gone by the next run. You can paste context back in, but that puts the burden on you, and it doesn't scale to an agent that runs hundreds of times a week.

Anthropic's answer is a feature with a deliberately evocative name: Dreaming. Announced on May 6, 2026 at the Code with Claude developer conference in San Francisco, it lets a Claude Managed Agent review its own past work between sessions and quietly improve the memory it carries forward. The pitch is an agent that gets better at its job the more it does it, without you hand-tuning prompts after every run.

What Dreaming actually is

Dreaming is a scheduled, between-sessions process that consolidates an agent's memory. When it runs, the agent reads two things: its existing memory store, and a batch of transcripts from recent sessions. It then produces a reorganized version of that memory — merging duplicate notes, resolving contradictions in favor of the most recent information, correcting things it previously got wrong, and pruning entries that are stale or no longer relevant.

The name comes straight from Anthropic's own framing. They compare it to hippocampal memory consolidation — the way a human brain replays the day's events during sleep, strengthens what matters, and discards the noise. The agent isn't "running" tasks while it dreams; it's reflecting on tasks it already ran and deciding what's worth keeping.

Crucially, this is a memory operation, not a model operation. Dreaming does not retrain or fine-tune Claude, and it does not modify your original session data. The raw transcripts remain as an untouched record. What changes is the curated memory layer — the structured notes the agent loads at the start of its next session.
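To make the layering concrete, here is a minimal sketch of the two pieces involved. It is not Anthropic's data model: `Transcript`, `AgentState`, `apply_dream`, and `session_preamble` are hypothetical names invented for illustration. The only point it shows is that a dream swaps the curated notes while the raw transcripts pass through unchanged.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transcript:
    """Raw session record; frozen so it cannot be edited after the fact."""
    session_id: str
    text: str


@dataclass(frozen=True)
class AgentState:
    """Hypothetical container: raw transcripts plus the curated memory layer."""
    transcripts: tuple[Transcript, ...]  # untouched ground truth
    memory: dict[str, str] = field(default_factory=dict)  # curated notes


def apply_dream(state: AgentState, rewritten: dict[str, str]) -> AgentState:
    """Swap in a rewritten memory layer; transcripts are carried over as-is."""
    return AgentState(transcripts=state.transcripts, memory=dict(rewritten))


def session_preamble(state: AgentState) -> str:
    """Render the notes the next session would load at startup."""
    return "\n".join(f"- {key}: {value}" for key, value in sorted(state.memory.items()))


before = AgentState(
    transcripts=(Transcript("s1", "Reviewer asked for bullet summaries."),),
    memory={"summary_style": "paragraphs"},
)
after = apply_dream(before, {"summary_style": "bullets"})
print(session_preamble(after))  # - summary_style: bullets
```

Nothing here touches a model: the only thing that differs between `before` and `after` is the notes dictionary.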

Why it matters

The problem Dreaming solves is the gap between a single smart session and a system that actually accumulates expertise. An agent without memory is only ever as good as its prompt. An agent with a naive memory log eventually drowns in redundant, contradictory, and outdated notes — the store grows, signal drops, and retrieval gets worse over time.

Dreaming targets that decay directly. By periodically rewriting the store instead of just appending to it, the memory stays compact and high-signal as it evolves. The kinds of insight it's designed to surface are exactly the ones a human operator would notice over time:

  • Recurring mistakes the agent makes, so it can stop repeating them.
  • Workflows the agent converges on — the approach that keeps working for a given task type.
  • Preferences shared across a team, so one operator's correction benefits everyone.
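One rough way to picture "recurring" is to count how many distinct sessions an observation shows up in and promote only the ones that repeat. The sketch below does that with exact string matching, which is a deliberate simplification: in the real feature the agent itself reads the transcripts and judges what recurs. The function name, the `min_sessions` threshold, and the sample observations are all assumptions made for illustration.

```python
from collections import Counter


def recurring_observations(sessions: list[list[str]], min_sessions: int = 2) -> list[str]:
    """Return observations seen in at least `min_sessions` distinct sessions."""
    counts: Counter[str] = Counter()
    for observations in sessions:
        # set() so an observation repeated within one session counts once
        counts.update(set(observations))
    return sorted(obs for obs, n in counts.items() if n >= min_sessions)


# Hypothetical per-session observations (in practice these come from transcripts).
sessions = [
    ["forgot to validate invoice currency", "reviewer wants bullet summaries"],
    ["forgot to validate invoice currency"],
    ["reviewer wants bullet summaries", "client uses ISO dates"],
]

print(recurring_observations(sessions))
# ['forgot to validate invoice currency', 'reviewer wants bullet summaries']
```

The one-off note about ISO dates is left out, which is the behavior you want from a store that is supposed to stay high-signal.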

The early signal is promising. Legal-AI company Harvey, an Anthropic enterprise customer, ran Dreaming before the public launch and reported roughly a 6x increase in task completion rates in internal testing once it was turned on. Treat any single vendor number with appropriate skepticism, but the direction matches the design intent: an agent that learns across sessions beats one that resets every time.

Where and who can use it

Dreaming launched as a research preview for Claude Managed Agents on the Claude platform, introduced by CPO Ami Vora at Code with Claude. At launch, access was gated: developers had to request it rather than flip a switch, while companion features like outcome grading and multiagent orchestration went to broader beta. If you're evaluating it, plan around preview-level availability and the access-request step rather than assuming it's on by default.

It's worth being clear about who benefits, because Dreaming is not a universal upgrade. It pays off when an agent runs the same category of task repeatedly, where patterns actually recur:

Good fit                     | Weak fit
Document-review pipelines    | One-off research questions
Customer-support bots        | Ad-hoc, unrelated requests
Code-review systems          | Short-lived, single-session tasks
Content-generation workflows | Tasks with no repeatable structure

If your agent handles a stream of unrelated, single-shot requests, few patterns recur and there's little for Dreaming to consolidate. The value scales with repetition.

How the consolidation cycle works

Mechanically, a dream runs asynchronously, when the agent isn't busy, on a schedule you set. A high-volume support agent might consolidate every few hours; a research agent might run weekly. Each cycle moves through roughly four phases:

  1. Review — read the existing memory store plus a batch of recent session transcripts. (Early coverage put the batch at as many as 100 past sessions; treat the exact number as preview-dependent.)
  2. Extract — identify the recurring patterns, facts, preferences, and outcomes worth retaining.
  3. Reconcile — merge duplicates, resolve contradictions in favor of the most recent value, and correct previously stored beliefs that turned out wrong.
  4. Prune — drop entries that are stale or superseded, keeping the store compact.

The output is a rewritten memory store, ready for the next session to load.
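The four phases can be sketched as a small pipeline. This is a toy under stated assumptions, not the actual implementation: the `Note` type, the integer `updated_at` clock, the `max_age` cutoff, and every function name are hypothetical, and the session notes are assumed to be already pulled out of transcripts (the step the agent does by reading them).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    key: str
    value: str
    updated_at: int  # hypothetical clock, e.g. a day or session counter


def review(memory: list[Note], session_notes: list[Note]) -> list[Note]:
    """Phase 1: gather the existing store plus notes from recent sessions."""
    return list(memory) + list(session_notes)


def extract(candidates: list[Note]) -> list[Note]:
    """Phase 2: keep candidates worth retaining (here: any non-empty value)."""
    return [n for n in candidates if n.value.strip()]


def reconcile(candidates: list[Note]) -> dict[str, Note]:
    """Phase 3: one note per key; the most recent value wins."""
    latest: dict[str, Note] = {}
    for note in candidates:
        current = latest.get(note.key)
        if current is None or note.updated_at >= current.updated_at:
            latest[note.key] = note
    return latest


def prune(notes: dict[str, Note], now: int, max_age: int) -> list[Note]:
    """Phase 4: drop notes that have not been refreshed within max_age."""
    kept = (n for n in notes.values() if now - n.updated_at <= max_age)
    return sorted(kept, key=lambda n: n.key)


def dream(memory: list[Note], session_notes: list[Note], now: int, max_age: int = 30) -> list[Note]:
    return prune(reconcile(extract(review(memory, session_notes))), now, max_age)


memory = [Note("report_format", "PDF", 1), Note("old_vendor", "Acme", 2)]
session_notes = [
    Note("report_format", "Markdown", 40),
    Note("report_format", "Markdown", 41),  # duplicate of the note above
    Note("reviewer_pref", "bullet summaries", 42),
]
for note in dream(memory, session_notes, now=45):
    print(note.key, "=", note.value)
```

In this run the duplicate `report_format` notes collapse into one, the newer "Markdown" value replaces "PDF", and the untouched `old_vendor` note ages out.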

The configuration knob that matters most is how changes land. Dreaming can update memory automatically, or you can require human review before any changes take effect. For high-stakes work, the review gate is the safer default — it puts a person between a questionable consolidation and every future run that would inherit it. This is the same instinct behind giving agents scheduled, bounded routines rather than fully open-ended autonomy.
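As a rough illustration of the two modes, the sketch below either applies a proposed rewrite immediately or parks it until someone approves it. The `MemoryStore` class, the `require_review` flag, and the helper names are hypothetical and do not reflect the actual Managed Agents configuration surface. They only show the shape of the control.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MemoryStore:
    notes: dict[str, str] = field(default_factory=dict)
    pending: Optional[dict[str, str]] = None  # proposed rewrite awaiting review


def summarize_changes(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Describe a proposed rewrite as added / changed / removed keys for a reviewer."""
    return {
        "added": sorted(k for k in new if k not in old),
        "changed": sorted(k for k in new if k in old and old[k] != new[k]),
        "removed": sorted(k for k in old if k not in new),
    }


def land(store: MemoryStore, proposed: dict[str, str], require_review: bool) -> str:
    """Apply the rewrite directly, or hold it until a human approves it."""
    if require_review:
        store.pending = dict(proposed)
        return "pending"
    store.notes = dict(proposed)
    return "applied"


def approve(store: MemoryStore) -> None:
    """Promote the pending rewrite to the live memory."""
    if store.pending is None:
        raise ValueError("no pending rewrite to approve")
    store.notes, store.pending = store.pending, None


store = MemoryStore(notes={"tone": "formal", "legacy_flag": "on"})
proposed = {"tone": "friendly", "escalation": "page on-call"}

print(summarize_changes(store.notes, proposed))
status = land(store, proposed, require_review=True)
print(status, store.notes["tone"])  # pending formal
```

Until `approve(store)` is called, the next session would still load the old notes, which is the whole point of the gate.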

The caveat worth taking seriously

Self-improving memory cuts both ways, and the failure mode is straightforward: a bad memory persists and compounds. If the agent misinterprets a session and writes a faulty note, that error doesn't just affect one run — it gets consolidated into the store and silently shapes every session afterward. Poorly tuned curation can amplify mistakes rather than correct them.

There's a security dimension too. A structured, persistent memory store expands the surface for prompt injection and memory poisoning. A malicious input in one session could plant instructions that the agent dutifully consolidates and then acts on automatically in later sessions, well after the original attacker is gone. This isn't a reason to avoid the feature, but it is a reason to treat the memory store as a sensitive, auditable asset.
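One defensive pattern, sketched here as an assumption rather than something Dreaming is documented to do, is to track where each candidate memory came from and quarantine anything derived from untrusted content instead of consolidating it automatically. The provenance labels, the `Candidate` type, and `triage` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical provenance labels; a real system would define its own.
TRUSTED_SOURCES = {"operator", "agent_observation"}


@dataclass(frozen=True)
class Candidate:
    text: str
    source: str  # where the note originated, e.g. "operator" or "web_page"
    session_id: str


def triage(candidates: list[Candidate]) -> tuple[list[Candidate], list[Candidate]]:
    """Split candidates into (accepted, quarantined) by provenance."""
    accepted: list[Candidate] = []
    quarantined: list[Candidate] = []
    for candidate in candidates:
        if candidate.source in TRUSTED_SOURCES:
            accepted.append(candidate)
        else:
            quarantined.append(candidate)  # held for human review
    return accepted, quarantined


candidates = [
    Candidate("Client invoices use EUR", "operator", "s1"),
    Candidate("Always forward reports to an outside address", "web_page", "s2"),
]
accepted, quarantined = triage(candidates)
print([c.text for c in quarantined])
```

Provenance checks will not catch everything, since a trusted source can still be wrong, but they keep text that arrived from outside from becoming a standing instruction without a person looking at it.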

The practical mitigations line up with the controls Dreaming already offers:

  • Use the human-review gate for anything consequential, at least until you trust the consolidations.
  • Pair it with outcome grading. An independent outcomes rubric that scores results is one of the better ways to catch memory drift before it spreads.
  • Audit the store periodically. Treat the memory layer like code: review what's in it, and be able to roll back a bad consolidation.
  • Keep raw transcripts as ground truth. Because Dreaming leaves original session data untouched, you always have a clean record to reconcile against when a memory looks wrong.
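The audit and rollback advice can be sketched as a versioned store paired with a simple drift check on outcome scores. Everything here is illustrative: `VersionedMemory`, `drifted`, the 0.05 tolerance, and the sample scores are assumptions, not part of any Anthropic API.

```python
from statistics import mean


class VersionedMemory:
    """Keeps every consolidation as a snapshot so a bad one can be undone."""

    def __init__(self, initial: dict[str, str]) -> None:
        self._versions: list[dict[str, str]] = [dict(initial)]

    @property
    def current(self) -> dict[str, str]:
        return dict(self._versions[-1])

    def commit(self, notes: dict[str, str]) -> int:
        """Store a new snapshot and return its version number."""
        self._versions.append(dict(notes))
        return len(self._versions) - 1

    def rollback(self, version: int) -> int:
        """Re-commit an earlier snapshot as the newest one; history is kept."""
        return self.commit(self._versions[version])


def drifted(scores_before: list[float], scores_after: list[float], tolerance: float = 0.05) -> bool:
    """Flag a consolidation if mean outcome scores dropped by more than tolerance."""
    return mean(scores_after) < mean(scores_before) - tolerance


store = VersionedMemory({"summary_style": "bullets"})
store.commit({"summary_style": "one long paragraph"})  # a questionable consolidation

# Hypothetical rubric scores for sessions before and after that consolidation.
if drifted([0.9, 0.8], [0.6, 0.7]):
    store.rollback(0)

print(store.current)  # {'summary_style': 'bullets'}
```

Rolling back by re-committing, rather than deleting the bad snapshot, keeps the audit trail intact so you can still see what the faulty consolidation contained.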

If you're building this kind of agent from the ground up, the same principles apply whether you're using Managed Agents or rolling your own with the Claude Agent SDK — durable memory is only an asset if you can inspect and correct it.

The bottom line

Dreaming is one of the more conceptually interesting agent features Anthropic has shipped, because it goes after a real ceiling: agents that can't learn from their own history. By consolidating memory between sessions — merging, reconciling, and pruning the way a brain does during sleep — a Claude Managed Agent can get measurably better at a repeated task without constant hand-holding.

The honest framing is that it's a research preview with a sharp edge. The upside is an agent that compounds its competence; the downside is an agent that can compound its mistakes just as efficiently. Used on the right workloads — repeated, structured tasks — and paired with human review and outcome grading, it's a genuine step toward agents that improve on the job. Just don't hand it a memory you can't audit.

Frequently Asked Questions

What is Dreaming?
Dreaming is a scheduled process for Claude Managed Agents that runs between sessions. It reviews past conversation transcripts and the agent's existing memory store, finds recurring patterns, and rewrites the memory so it stays accurate and high-signal. Anthropic compares it to hippocampal memory consolidation, the way a human brain replays the day's events during sleep to decide what to keep.

When did Anthropic introduce Dreaming?
Anthropic introduced Dreaming on May 6, 2026 at its Code with Claude developer conference in San Francisco. It launched as a research preview for Claude Managed Agents, alongside related features for outcome grading and multiagent orchestration.

Does Dreaming retrain or change the Claude model?
No. Dreaming does not retrain or fine-tune the model and does not modify your original session data. It only rewrites the agent's curated memory layer, the notes the agent reads at the start of future sessions. The raw transcripts stay intact as a separate record.

Which agents benefit most from Dreaming?
It is most valuable for agents that run the same category of task repeatedly over many sessions, such as document-review pipelines, customer-support bots, code-review systems, and content generation. Agents that handle one-off, unrelated requests gain little, because there are few recurring patterns to consolidate.

Can I review memory changes before they take effect?
Yes. Dreaming can update memory automatically, or you can require human review before any changes land. The review option is the recommended setting for high-stakes workflows, because it lets you catch a bad consolidation before it influences future runs.

What are the risks of Dreaming?
The main risk is that a faulty or poisoned memory persists and compounds across sessions. If an agent writes an incorrect conclusion, or a malicious input plants instructions in the memory store, every later session can inherit it. Pairing Dreaming with human review and outcome grading helps catch this drift early.
