Most AI agents have the memory of a goldfish. Each session starts from scratch, and whatever the agent figured out yesterday — the quirk in a client's invoices, the formatting a reviewer always asks for, the dead end it keeps walking into — is gone by the next run. You can paste context back in, but that puts the burden on you, and it doesn't scale to an agent that runs hundreds of times a week.
Anthropic's answer is a feature with a deliberately evocative name: Dreaming. Announced on May 6, 2026 at the Code with Claude developer conference in San Francisco, it lets a Claude Managed Agent review its own past work between sessions and quietly improve the memory it carries forward. The pitch is an agent that gets better at its job the more it does it, without you hand-tuning prompts after every run.
What Dreaming actually is
Dreaming is a scheduled, between-sessions process that consolidates an agent's memory. When it runs, the agent reads two things: its existing memory store, and a batch of transcripts from recent sessions. It then produces a reorganized version of that memory — merging duplicate notes, resolving contradictions in favor of the most recent information, correcting things it previously got wrong, and pruning entries that are stale or no longer relevant.
The name comes straight from Anthropic's own framing. They compare it to hippocampal memory consolidation — the way a human brain replays the day's events during sleep, strengthens what matters, and discards the noise. The agent isn't "running" tasks while it dreams; it's reflecting on tasks it already ran and deciding what's worth keeping.
Crucially, this is a memory operation, not a model operation. Dreaming does not retrain or fine-tune Claude, and it does not modify your original session data. The raw transcripts remain as an untouched record. What changes is the curated memory layer — the structured notes the agent loads at the start of its next session.
Why it matters
The problem Dreaming solves is the gap between a single smart session and a system that actually accumulates expertise. An agent without memory is only ever as good as its prompt. An agent with a naive memory log eventually drowns in redundant, contradictory, and outdated notes — the store grows, signal drops, and retrieval gets worse over time.
Dreaming targets that decay directly. By periodically rewriting the store instead of just appending to it, the memory stays compact and high-signal as it evolves. The kinds of insight it's designed to surface are exactly the ones a human operator would notice over time:
- Recurring mistakes the agent makes, so it can stop repeating them.
- Workflows the agent converges on — the approach that keeps working for a given task type.
- Preferences shared across a team, so one operator's correction benefits everyone.
The early signal is real. Legal-AI company Harvey, an Anthropic enterprise customer, ran Dreaming before the public launch and reported roughly a 6x increase in task completion rates in internal testing once it was turned on. Treat any single vendor number with appropriate skepticism, but the direction matches the design intent: an agent that learns across sessions beats one that resets every time.
Where and who can use it
Dreaming launched as a research preview for Claude Managed Agents on the Claude platform, introduced by CPO Ami Vora at Code with Claude. At launch, access was gated — developers request it rather than flipping a switch — while companion features like outcome grading and multiagent orchestration went to broader beta. If you're evaluating it, plan around preview-level availability and the access-request step rather than assuming it's on by default.
It's worth being clear about who benefits, because Dreaming is not a universal upgrade. It pays off when an agent runs the same category of task repeatedly, where patterns actually recur:
| Good fit | Weak fit |
|---|---|
| Document-review pipelines | One-off research questions |
| Customer-support bots | Ad-hoc, unrelated requests |
| Code-review systems | Short-lived, single-session tasks |
| Content-generation workflows | Tasks with no repeatable structure |
If your agent handles a stream of unrelated, single-shot requests, there's little to consolidate, and Dreaming has nothing to latch onto. The value scales with repetition.
How the consolidation cycle works
Mechanically, a dream runs asynchronously, when the agent isn't busy, on a schedule you set. A high-volume support agent might consolidate every few hours; a research agent might run weekly. Each cycle moves through roughly four phases:
- Review — read the existing memory store plus a batch of recent session transcripts. (Early coverage put the batch at as many as 100 past sessions; treat the exact number as preview-dependent.)
- Extract — identify the recurring patterns, facts, preferences, and outcomes worth retaining.
- Reconcile — merge duplicates, resolve contradictions in favor of the most recent value, and correct previously stored beliefs that turned out wrong.
- Prune — drop entries that are stale or superseded, keeping the store compact.
The output is a rewritten memory store, ready for the next session to load.
The configuration knob that matters most is how changes land. Dreaming can update memory automatically, or you can require human review before any changes take effect. For high-stakes work, the review gate is the safer default — it puts a person between a questionable consolidation and every future run that would inherit it. This is the same instinct behind giving agents scheduled, bounded routines rather than fully open-ended autonomy.
The caveat worth taking seriously
Self-improving memory cuts both ways, and the failure mode is straightforward: a bad memory persists and compounds. If the agent misinterprets a session and writes a faulty note, that error doesn't just affect one run — it gets consolidated into the store and silently shapes every session afterward. Poorly tuned curation can amplify mistakes rather than correct them.
There's a security dimension too. A structured, persistent memory store expands the surface for prompt injection and memory poisoning. A malicious input in one session could plant instructions that the agent dutifully consolidates and then acts on automatically in later sessions, well after the original attacker is gone. This isn't a reason to avoid the feature, but it is a reason to treat the memory store as a sensitive, auditable asset.
The practical mitigations line up with the controls Dreaming already offers:
- Use the human-review gate for anything consequential, at least until you trust the consolidations.
- Pair it with outcome grading. An independent outcomes rubric that scores results is one of the better ways to catch memory drift before it spreads.
- Audit the store periodically. Treat the memory layer like code: review what's in it, and be able to roll back a bad consolidation.
- Keep raw transcripts as ground truth. Because Dreaming leaves original session data untouched, you always have a clean record to reconcile against when a memory looks wrong.
If you're building this kind of agent from the ground up, the same principles apply whether you're using Managed Agents or rolling your own with the Claude Agent SDK — durable memory is only an asset if you can inspect and correct it.
The bottom line
Dreaming is one of the more conceptually interesting agent features Anthropic has shipped, because it goes after a real ceiling: agents that can't learn from their own history. By consolidating memory between sessions — merging, reconciling, and pruning the way a brain does during sleep — a Claude Managed Agent can get measurably better at a repeated task without constant hand-holding.
The honest framing is that it's a research preview with a sharp edge. The upside is an agent that compounds its competence; the downside is an agent that can compound its mistakes just as efficiently. Used on the right workloads — repeated, structured tasks — and paired with human review and outcome grading, it's a genuine step toward agents that improve on the job. Just don't hand it a memory you can't audit.