Anthropic’s ‘Dreaming’ Is Self-Improving AI — Minus the Part Everyone Worries About




“AI agents that dream and learn from their mistakes while you sleep” is a real headline that ran this month. It is also accurate. Anthropic shipped exactly that on May 6 at its Code with Claude conference. The phrase pattern-matches to every recursive-self-improvement fear in the AI safety discourse — autonomous systems rewriting themselves, capability gains nobody authorized, the runaway loop.

The actual mechanism is so mundane it loops back around to being genuinely useful. No model weights change. No capabilities are added. The agent does not modify itself in any sense that should worry anyone. What “dreaming” actually does is automate the thing every competent professional does on a Friday afternoon: review the week’s work, notice what kept going wrong, and write better notes for next time.

That is the whole feature. And it is one of the more important agent releases of the year, precisely because it is that boring.

What Dreaming Actually Is

Dreaming is a scheduled background process available in research preview for Claude Managed Agents. When a developer enables it, the platform runs a recurring job on a developer-set schedule, between the agent's actual work sessions. The job reads through the agent's recent sessions and its memory store, identifies patterns, and rewrites the memory store: condensing what has gone stale, promoting what has become load-bearing.

The output is not a new model. It is plain text. The agent writes its learnings as notes and structured “playbooks” that future sessions reference the same way they would reference any other context. If you opened the memory store after a dreaming cycle, you would see readable English: “when processing contract redlines, the client always wants the summary table first” or “the PDF parser fails silently on scanned documents older than 2015 — check page count against expected length.”
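Anthropic has not published the memory store's format, so the following is a purely illustrative sketch of how one dreaming-produced playbook entry might look if represented programmatically. Every field name below is an assumption, not documented API; the point it illustrates is the one the article makes, that the whole artifact is human-readable text.

    # Hypothetical sketch of a playbook entry after a dreaming cycle.
    # Nothing here is documented API -- the structure is an illustrative
    # assumption. The key property: every value is plain, readable text.
    playbook_entry = {
        "title": "Scanned-PDF parsing guard",
        "learned_from": "recurring failures across recent sessions",  # hypothetical provenance
        "instruction": (
            "The PDF parser fails silently on scanned documents older than "
            "2015 -- check page count against expected length before trusting output."
        ),
    }

    # Because entries are plain text, a human can read, edit, or delete
    # any learning before the next session references it.
    print(f"{playbook_entry['title']}: {playbook_entry['instruction']}")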

This matters because it locates the entire mechanism in the context window, not the weights. Everything dreaming does is legible, auditable, and editable by a human. You can read what your agent learned. You can delete a learning that is wrong. You can edit a playbook by hand. None of that is true of a system that improves by changing model weights.

The Three Patterns It Looks For

The dreaming process scans for exactly three kinds of pattern, and the specificity is the point.

Recurring mistakes. Errors the agent keeps making across different jobs. If an agent repeatedly mishandles the same edge case, dreaming notices the repetition and writes an explicit note telling future sessions to handle it differently. This is the single highest-value pattern type because recurring errors are the main reason agents fail to ship into production.

Converged workflows. Sequences of steps the agent independently arrives at across many different jobs. When an agent keeps solving a class of problem the same way, dreaming codifies that sequence into a reusable playbook so future sessions start from the proven approach instead of rediscovering it.

Emergent team preferences. Preferences that show up across a fleet of agents working for the same team. If multiple agents independently learn that a given team wants outputs in a specific format, dreaming promotes that into a shared learning rather than leaving each agent to discover it separately.

Notice what is not on this list: goals, values, capabilities, or anything the agent was not already doing. Dreaming makes the agent more consistent at its existing job. It does not change what the job is.
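Anthropic has not described how the scan is implemented, but the first pattern type is easy to picture. Here is a minimal sketch, assuming a hypothetical session-log format and an arbitrary repetition threshold; none of this is documented API.

    from collections import Counter

    # Hypothetical sketch of recurring-mistake detection. The real dreaming
    # process is not public; the session format, error signatures, and the
    # threshold below are illustrative assumptions.
    def find_recurring_mistakes(sessions, min_occurrences=3):
        """Count error signatures across sessions; promote repeats to plain-text notes."""
        signatures = Counter(
            error["signature"]              # e.g. "silent_pdf_parse_failure"
            for session in sessions
            for error in session.get("errors", [])
        )
        # A mistake that recurs across independent sessions becomes an
        # explicit note for future sessions to read.
        return [
            f"Recurring issue ({count}x): {sig} -- handle this case explicitly."
            for sig, count in signatures.items()
            if count >= min_occurrences
        ]

    sessions = [
        {"errors": [{"signature": "silent_pdf_parse_failure"}]},
        {"errors": [{"signature": "silent_pdf_parse_failure"}]},
        {"errors": [{"signature": "silent_pdf_parse_failure"}, {"signature": "one_off_timeout"}]},
    ]
    print(find_recurring_mistakes(sessions))
    # ['Recurring issue (3x): silent_pdf_parse_failure -- handle this case explicitly.']

The real process presumably works over unstructured transcripts rather than tidy error objects, but the logic is the same: repetition across independent sessions is the signal that promotes a failure into an explicit note.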

What It Is Not

Because the word “dreaming” and the phrase “self-improving agents” do a lot of unhelpful work, the negative space is worth stating plainly.

It is not recursive self-improvement in the AI-safety sense. The agent’s capabilities are fixed by the underlying model. Dreaming cannot make the agent smarter, only more consistent. The ceiling is wherever Claude Opus 4.7 or whichever model powers the agent already sits.

It is not autonomous goal modification. The agent does not decide what it wants. It reviews how well it did the thing it was already told to do.

It is not opaque. Every learning is plain text in a memory store a human can read. This is the opposite of the black-box self-modification that the term “dreaming” might suggest. If anything, dreaming makes agents more inspectable, because the agent’s accumulated knowledge is now written down instead of implicit in a thousand past sessions nobody will reread.

It is not available to standard API users. Dreaming is a Claude Managed Agents feature. Developers using the Messages API directly do not get it. This is a managed-platform capability, which is itself a signal about where Anthropic thinks the durable agent business is.

The Customer Results

Two early customers reported numbers, and the numbers are large enough to take seriously.

Harvey, the legal AI company, reported that task completion rates increased roughly 6x after implementing dreaming. A 6x improvement is not a tuning gain. It is the difference between an agent that mostly fails and an agent that mostly works. The most plausible read: Harvey’s agents were hitting the same recurring legal-document edge cases over and over, and dreaming let the fleet write those down once instead of failing on them indefinitely.

Wisedocs, a medical document review company, cut document review time by 50% using outcomes-based evaluation alongside dreaming. Half the time on a document-review workflow is a margin change that shows up directly in unit economics.

Both results share a shape. These are companies running fleets of agents against a bounded, repetitive domain with clear right answers. That is exactly the environment where “notice the recurring mistake, write it down, stop making it” compounds fastest. Dreaming is unlikely to deliver 6x on open-ended creative work. It is very likely to deliver large gains on repetitive, high-volume, verifiable agent workflows.

How to Actually Turn It On

Dreaming ships as one of three new Claude Managed Agents features announced May 6, alongside sub-agent coordination and rubric-based outcome evaluation. To use it:

1. Run your agent as a Claude Managed Agent rather than through the raw Messages API.
2. Enable dreaming in the agent configuration.
3. Set the schedule for how often the dreaming process runs. For a high-volume agent, nightly makes sense; for a lower-volume one, weekly is enough.
4. Make sure your agent has a memory store configured, because dreaming operates on it.
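Anthropic has not published the Managed Agents configuration schema for the research preview, so the sketch below is hypothetical; every field name is an assumption, shown only to make the steps concrete.

    # Hypothetical agent configuration sketch. The Managed Agents schema
    # is not public; all field names below are illustrative assumptions.
    agent_config = {
        "agent_id": "contract-redline-reviewer",  # hypothetical agent
        "memory_store": {"enabled": True},        # required: dreaming operates on the store
        "dreaming": {
            "enabled": True,
            "schedule": "nightly",  # nightly for high-volume agents; "weekly" for low-volume
        },
    }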

Do this first: pick one agent that runs a high-volume, repetitive workflow and has been failing at a consistent rate you have not been able to tune away. Enable dreaming on that agent with a nightly schedule. Run it for two weeks. Then open the memory store and read what it wrote. The learnings it surfaced are a free audit of where your agent was quietly failing — useful even if you decide to encode the fixes manually instead.

Skip it if your agents do low-volume, high-variance work where no two jobs resemble each other. Dreaming needs repetition to find patterns. An agent that does something different every time gives it nothing to converge on. For that profile, multi-agent debate approaches or a stronger base model are the better lever.

Why the Boring Version Is the Useful Version

The reason dreaming matters is not that it is exciting. It is that it solves the actual bottleneck for production agents.

The thing keeping AI agents out of production in 2026 is not raw capability. The base models are good enough. The blocker is consistency — agents that work 85% of the time and fail the same way the other 15% without anyone able to make the failures stop. Every team that has tried to ship an agent has hit this wall. The fixes have been manual: a human notices the failure pattern, writes a better prompt, redeploys, repeats.

Dreaming automates that loop. It is the agent doing its own post-incident review and updating its own runbook. That is not a capability gain. It is a reliability gain, and reliability is the thing standing between “impressive demo” and “deployed system.” The agent stack has been waiting for exactly this layer.

The boring framing is the correct framing. “Self-improving AI” sells headlines. “Automated memory curation that makes your agents stop repeating mistakes” sells deployments. Anthropic shipped the second thing and let the press write the first.

FAQ

Does dreaming change the Claude model?
No. Dreaming does not modify model weights in any way. It rewrites the agent’s plain-text memory store and playbooks. The underlying model is exactly the same before and after a dreaming cycle. All gains come from the agent having better-organized context, not a better model.

Can I see what my agent learned during dreaming?
Yes. The entire output is plain-text notes and structured playbooks in the agent’s memory store. You can read every learning, edit any of them by hand, and delete ones that are wrong. This is one of the main advantages over weight-based approaches — the learning is fully legible and auditable.

Is dreaming available through the standard Claude API?
No. Dreaming is a Claude Managed Agents feature in research preview as of May 6, 2026. Developers using the Messages API directly do not have access. You need to run your agent on the Managed Agents platform.

How often does the dreaming process run?
It is scheduled, and the developer sets the cadence. High-volume agents benefit from a nightly schedule; lower-volume agents can run it weekly. The process runs between the agent’s actual work sessions so it does not add latency to live jobs.

What kind of agent benefits most from dreaming?
Agents running high-volume, repetitive workflows in bounded domains with verifiable right answers — legal document processing, medical record review, structured data extraction. These environments give dreaming the repetition it needs to find patterns. Low-volume, high-variance agents see little benefit because there is nothing for the process to converge on.

How does this relate to the other May 6 Managed Agents features?
Dreaming launched alongside sub-agent coordination (agents directing other agents) and rubric-based outcome evaluation (scoring agent work against defined criteria). The three are complementary: rubric-based evaluation gives dreaming cleaner signal about what counts as a mistake, and sub-agent coordination gives dreaming more sessions to learn from. Together they form Anthropic’s current answer to production agent reliability.

Ty Sutherland

Ty Sutherland is the Chief Editor of AI Rising Trends. Living in what he believes to be the most transformative era in history, Ty is captivated by the potential of emerging technologies like the metaverse and artificial intelligence, and envisions a future where these innovations enhance every facet of human existence. He champions the adoption of AI for humanity's collective betterment, emphasizing the urgency of integrating AI into our professional and personal lives and cautioning against the risk of obsolescence for those who lag behind. AI Rising Trends is dedicated to spotlighting the latest AI advancements and offering guidance on harnessing these tools to elevate one's life.
