Claude Code Ultracode: The Workflow-First, xHigh-Effort Mode Explained

Ultracode is Claude Code's setting that pairs xhigh reasoning with automatic dynamic-workflow orchestration, so Claude plans and runs multi-agent workflows for every substantive task. Here's what it actually does, when it's worth the tokens, and how to turn it on.

By Sean

Most AI coding sessions are a conversation: you ask, Claude does a turn, you steer, repeat. That's a great default. But some tasks aren't shaped like a conversation at all. A codebase-wide bug sweep, a 500-file migration, or a research question whose sources all need cross-checking don't want one assistant taking turns. They want a plan, run at scale, with the results reconciled before anything comes back to you.

That's what ultracode is for. It's the orchestration-by-default setting in Claude Code — xhigh reasoning paired with automatic multi-agent workflows — and once you understand what it actually does under the hood, it changes which problems you're willing to throw at the tool.

What ultracode actually is

Here's the precise definition, straight from the docs:

Ultracode is a Claude Code setting rather than a model effort level: it sends xhigh to the model and additionally has Claude orchestrate dynamic workflows for substantive tasks. It applies to the current session only.

Two things are worth pinning down immediately.

First, there is no effort level called "ultra." The actual ladder is low, medium, high, xhigh, max. Ultracode sits on top of that ladder as a setting: it pushes the model to xhigh reasoning and adds automatic workflow orchestration. Because it depends on xhigh, it only shows up in the /effort menu on models that support that level, currently Fable 5, Opus 4.8, and Opus 4.7. (Opus 4.6 and Sonnet 4.6 top out at high/max with no xhigh, so no ultracode there.)

Second, "orchestrate dynamic workflows" is the load-bearing phrase. With ultracode on, Claude doesn't wait for you to ask for parallelism:

With it on, Claude plans a workflow for each substantive task instead of waiting for you to ask.

A single request can even fan into several workflows in a row: one to understand the code, one to make the change, one to verify it.

When and why to reach for it

Ultracode earns its keep when a task outgrows what one conversation can coordinate, or when you want the orchestration codified as a re-runnable script. The canonical examples from the docs:

  • A codebase-wide bug sweep across hundreds of files.
  • A large migration (think 500 files) where the same transform applies everywhere.
  • A research question whose sources have to be cross-checked against each other.
  • A hard plan worth drafting from several independent angles before you commit.

The "why" is more interesting than just "it runs more agents." Workflows let Claude apply repeatable quality patterns a single pass can't. Independent agents can adversarially review each other's findings before they're reported. A plan can be drafted from multiple angles and weighed. The result isn't just faster, it's more trustworthy, because something checked it.

A useful framing from secondary coverage: ultracode earns its cost when the price of missing something exceeds the price of the compute. A security audit where one missed finding is expensive is a perfect fit. Renaming a variable is not.

How it works under the hood

This is where dynamic workflows get genuinely clever. A dynamic workflow is a JavaScript program Claude writes on the fly to coordinate subagents. The runtime executes that script in the background while your chat session stays responsive.

The key architectural move:

A workflow moves the plan into code. With subagents, skills, and agent teams, Claude is the orchestrator... and every result lands in a context window. A workflow script holds the loop, the branching, and the intermediate results itself, so Claude's context holds only the final answer.

That's the whole trick. In a normal turn-by-turn agent pattern, every subagent result flows back into Claude's context window, so you hit the context ceiling fast. In a workflow, the loop, the branching, and all the intermediate results live in script variables outside the context window. Only the final answer returns to the conversation. That's why a workflow can coordinate dozens or hundreds of agents without running out of room.

And the economics are nicer than you'd expect: the coordinating JavaScript spends zero model tokens, because it's plain deterministic code the runtime executes, not the model. Tokens are spent inside the agent() calls, the workers that actually do the reading, writing, and reasoning.
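To make the context-window point concrete, here is a minimal JavaScript sketch of the idea, not the actual runtime. The agent() below is a local stub standing in for the real model-backed worker, its return shape and the token figure are invented for illustration, and runWorkflow is a hypothetical name.

```javascript
// Hypothetical stand-in for a worker agent. The real agent() is supplied by
// the Claude Code runtime and calls a model; this stub returns canned data.
async function agent(prompt) {
  return { prompt, finding: `checked: ${prompt}`, tokensUsed: 1200 }; // made-up figure
}

// The "workflow script": the loop, the branching, and every intermediate
// result live in ordinary variables here, outside any chat context.
async function runWorkflow(files) {
  const intermediate = [];
  for (const file of files) {
    const result = await agent(`audit ${file}`);
    if (result.finding) intermediate.push(result);
  }
  // Only this compact summary would go back to the conversation.
  return {
    filesChecked: intermediate.length,
    agentTokens: intermediate.reduce((sum, r) => sum + r.tokensUsed, 0),
  };
}

runWorkflow(["a.js", "b.js", "c.js"]).then((summary) => console.log(summary));
// summary is { filesChecked: 3, agentTokens: 3600 }
```

The coordinating loop itself costs nothing in this sketch; the only "tokens" counted are the ones each stub agent reports, which mirrors where the spend sits in a real run.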

A few practical mechanics worth knowing:

  • Subagents are the worker primitive. Each runs with its own context window and a focused, isolated goal, which mitigates the "agentic laziness" and goal drift that creep in the longer Claude works in one long context.
  • The script itself has no filesystem or shell access. Agents do the reading, writing, and running; the script only coordinates them.
  • Limits: up to 16 concurrent agents (fewer on machines with limited CPU cores), and a lifetime cap of 1,000 agents total per run as a runaway backstop.
  • Runs are resumable within the same session (completed agents return cached results), and every run's script is written to a file under ~/.claude/projects/ so you can read, diff, edit, or relaunch it. Exit Claude Code, though, and the workflow restarts fresh next session.
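The two limits above are easy to picture as a small scheduler. This is a sketch of how such a limiter could look, assuming a simple queue; it is not how the Claude Code runtime is implemented, and createLimiter, run, and stats are hypothetical names.

```javascript
// Illustrative limiter only. The two numbers are the limits quoted above;
// every name and the queueing strategy are assumptions.
const MAX_CONCURRENT = 16;
const MAX_AGENTS_PER_RUN = 1000;

function createLimiter(maxConcurrent = MAX_CONCURRENT, maxTotal = MAX_AGENTS_PER_RUN) {
  let active = 0;   // agents running right now
  let launched = 0; // agents accepted over the whole run
  let peak = 0;     // highest concurrency observed
  const waiting = [];

  function next() {
    if (active >= maxConcurrent || waiting.length === 0) return;
    const { task, resolve, reject } = waiting.shift();
    active += 1;
    peak = Math.max(peak, active);
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // a slot freed up, so start the next queued agent
      });
  }

  return {
    run(task) {
      if (launched >= maxTotal) {
        // Runaway backstop: refuse to start more agents in this run.
        return Promise.reject(new Error("lifetime agent cap reached"));
      }
      launched += 1;
      return new Promise((resolve, reject) => {
        waiting.push({ task, resolve, reject });
        next();
      });
    },
    stats: () => ({ launched, peak }),
  };
}

// Demo: 40 stub agents are queued, but no more than 16 run at once.
const limiter = createLimiter();
const stubAgent = (i) => () => new Promise((done) => setTimeout(() => done(i), 5));
Promise.all(Array.from({ length: 40 }, (_, i) => limiter.run(stubAgent(i)))).then(() =>
  console.log(limiter.stats())
);
```

In this demo the stats should report 40 agents launched with a peak of 16 running together.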

How to turn it on

There are exactly two activation paths, and the difference matters — especially for your token bill.

Per-command (one task only) — the one you'll usually want. Just include the keyword ultracode anywhere in a single prompt:

ultracode: audit every API route in src/ for missing auth checks

That runs only that one command as a workflow and then goes straight back to normal. It does not change your session's effort level, so the very next message is an ordinary turn again — no need to remember to switch anything back. Asking in plain English, "use a workflow" or "run a workflow," does the same thing. (Pre-v2.1.160 the keyword was workflow; natural language works in both.)

This per-command form is the safer default, and it's the main lever you have for controlling cost. You opt into the expensive, many-agent treatment exactly when a task is worth it, one command at a time, instead of paying for it on every message.

Session-scoped (standing default). If you have a whole session of heavy work, set the effort instead:

/effort ultracode

Now Claude plans a workflow for every substantive task for the rest of the session. It lasts only the current session and resets on a new one. This is powerful but expensive — every request now fans out, so don't leave it on for routine work:

Drop back with /effort high when you return to routine work.

Note that ultracode is a different beast from the similarly named ultrathink. ultrathink asks for deeper reasoning on one turn via an in-context instruction, but it does not change the API effort level and does not trigger a workflow. /effort ultracode does both.

It's also worth separating ultracode from /goal, the other "keep Claude going" feature. /goal keeps a single session working turn after turn until a completion condition is met — it's about persistence toward an end state. Ultracode is about parallelism: fanning one task out across many agents at once. They solve different problems and even compose — a /goal that keeps re-running until tests pass, with each turn free to spin up a workflow.

To use any of this you need Claude Code v2.1.154+ for dynamic workflows. It's available on paid plans plus the Anthropic API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry; on Pro you enable it from the Dynamic workflows row in /config. (The research preview launched May 28, 2026.)

The patterns it unlocks

Because the plan lives in code, workflows can express orchestration shapes that are awkward to do by hand. The general shape is fan out → reduce → synthesize: parallel agents gather data, deterministic JavaScript dedupes and filters, then final agents synthesize.
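Here is a rough sketch of that fan out, reduce, synthesize shape in plain JavaScript. The gather and synthesize "agents" are local stubs with canned data, and all function names, paths, and finding fields are hypothetical; only the middle step is what real deterministic glue code might resemble.

```javascript
// Canned findings standing in for what model-backed agents would report.
const CANNED = {
  "src/api": [
    { file: "src/api/users.js", rule: "missing-auth", severity: "high" },
    { file: "src/api/ping.js", rule: "no-timeout", severity: "low" },
  ],
  "src/lib": [
    { file: "src/api/users.js", rule: "missing-auth", severity: "high" }, // duplicate report
    { file: "src/lib/db.js", rule: "sql-injection", severity: "high" },
  ],
};

// Fan out: one stub agent per directory.
async function gatherAgent(dir) {
  return CANNED[dir] ?? [];
}

// Reduce: plain deterministic code, no model involved.
// Keeps high-severity findings and dedupes on file + rule.
function reduceFindings(batches) {
  const unique = new Map();
  for (const finding of batches.flat()) {
    if (finding.severity !== "high") continue;
    unique.set(`${finding.file}|${finding.rule}`, finding);
  }
  return [...unique.values()];
}

// Synthesize: a final stub agent turns the reduced set into a report.
async function synthesizeAgent(findings) {
  const details = findings.map((f) => `${f.rule} in ${f.file}`).join("; ");
  return `${findings.length} high-severity findings: ${details}`;
}

async function workflow(dirs) {
  const batches = await Promise.all(dirs.map(gatherAgent));
  return synthesizeAgent(reduceFindings(batches));
}

workflow(["src/api", "src/lib"]).then((report) => console.log(report));
```

Four raw findings go in, the low-severity one and the duplicate drop out in the reduce step, and the final agent only ever sees two.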

The toolkit gives you two core combinators:

  • parallel(thunks) runs tasks concurrently as a barrier: it waits for every task to finish before returning. Failed tasks resolve to null rather than blowing up the whole call.
  • pipeline(items, ...stages) streams items through stages without a barrier between them, so item A can reach stage 3 while item B is still in stage 1. The guidance is to default to pipeline() and treat parallel() barriers as the exception.
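The difference between the two is easier to see in code. These are simplified stand-ins written for this post to mirror the behavior described above, not the runtime's own parallel() and pipeline(); in particular, how the real pipeline() handles a failing stage is not covered here, and this version simply rejects.

```javascript
// Stand-in for parallel(): a barrier that waits for every task.
// A task that throws resolves to null instead of failing the whole call.
async function parallel(thunks) {
  const settled = await Promise.allSettled(
    thunks.map((thunk) => Promise.resolve().then(thunk))
  );
  return settled.map((s) => (s.status === "fulfilled" ? s.value : null));
}

// Stand-in for pipeline(): each item walks through the stages on its own,
// so a fast item never waits for a slow one between stages.
async function pipeline(items, ...stages) {
  return Promise.all(
    items.map(async (item) => {
      let value = item;
      for (const stage of stages) value = await stage(value);
      return value;
    })
  );
}

async function demo() {
  const ok = async () => "done";
  const boom = async () => {
    throw new Error("agent failed");
  };
  console.log(await parallel([ok, boom, ok])); // ['done', null, 'done']

  const addOne = async (n) => n + 1;
  const timesTen = async (n) => n * 10;
  console.log(await pipeline([1, 2, 3], addOne, timesTen)); // [20, 30, 40]
}

demo();
```

With parallel(), nothing downstream can start until the slowest task finishes. With pipeline(), the only wait is at the very end, which is why it makes the better default when items are independent.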

Individual agent(prompt, opts) calls can be tuned too: override the model per call (route cheap stages to a smaller model), run in an isolated git worktree to avoid write conflicts during parallel edits, or demand schema-validated JSON output with automatic retries on validation failure.
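The schema-validated output with retries is the least obvious of those options, so here is a sketch of the general technique. It assumes a toy schema format (field name mapped to a typeof string) and a caller-supplied callAgent function; the real option names and validation rules in agent(prompt, opts) are not shown in this post, so treat every identifier below as hypothetical.

```javascript
// Toy schema check: every listed field must exist with the given typeof.
function matchesSchema(value, schema) {
  if (typeof value !== "object" || value === null) return false;
  return Object.entries(schema).every(([key, type]) => typeof value[key] === type);
}

// Ask the agent, parse its reply as JSON, and retry when the reply is
// not valid JSON or does not match the schema.
async function agentWithSchema(callAgent, prompt, schema, maxRetries = 2) {
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    const raw = await callAgent(prompt);
    try {
      const parsed = JSON.parse(raw);
      if (matchesSchema(parsed, schema)) return parsed;
    } catch (err) {
      // Reply was not JSON at all: fall through and try again.
    }
  }
  throw new Error(`no valid output after ${maxRetries + 1} attempts`);
}

// Stub agent: chatty on the first call, well-formed on the second.
let calls = 0;
async function flakyAgent(prompt) {
  calls += 1;
  return calls === 1
    ? "Sure! Here is the JSON you asked for"
    : '{"file":"src/auth.js","severity":"high","line":42}';
}

const findingSchema = { file: "string", severity: "string", line: "number" };
agentWithSchema(flakyAgent, "report one finding as JSON", findingSchema).then((finding) =>
  console.log(finding)
);
```

The payoff is that the coordinating script can treat agent output as data. Downstream stages read finding.severity directly instead of parsing prose.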

On top of those primitives, Anthropic documents named patterns: classify-and-act, fan-out-and-synthesize, adversarial verification (independent agents try to refute each other's claims), generate-and-filter, tournament (N agents attempt a task different ways, judges pick a winner), and loop-until-done. Claude Code also ships a bundled /deep-research workflow, and any run can be saved as a reusable /<name> command in .claude/workflows/.
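As one example of those named patterns, a tournament can be sketched in a few lines. The contestants and judges below are stubs with canned behavior, the vote-tallying rule (most votes wins, ties go to the earliest attempt) is an assumption made for this sketch, and none of the names come from Anthropic's documentation.

```javascript
// Stub contestants: each "agent" attempts the same task a different way.
const contestants = [
  async () => ({ approach: "regex rewrite", linesChanged: 40, testsPass: false }),
  async () => ({ approach: "AST codemod", linesChanged: 25, testsPass: true }),
  async () => ({ approach: "manual edits", linesChanged: 90, testsPass: true }),
];

function indexOfSmallest(nums) {
  return nums.indexOf(Math.min(...nums));
}

// Stub judges: each returns the index of the attempt it prefers.
const judges = [
  async (attempts) => attempts.findIndex((a) => a.testsPass), // first passing attempt
  async (attempts) => indexOfSmallest(attempts.map((a) => a.linesChanged)), // smallest diff
  async (attempts) => attempts.length - 1, // a judge that always picks the last attempt
];

async function tournament(task, contestantList, judgeList) {
  // Every contestant attempts the task independently.
  const attempts = await Promise.all(contestantList.map((c) => c(task)));
  // Every judge votes on the full set of attempts.
  const votes = await Promise.all(judgeList.map((j) => j(attempts)));
  const tally = new Array(attempts.length).fill(0);
  for (const vote of votes) tally[vote] += 1;
  const winnerIndex = tally.indexOf(Math.max(...tally)); // ties go to the earliest attempt
  return { winner: attempts[winnerIndex], tally };
}

tournament("migrate callbacks to async/await", contestants, judges).then((result) =>
  console.log(result.winner.approach, result.tally)
);
```

Here two of the three judges land on the codemod, so it wins with a tally of 0, 2, 1. In a real workflow the judging is where the extra tokens buy quality: independent attempts only help if something independent picks among them.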

Cautions: cost, latency, and approval

Ultracode is powerful precisely because it doesn't economize, and that's also its main risk: it can burn through tokens fast.

Here's why the bill climbs. A normal turn is one model doing the work. An ultracode workflow spawns many agents — up to 16 running concurrently and as many as 1,000 over a single run — and every one of those agents has its own context window and does real reading, writing, and reasoning, all of it billed. So a single ultracode command can cost many multiples of the same task asked the ordinary way. (The deterministic JavaScript that coordinates the agents is free — it's plain code, not the model — but the agents themselves are where the tokens go, and there can be a lot of them.) For a documented sense of scale, Anthropic notes that agent teams use roughly 7× the tokens of a normal session in plan mode; ultracode has no official published multiplier, but it's firmly in that "expect several times more" territory.

A single request can turn into several workflows in a row... This applies to every task in the session, so each request uses more tokens and takes longer than at lower effort levels.

Runs also count toward your plan's usage and rate limits like any other session — that part is official. Community guides go further and report that a single codebase-wide audit can consume more of your weekly limit than a normal day of work; treat that as a directional warning rather than a measured figure. Either way, the practical rules are:

  • Prefer the per-command ultracode keyword over /effort ultracode so you pay the premium only on the tasks that earn it.
  • Run on a small slice first — one directory or a narrow question — to gauge spend before pointing it at the whole repo.
  • Watch it live. The /workflows view shows per-agent token usage as it runs, and you can stop a run without losing completed work.

Two more things worth knowing:

  • Every agent uses your session model unless the script routes a stage elsewhere, so check /model before a big run and ask Claude to send non-critical stages to a smaller, cheaper model.
  • On permissions: in Default/accept-edits mode you're prompted before every run; in Auto mode the per-run prompt is skipped when ultracode is on; and claude -p/Agent SDK runs start immediately. Subagents inside a workflow always run in acceptEdits mode and inherit your tool allowlist.

When in doubt, don't leave it on. For single-file edits, quick questions, and anything interactive (no mid-run user input is possible during a workflow), ultracode is overkill that adds latency and cost without adding quality.

The bottom line

Ultracode is the setting you reach for when correctness matters more than tokens and the job is too big for one conversation to hold. It marries xhigh reasoning with dynamic workflows, moving the orchestration plan into deterministic JavaScript so Claude can coordinate up to 16 agents at a time, hundreds across a run, while keeping its own context clean for the final answer.

Use the per-command keyword (ultracode) for one-off heavy lifts, flip on /effort ultracode when an entire session is high-stakes, and drop back to /effort high the moment you're back to routine edits. Treat it like a power tool: enormously capable, and not what you want plugged in for everyday work.

Frequently Asked Questions

What is ultracode in Claude Code?

Ultracode is a Claude Code setting, not a model effort level. It sends xhigh reasoning effort to the model and additionally has Claude orchestrate dynamic workflows for substantive tasks. With it on, Claude plans and runs a multi-agent workflow for each substantive request instead of waiting for you to ask. It applies to the current session only and resets when you start a new one.

How do I turn ultracode on?

There are two paths. For a single task, include the keyword "ultracode" in your prompt (or just ask in your own words, like "use a workflow") to run that one turn as a workflow without changing the session's effort level. For the whole session, run /effort ultracode so Claude plans a workflow for every substantive task. Drop back with /effort high for routine work.

Is "ultra" an effort level?

No. The effort levels are low, medium, high, xhigh, and max. There is no level called "ultra." Ultracode is a setting layered on top: it sends xhigh to the model and adds automatic workflow orchestration. It only appears in the /effort menu on models that support xhigh, such as Fable 5, Opus 4.8, and Opus 4.7.

Which version of Claude Code do I need?

Dynamic workflows, the feature ultracode drives, require Claude Code v2.1.154 or later. The literal "ultracode" trigger keyword shipped in v2.1.160; before that the in-prompt keyword was "workflow." Natural-language requests like "use a workflow" work in both versions. The feature launched as a research preview on May 28, 2026, alongside Claude Opus 4.8.

How many tokens does ultracode use?

A workflow spawns many agents, so a single run can use meaningfully more tokens than working through the same task in conversation, and runs count toward your plan's usage and rate limits like any other session. There's no official ultracode-specific multiplier, but for comparison Anthropic documents that agent teams use roughly 7x more tokens in plan mode. To gauge spend, run on a small slice first, one directory instead of the whole repo.

When should I not use ultracode?

For everyday work, such as single-file edits, quick questions, and conversational back-and-forth, ultracode adds latency and cost without adding quality, because every change gets the full treatment whether it needs it or not. The official guidance is to drop back to /effort high for routine work. There's also no mid-run user input during a workflow, so it's a poor fit for interactive tasks.

How do I turn ultracode off?

Toggle Dynamic workflows off in /config, set "disableWorkflows": true in settings.json, or set the environment variable CLAUDE_CODE_DISABLE_WORKFLOWS=1. When workflows are disabled, the ultracode keyword no longer triggers a run and ultracode is removed from the /effort menu. To keep workflows but stop the in-prompt keyword from firing, disable the "Ultracode keyword trigger" toggle in /config.

How many agents can a workflow run?

Each workflow run uses up to 16 concurrent agents, fewer on machines with limited CPU cores, to bound local resource use. There's also a lifetime cap of 1,000 agents total per run, which exists as a backstop to prevent runaway loops.
