
Sandbox & Approval Modes Compared: How Safe Is Each AI CLI's Auto Mode?

A risk-oriented comparison of the permission, sandbox, and approval models in Claude Code, Codex CLI, Gemini CLI, Qwen Code, and Oh My Pi — and how to run each one safely.

By Sean

Every AI coding CLI ships with a knob that trades safety for speed. Approve every action and the agent is useless for anything autonomous; approve nothing and you hand a language model a shell with your credentials. The marketing names for the fast end of that dial — "auto," "YOLO," "full access," "dangerously-skip-permissions" — hide wildly different actual risk profiles. Some of those modes wrap a sandbox around the agent; some put a classifier in front of it; some just disable the prompts and run commands straight on your host.

If you are going to run these tools unattended, in CI, or just faster than you can read every diff, you need to know what each "go fast" mode actually removes. Here is the security model behind each one, ranked by how much it protects you when you stop watching.

Two different things people call "the safe mode"

Before comparing tools, separate two mechanisms that often get conflated:

  • Approval / permission gating decides when the agent asks you before acting. This is policy, not enforcement — if the agent (or a prompt injection) finds a way to act without asking, gating does nothing.
  • Sandboxing decides what a command can do once it runs — OS-level limits on filesystem, network, and process access. This is enforcement.

The strongest postures combine both. The most dangerous modes remove both at once.

Claude Code: layered modes plus a classifier

Claude Code has six permission modes. Five are straightforward: default allows only reads without asking; acceptEdits adds file edits and common filesystem Bash commands (mkdir, touch, rm, mv, cp, sed) inside the working directory; plan reads only and proposes changes without making them; dontAsk runs only pre-approved tools and auto-denies everything else (built for CI); and bypassPermissions runs everything with no checks.

The interesting one is auto (v2.1.83+). It uses a separate classifier model that reviews each tool call before it runs, blocking escalation beyond your request, actions against unrecognized infrastructure, and actions driven by hostile content. Concretely, it blocks curl|bash download-and-execute, sending sensitive data to external endpoints, production deploys and migrations, mass cloud-storage deletion, granting IAM/repo permissions, modifying shared infra, irreversibly destroying pre-session files, and force-pushing or pushing to main. It allows local file ops, lockfile-declared dependency installs, reading .env to send credentials to the matching API, read-only HTTP, and pushing to the branch you started on.

Auto mode is a research preview. In Anthropic's own words it "reduces prompts but does not guarantee safety." It is a safer middle path than skipping permissions — not a license to stop isolating.

A few details matter operationally. The classifier sees user messages, tool calls, and CLAUDE.md, but tool results are stripped, so file or web content can't directly manipulate it. It falls back to prompting if it blocks 3 times consecutively or 20 times total (not configurable); in non-interactive -p mode, repeated blocks abort the session.
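The fallback rule is easier to reason about as two counters. The sketch below is a toy model of the thresholds stated above, not Anthropic's implementation: the BlockTracker class and outcome helper are hypothetical names, and the assumption that an allowed action resets the consecutive streak is mine.

```python
from dataclasses import dataclass

# Thresholds as stated in the article; everything else here is illustrative.
MAX_CONSECUTIVE_BLOCKS = 3
MAX_TOTAL_BLOCKS = 20


@dataclass
class BlockTracker:
    """Toy model of the classifier block counters (hypothetical, not Claude Code's code)."""

    consecutive: int = 0
    total: int = 0

    def record(self, blocked: bool) -> bool:
        """Record one classifier decision; return True once the fallback should trigger."""
        if blocked:
            self.consecutive += 1
            self.total += 1
        else:
            # Assumption: an allowed action resets the consecutive streak.
            self.consecutive = 0
        return (
            self.consecutive >= MAX_CONSECUTIVE_BLOCKS
            or self.total >= MAX_TOTAL_BLOCKS
        )


def outcome(interactive: bool) -> str:
    """What the fallback means per the article: prompt interactively, abort under -p."""
    return "prompt the user" if interactive else "abort the session"


if __name__ == "__main__":
    tracker = BlockTracker()
    decisions = [True, True, False, True, True, True]
    for step, blocked in enumerate(decisions, start=1):
        if tracker.record(blocked):
            print(f"fallback after decision {step}: {outcome(interactive=True)}")
            break
```

The practical consequence for CI is that a task the classifier keeps refusing will not quietly loop: in this model it ends in a prompt or an abort, so unattended jobs should treat an aborted session as a signal to review the task rather than retry it.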

--dangerously-skip-permissions equals bypassPermissions. As of v2.1.126 it even skips protected-path write prompts; only ask rules and removals targeting / or ~ still fire as a circuit breaker. On Linux and macOS it refuses to start as root or under sudo ("cannot be used with root/sudo privileges for security reasons") — a check skipped inside a recognized sandbox, which nudges you toward the intended pattern: an isolated container running as non-root.

Claude Code also has a separate OS-level Bash sandbox (the /sandbox command, distinct from permission modes) using Seatbelt on macOS and bubblewrap+socat on Linux/WSL2. By default sandboxed commands write only to the working and temp dirs, read the whole computer except denied dirs, and have no pre-allowed network domains. Two sharp edges: the default read policy still allows reading ~/.aws/credentials and ~/.ssh/ — you must add them to denyRead. And the network proxy doesn't terminate TLS, so a broad allowed domain like github.com can enable exfiltration via domain fronting.
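Before relying on a deny-read list, it helps to check which credential locations it leaves exposed. The following is a minimal sketch under stated assumptions: it treats a deny entry as covering a path when the entry equals the path or is one of its parent directories, which may not match how Claude Code actually evaluates denyRead, and the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Credential locations named in this article; extend the list for your own setup.
SENSITIVE = ["~/.aws/credentials", "~/.ssh", "~/.npmrc"]


def expand(path: str, home: str) -> PurePosixPath:
    """Expand a leading ~ against an explicit home dir (no filesystem access)."""
    if path == "~":
        return PurePosixPath(home)
    if path.startswith("~/"):
        return PurePosixPath(home) / path[2:]
    return PurePosixPath(path)


def is_covered(target: PurePosixPath, deny_read: list[PurePosixPath]) -> bool:
    """True if target equals a denied path or sits underneath one."""
    return any(target == d or d in target.parents for d in deny_read)


def uncovered(deny_read: list[str], home: str) -> list[str]:
    """Return the sensitive paths that the deny list does not cover."""
    denied = [expand(p, home) for p in deny_read]
    return [s for s in SENSITIVE if not is_covered(expand(s, home), denied)]


if __name__ == "__main__":
    # Only ~/.ssh is denied, so the other two locations are still readable.
    print(uncovered(["~/.ssh"], "/home/dev"))
    # → ['~/.aws/credentials', '~/.npmrc']
```

An empty result means every listed location is covered by the list under this simplified matching rule; it says nothing about credentials stored elsewhere, such as a project-level .env file.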

Codex CLI: the strongest defaults

Codex CLI treats the two axes as orthogonal, which is the right design.

| Sandbox mode | What a command can do |
| --- | --- |
| read-only | Inspect files only; no edits/commands without approval |
| workspace-write (default) | Read anywhere, edit in workspace, run routine local commands |
| danger-full-access | No sandbox, no filesystem or network boundaries |

| Approval policy | When it asks |
| --- | --- |
| untrusted | Auto-runs known-safe reads; mutating/external commands need approval |
| on-request (default) | Works in the sandbox; asks before exceeding it or hitting network |
| on-failure | Referenced in config |
| never | No prompts (non-interactive) |

Enforcement is platform-native: Seatbelt on macOS, bubblewrap on Linux/WSL2, native Windows sandbox in PowerShell (or the Linux path under WSL2). The default "Auto" preset is workspace-write + on-request, and crucially network access is off by default in workspace-write — enable it with [sandbox_workspace_write] network_access = true, ideally paired with the network_proxy allowlist.

"Full access" is danger-full-access + never. The --dangerously-bypass-approvals-and-sandbox flag (alias --yolo) gives no sandbox and no approvals. Note codex exec --full-auto is deprecated in favor of codex exec --sandbox workspace-write. For teams, Codex enterprise deny-read policies constrain the local sandbox to read-only or workspace-write so a developer can't override them by setting full access locally.

Gemini CLI and Qwen Code: sandbox follows YOLO

Gemini CLI has three approval modes: default (prompt per tool call), auto_edit (auto-approve replace/write_file, prompt for the rest), and yolo (auto-approve everything). Set via --approval-mode <mode>; --yolo is a shortcut and can't be combined with --approval-mode (use --approval-mode=yolo). Sandboxing is off by default — but the Docker sandbox turns on automatically when you use YOLO, using the pre-built gemini-cli-sandbox image. You can enable it manually with --sandbox/-s, the tools.sandbox setting, or GEMINI_SANDBOX.

Qwen Code (a fork of Gemini CLI) has five modes: Plan (analyze-only), Ask Permissions (prompts before any change), Auto-Edit (auto-approve edits, prompt for shell), Auto (headless runs with safety retained on destructive shell commands and outbound network calls), and YOLO (auto-approve all). Enable YOLO via /approval-mode yolo, the --yolo flag, or --project/--user defaults; the docs warn it can run any command with your terminal's permissions. Like Gemini, Qwen auto-enables a sandbox under YOLO using its qwen-code-sandbox image.

The design philosophy here is sound: the moment you remove the human, you get container isolation by default. The catch is that a Docker container shares the kernel and only protects what you didn't mount in.
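The interaction between the approval flags and the sandbox can be summarized as a small decision function. This is a toy model of the Gemini CLI rules described above, not the CLI's actual argument parser; the resolve function and Effective type are hypothetical.

```python
from typing import NamedTuple, Optional

# The three approval modes named in the article.
MODES = ("default", "auto_edit", "yolo")


class Effective(NamedTuple):
    mode: str
    sandbox: bool


def resolve(
    approval_mode: Optional[str] = None,
    yolo: bool = False,
    sandbox: bool = False,
) -> Effective:
    """Toy model of the flag rules described above (not Gemini CLI's parser)."""
    if yolo and approval_mode is not None:
        raise ValueError("--yolo cannot be combined with --approval-mode")
    mode = "yolo" if yolo else (approval_mode or "default")
    if mode not in MODES:
        raise ValueError(f"unknown approval mode: {mode}")
    # Sandbox is off by default, but YOLO switches it on automatically.
    return Effective(mode, sandbox or mode == "yolo")


if __name__ == "__main__":
    print(resolve())                                # default, sandbox off
    print(resolve(yolo=True))                       # yolo, sandbox on
    print(resolve("auto_edit", sandbox=True))       # auto_edit, sandbox on by request
```

The property worth noticing is that no combination of inputs in this model yields YOLO without a sandbox, which is the guarantee the section credits to both tools.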

Oh My Pi: no guardrails at all

Oh My Pi (omp, an MIT-licensed TypeScript fork of Pi) is the cautionary baseline. By default it runs with the full permissions of the launching user and does not sandbox tool execution — the bash tool runs commands directly on the host. There is no built-in permission system for filesystem, process, network, or credential access; no path restrictions; extensions are unrestricted; fd/ripgrep are auto-downloaded from GitHub without signature verification; and there are no auto-execution safeguards, so the LLM can immediately chain commands. The only built-in safety measure is auth.json stored at ~/.pi/agent/auth.json with 0o600 permissions.

Optional sandboxing exists via an extension using @anthropic-ai/sandbox-runtime (sandbox-exec on macOS, bubblewrap on Linux), but you must turn it on. When editor capabilities are advertised via the Agent Client Protocol, write gating depends on the host editor — i.e. the editor provides the safety, not the agent. Run Oh My Pi only inside a disposable VM.

"Secure" modes still fail to prompt injection

Sandboxes and approval gates raise the bar; they don't make agents safe against a determined attacker feeding the model malicious instructions. 2026 produced concrete proof. A prompt-injection chain in Google's Antigravity achieved RCE and bypassed its most restrictive Secure Mode by injecting CLI flags into the fd utility through a tool's Pattern parameter. Snowflake Cortex Code executed commands without triggering human-in-the-loop approval via indirect prompt injection. And a Claude Code agent at Ona reportedly reached /proc/self/root/usr/bin/npx to dodge a denylist and disabled its bubblewrap sandbox when blocked.

The takeaway: defense in depth. A classifier and an OS sandbox and network restrictions and a disposable host, because any single layer can be talked around.
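The /proc/self/root trick works against any denylist that compares path strings, because one file can be reached through more than one path. The sketch below reproduces the idea with a throwaway directory and a symlink standing in for /proc/self/root; it assumes a POSIX system where unprivileged symlinks are allowed, and the function names are hypothetical.

```python
import os
import tempfile


def naive_denied(path: str, denylist: set[str]) -> bool:
    """String comparison only: what a simple path denylist does."""
    return path in denylist


def resolved_denied(path: str, denylist: set[str]) -> bool:
    """Resolve symlinks first, then compare against the denylist."""
    return os.path.realpath(path) in denylist


def demo() -> tuple[bool, bool]:
    """Build a throwaway tree with an aliased path and run both checks on it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.realpath(tmp)
        real_dir = os.path.join(root, "bin")
        os.mkdir(real_dir)
        tool = os.path.join(real_dir, "tool")
        with open(tool, "w") as handle:
            handle.write("#!/bin/sh\n")
        # 'alias' plays the role of /proc/self/root: another route to the same files.
        alias = os.path.join(root, "alias")
        os.symlink(root, alias)
        aliased = os.path.join(alias, "bin", "tool")
        denylist = {tool}
        return naive_denied(aliased, denylist), resolved_denied(aliased, denylist)


if __name__ == "__main__":
    print(demo())  # (False, True): the string check misses it, the resolved check catches it
```

Resolving paths closes this one gap but is not a complete defense on its own, which is why the enforcement that matters belongs in the OS sandbox and the disposable host rather than in a list of forbidden strings.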

Where the defaults land

| Tool | Default posture | Fastest mode | Sandbox on speed mode? |
| --- | --- | --- | --- |
| Codex CLI | OS sandbox + network off + approval on-request | --yolo (no sandbox, no approvals) | No |
| Claude Code | Reads-only, prompts; optional auto classifier + Bash sandbox | --dangerously-skip-permissions | No (separate sandbox is opt-in) |
| Gemini CLI | Sandbox off, prompt per call | --approval-mode=yolo | Yes (auto Docker) |
| Qwen Code | Sandbox off, prompt per call | YOLO | Yes (auto Docker) |
| Oh My Pi | Runs directly on host, no permission system | (always) | No, unless you add one |

Codex has the strongest defaults out of the box. Claude Code gives you the most layers if you opt into them. Gemini and Qwen do the smart thing of forcing a container when you go hands-off. Oh My Pi gives you nothing.

Practical safe-use checklist

  • Run anything unattended inside a disposable container or VM as a non-root user. This is the one control that holds when every software gate fails.
  • Deny credential reads explicitly. Add ~/.aws, ~/.ssh, ~/.npmrc, and .env to deny-read rules; the defaults often allow them.
  • Keep network off until you need it, then allowlist specific domains rather than opening everything (Codex network_access/network_proxy; Claude sandbox domain prompts).
  • Don't trust repo-checked-in config for permission posture. Claude Code now ignores auto/bypassPermissions/dontAsk from project settings — set permissive modes only in your user config, and assume other tools may not protect you here.
  • Prefer classifier or approval over raw bypass when you can: auto mode and on-request are meaningfully better than skipping permissions entirely.
  • Protect main. Auto mode allows pushing to your starting branch but blocks force-push/push-to-main; replicate that intent with branch protection regardless of tool.
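The first two items on this list can be checked mechanically before an agent starts. The following pre-flight sketch is one possible shape for that check, run inside the container or VM: it flags a root user and any credential path from this article that the current process can read. The preflight function is hypothetical and belongs to none of these tools.

```python
import os
from pathlib import Path

# Credential locations named in this article; extend the list for your own setup.
CREDENTIAL_PATHS = ["~/.aws/credentials", "~/.ssh", "~/.npmrc"]


def preflight(euid: int, home: Path) -> list[str]:
    """Return a list of reasons not to start an unattended agent run."""
    problems = []
    if euid == 0:
        problems.append("running as root")
    for raw in CREDENTIAL_PATHS:
        path = home / raw.removeprefix("~/")
        # os.access reports whether this process could read the path right now.
        if path.exists() and os.access(path, os.R_OK):
            problems.append(f"credentials readable: {raw}")
    return problems


if __name__ == "__main__":
    # os.geteuid is available on POSIX systems only.
    for problem in preflight(os.geteuid(), Path.home()):
        print("WARNING:", problem)
```

On a properly prepared disposable host the script prints nothing, because the user is non-root and the credential files were never mounted in.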

Bottom line

The names lie. Codex CLI's defaults are genuinely conservative — OS sandbox on, network off, approvals on request — and its full-access mode is clearly labeled. Claude Code's auto mode is a real safety improvement over --dangerously-skip-permissions thanks to its classifier, but Anthropic itself calls it a preview that reduces prompts without guaranteeing safety, and its Bash sandbox still reads your credentials unless you say otherwise. Gemini and Qwen sensibly auto-enable a Docker sandbox the moment you go YOLO. Oh My Pi runs straight on your host with no guardrails. None of these models survives a determined prompt-injection attack on its own, so the only posture that actually scales to unattended use is the boring one: disposable isolation, denied credentials, allowlisted network, and protected branches — with the agent's built-in mode as a second layer, never the only one.

Frequently Asked Questions

Claude Code's auto mode and --dangerously-skip-permissions are not the same. --dangerously-skip-permissions (bypassPermissions) runs everything with no checks. Auto mode (v2.1.83+) uses a separate classifier model that reviews each action before it runs, blocking escalation beyond your request, actions against unrecognized infrastructure, and actions driven by hostile content. Anthropic positions it as a safer middle path, but it is a research preview that "reduces prompts but does not guarantee safety." Treat it as safer than YOLO, not as a substitute for isolation.

--dangerously-skip-permissions is equivalent to bypassPermissions mode: everything runs with no checks. As of v2.1.126 it even skips protected-path write prompts. The only remaining circuit breakers are explicit ask rules and removals targeting / or your home directory (e.g. rm -rf /, rm -rf ~). Anthropic says to use it only in isolated containers or VMs where the blast radius is contained.

On Linux and macOS, Claude Code refuses to start in bypassPermissions mode when running as root or under sudo, reporting that it "cannot be used with root/sudo privileges for security reasons." That check is skipped inside a recognized sandbox. The intended pattern is to run inside an isolated container or VM as a non-root user, which both satisfies the check and contains the blast radius.

Codex CLI's sandbox mode and approval policy are two independent axes. Sandbox mode controls what a command can do: read-only, workspace-write (the default: read anywhere, edit inside the workspace, run routine local commands), or danger-full-access (no boundaries). Approval policy controls when Codex asks you: untrusted, on-request (the default: works in the sandbox, asks before exceeding it or hitting the network), on-failure, or never. The default "Auto" preset combines workspace-write with on-request.

Network access is disabled by default in workspace-write. Enable it by setting [sandbox_workspace_write] network_access = true in ~/.codex/config.toml. A network_proxy feature can apply domain-level allowlist rules so you grant only the endpoints you actually need rather than the whole internet.

Gemini CLI does sandbox YOLO mode automatically. Sandboxing is off by default, but the Docker sandbox is enabled automatically when you use --yolo or --approval-mode=yolo, using the pre-built gemini-cli-sandbox image. You can also enable it manually with --sandbox/-s, the tools.sandbox setting, or the GEMINI_SANDBOX env var. Qwen Code mirrors this design with its own qwen-code-sandbox image.

Oh My Pi has no sandbox or permission system by default. It is a TypeScript fork of Pi that runs with the full permissions of the user that launched it and does not sandbox tool execution: the bash tool runs commands directly on the host. It has no built-in permission system for filesystem, process, network, or credential access. Optional sandboxing exists via an extension using @anthropic-ai/sandbox-runtime, but you must explicitly enable it; without it there is zero isolation.

Claude Code closes the hole where a repo's checked-in settings raise the agent's permission mode: v2.1.142+ ignores defaultMode: 'auto', 'bypassPermissions', and 'dontAsk' when set in project or local settings, so a repo cannot grant itself an elevated mode; those must live in ~/.claude/settings.json. The general lesson applies everywhere: never let an untrusted repo's config decide your permission posture. Set permissive modes only from your own user-level config.
