
How to Fix OpenAI Codex CLI Slow Performance

Diagnose and resolve slow performance issues in OpenAI Codex CLI. Optimize response times, reduce latency, handle large context windows efficiently, and configure settings for faster interactions.

7 min read · Updated January 2025


OpenAI Codex CLI is powerful, but performance issues can disrupt your workflow. Slow responses, timeouts, and laggy interactions often have identifiable causes and straightforward solutions. This guide covers how to diagnose and fix common performance problems.

Common Causes of Slow Performance

Understanding why Codex CLI runs slowly helps you target the right fix.

Context Size Overhead

Codex CLI reads your codebase to provide context-aware responses. Large repositories with many files significantly increase the context window size, which directly impacts:

  • Initial response time: More context means more tokens to process
  • Memory usage: Large contexts require more local and server-side resources
  • Token costs: Larger contexts consume more of your quota faster
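
A quick way to gauge that overhead is to see how much material Codex could pull in. Assuming a git repository, the following gives a rough ballpark; the exact set of files Codex reads also depends on its ignore rules:

# Number of tracked files; more files generally means a larger context window
git ls-files | wc -l

# Size of the whole working tree; treat as an upper bound since it includes untracked files and build output
du -sh .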

Model Selection

Different models have different speed characteristics:

Model       | Speed    | Best For
gpt-4o-mini | Fastest  | Simple tasks, quick questions
gpt-4o      | Moderate | Balanced speed and capability
o1-preview  | Slowest  | Complex reasoning, architecture decisions

Network Latency

Your connection to OpenAI's API affects every interaction. High latency manifests as:

  • Delays before responses start streaming
  • Intermittent pauses during output
  • Frequent timeouts on longer operations

Approval Mode Overhead

The suggest approval mode requires confirmation for each file change, adding round-trip time for every operation. While safer, it slows down multi-step tasks significantly.

Measuring and Diagnosing Latency

Before optimizing, measure your baseline performance.

Check OpenAI API Status

First, verify the problem is not on OpenAI's end:

# Check OpenAI status page
curl -s https://status.openai.com/api/v2/status.json | jq '.status.description'

Or visit status.openai.com directly.

Measure Response Time

Time a simple request to establish your baseline:

time codex "what is 2+2"

Compare this to a context-heavy request:

time codex "explain the architecture of this codebase"

The difference reveals how much context loading affects your performance.
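
A single run can be noisy, so a short loop gives a steadier baseline. This is plain shell timing, not a Codex feature:

# Repeat the same lightweight prompt a few times and compare wall-clock times
for i in 1 2 3; do
  time codex "what is 2+2"
done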

Check Your Network

Test your connection to OpenAI's API:

ping api.openai.com
curl -o /dev/null -s -w "Connect: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" https://api.openai.com/v1/models

Look for connect times under 100ms and TTFB (Time To First Byte) under 500ms for optimal performance.
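
TTFB also varies from request to request, so a few samples are more telling than one. This sketch assumes OPENAI_API_KEY is exported; without it the endpoint rejects the request quickly, which makes the numbers look better than they are:

# Take five TTFB samples against the API
for i in 1 2 3 4 5; do
  curl -o /dev/null -s -w "TTFB: %{time_starttransfer}s\n" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    https://api.openai.com/v1/models
done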

Configuration Optimizations

Select the Right Model

Switch to faster models for routine tasks:

# Use gpt-4o-mini for quick operations
codex --model gpt-4o-mini "format this JSON file"

# Reserve o1-preview for complex reasoning
codex --model o1-preview "design a caching strategy for this API"

Configure your default model in ~/.codex/config.toml:

[model]
default = "gpt-4o-mini"  # Faster default for routine tasks
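
If you switch models often, shell aliases keep the flag out of the way. The alias names below are arbitrary; the --model flag is the same one shown above:

# Fast model for routine work
alias cxq='codex --model gpt-4o-mini'

# Slower reasoning model for harder problems
alias cxd='codex --model o1-preview'

Add them to ~/.bashrc or ~/.zshrc so they persist across shells.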

Adjust Timeout Settings

Increase timeouts for large operations:

[network]
timeout = 120  # seconds, default is often 60

Configure Context Limits

Limit how much context Codex reads:

[context]
max_files = 50        # Limit files read
max_file_size = 100   # KB, skip files larger than this
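
To preview what a size cap like this would skip, a plain find is enough; nothing here is Codex-specific, and the 100k threshold simply mirrors the setting above:

# List files over 100 KB that max_file_size = 100 would exclude
find . -type f -size +100k -not -path "./node_modules/*" | head -20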

Using .codexignore to Reduce Context

Create a .codexignore file in your project root to exclude files from context:

# Dependencies (always exclude)
node_modules/
vendor/
.venv/
__pycache__/

# Build artifacts
dist/
build/
*.min.js
*.map

# Large data files
*.csv
# Careful: *.json also matches package.json and tsconfig.json; narrow this if Codex needs them
*.json
data/
fixtures/

# Generated code
*.generated.*
coverage/

# Media files
*.png
*.jpg
*.mp4

Strategic Exclusions

Focus exclusions on:

  1. Dependencies: node_modules, vendor, virtual environments
  2. Build output: Compiled files, bundles, sourcemaps
  3. Test fixtures: Large test data files
  4. Generated code: Auto-generated files that change frequently
  5. Binary files: Images, videos, compiled binaries
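
If you are unsure where the bulk of your repository lives, listing the largest top-level directories points at the best exclusion candidates. This uses standard GNU/BSD tools and nothing Codex-specific:

# The biggest directories are usually the best .codexignore candidates
du -sh ./*/ 2>/dev/null | sort -rh | head -10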

After creating .codexignore, restart Codex to apply changes.

Choosing Approval Modes for Speed

Codex CLI offers three approval modes with different performance characteristics:

Suggest Mode (Default)

codex --approval suggest "refactor this function"
  • Requires approval for each change
  • Safest but slowest for multi-file operations
  • Best for learning or unfamiliar codebases

Auto-Edit Mode

codex --approval auto-edit "add error handling to all API calls"
  • Automatically applies file changes
  • Still requires approval for shell commands
  • Good balance of speed and safety

Full-Auto Mode

codex --approval full-auto "run tests and fix any failures"
  • Automatically applies all changes and runs commands
  • Fastest for trusted operations
  • Use only in sandboxed environments or with version control

For repetitive trusted tasks, auto-edit or full-auto can reduce completion time by 50-70%.
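
One way to get full-auto speed without giving up review, assuming the project is under git, is to run it on a throwaway branch and diff afterwards. The branch name here is arbitrary:

# Work on a disposable branch so every automatic change stays reviewable
git checkout -b codex-autofix
codex --approval full-auto "run tests and fix any failures"

# Inspect everything before merging back
git diff main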

Network and API Latency Troubleshooting

Use a Faster DNS

Switch to a faster DNS resolver:

# Test Cloudflare DNS
dig @1.1.1.1 api.openai.com

# Test Google DNS
dig @8.8.8.8 api.openai.com
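
The number to compare is each resolver's reported query time; grepping it out makes the comparison direct:

# Pull out just the latency figure from each resolver
dig @1.1.1.1 api.openai.com | grep "Query time"
dig @8.8.8.8 api.openai.com | grep "Query time"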

Configure your system to use the faster option.

Check for VPN Interference

VPNs can add significant latency. Test with VPN disabled:

# Disconnect VPN, then test
time codex "hello"

# Reconnect VPN, test again
time codex "hello"

If VPN adds significant latency, consider split tunneling to route OpenAI traffic directly.

Off-Peak Usage

OpenAI's API experiences higher latency during peak hours (typically 9 AM - 6 PM Pacific Time, weekdays). For large operations, consider scheduling during off-peak hours.

When to Clear Context and Start Fresh

Start a new session when:

  • Switching projects: Old context pollutes new project responses
  • After major refactoring: Stale context causes confusion
  • When responses seem confused: Accumulated context may be contradictory
  • Performance degrades over time: Long sessions accumulate history

Clear your session:

# Exit and restart Codex
exit
codex

# Or use the clear command within a session
/clear

Model Speed Comparison

Use this reference when choosing models:

Model       | Avg Response Time | Token Limit | Best Use Cases
gpt-4o-mini | 1-3 seconds       | 128K        | Quick edits, formatting, simple generation
gpt-4o      | 3-8 seconds       | 128K        | Code review, moderate complexity tasks
o1-preview  | 10-30 seconds     | 128K        | Architecture, complex debugging, planning
o1-mini     | 5-15 seconds      | 128K        | Reasoning tasks with speed priority

For most coding tasks, gpt-4o-mini provides the best balance of speed and capability. Reserve slower models for tasks that genuinely require deeper reasoning.
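
A small wrapper function can make that fast default explicit while leaving an escape hatch for heavier models. The cx name and the CODEX_MODEL variable are illustrative conventions, not part of Codex CLI:

# cx: call codex with a fast default model unless CODEX_MODEL overrides it
cx() {
  codex --model "${CODEX_MODEL:-gpt-4o-mini}" "$@"
}

# Routine task on the fast default
cx "format this JSON file"

# Escalate only when the task genuinely needs deeper reasoning
CODEX_MODEL=o1-preview cx "design a caching strategy for this API"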

Frequently Asked Questions

Why does Codex CLI feel slower than using ChatGPT directly?

Codex CLI adds overhead for code analysis, file reading, and command execution context. Large codebases increase context size, which affects response time. The CLI also streams responses, which can feel slower than receiving a complete answer all at once.
