You are deep into a refactor, the agent is on a roll, and then it stops: you have hit your usage limit. If you are on a paid Claude subscription, this is the part of Claude Code that trips people up most, because the limits are not one number and they do not reset when you expect. There are two overlapping caps, they share quota with your Claude.ai chats, and Anthropic has stopped publishing the exact figures.
This is a guide to what is actually enforced, when it resets, and the concrete moves that keep you under the line.
The two limits, side by side
On paid subscription plans, Claude Code enforces two overlapping usage limits, and they apply at the same time:
| Limit | Window | What resets it |
|---|---|---|
| 5-hour session | Rolling 5 hours | 5 hours after your first request in the session |
| Weekly | Rolling 7 days | As your 7-day window advances from your first prompt |
The key word for both is rolling. Neither one snaps to a calendar boundary.
- The 5-hour window starts the moment you send your first request in a session, not at midnight and not on the hour. Five hours after that first request, the session allocation resets.
- The weekly limit is a rolling 7-day window measured from your first prompt. It does not reset on Mondays. Because the window keeps rolling forward as you use Claude, the "Resets by ..." time in the UI can shift as you go.
If you take one thing away: the weekly cap is not a Monday-morning reset. It is a sliding 7-day window anchored to when you started. Plan intensive work against the actual reset time shown in /usage, not against the calendar.
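The 5-hour rule described above is simple enough to sketch as arithmetic. The snippet below is a minimal illustration, not Anthropic's implementation: the function names and the example timestamp are hypothetical, and it models only the 5-hour window, taking "five hours after your first request" at face value.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the session window is exactly 5 hours from the first request.
SESSION_WINDOW = timedelta(hours=5)


def session_reset_time(first_request_at: datetime) -> datetime:
    """Return when a window opened at first_request_at would reset."""
    return first_request_at + SESSION_WINDOW


def time_until_reset(first_request_at: datetime, now: datetime) -> timedelta:
    """Time left until the reset, clamped at zero once the window has passed."""
    remaining = session_reset_time(first_request_at) - now
    return max(remaining, timedelta(0))


# Hypothetical example: first request sent at 09:42 UTC.
first = datetime(2026, 5, 20, 9, 42, tzinfo=timezone.utc)
print(session_reset_time(first).isoformat())  # 2026-05-20T14:42:00+00:00
```

The point of the sketch is that the reset depends only on when you started, never on the wall clock: a session opened at 09:42 resets at 14:42, not at noon or midnight.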
Usage is pooled across all Claude products
A point that catches a lot of people: on Pro and Max plans, your quota is shared across all Claude products in the same window. Claude Code, Claude.ai chat, and Claude Desktop all draw from the same pool. A long brainstorming session in the web chat eats into what Claude Code has left for the same window, and vice versa. If you are rationing quota, budget for everything Claude you touch that day, not just the terminal.
How the plans differ
Here is where you have to let go of old numbers. As of 2026, Anthropic has stopped publishing fixed prompts-per-window and hours-per-week figures. The official docs now describe only relative capacity:
- Pro — the baseline.
- Max 5x ($100/mo) — roughly 5x more usage per session than Pro.
- Max 20x ($200/mo) — roughly 20x more than Pro.
If you find a blog post quoting an exact current weekly-hour figure, treat it as stale.
For rough historical context only: the estimates published when weekly limits were first announced (July 28, 2025) were on the order of ~40–80 Sonnet hours/week for Pro, ~140–280 Sonnet plus ~15–35 Opus hours for Max 5x, and ~240–480 Sonnet plus ~24–40 Opus hours for Max 20x. Those were estimates that varied heavily with codebase size, and they are not current published figures. Do not size your plan around them.
The Max plan's two weekly limits
Max plans carry two weekly limits, not one:
- A limit that applies across all models.
- A second limit that, per the official Max help article, applies to Sonnet models only.
(A third-party source framed the second limit as Opus-specific instead, so the exact wording is contested. Trust Anthropic's "Sonnet models only" phrasing.)
This two-limit structure is why people on Max 20x sometimes report being throttled on Opus while their overall weekly usage still looks healthy. Opus draws far more quota per message than Sonnet, and there are reports (Issue #8449) of Opus on Max 20x being consumed unusually fast after Claude Code v2. If you run Opus continuously, you can exhaust the Opus-relevant allocation well before you touch the overall weekly cap.
Some sources also claim Max plans auto-switch from Opus to Sonnet at specific usage thresholds (e.g. 20% for Max 5x, 50% for Max 20x). The official Max help article does not confirm those percentages, so treat them as unverified.
What changed in May 2026
There were real limit increases this year, but be careful which ones are confirmed:
- Confirmed (May 6, 2026): Anthropic announced it doubled the 5-hour rate limits for Pro, Max, Team, and seat-based Enterprise plans, and removed the peak-hours reduction for Pro and Max, so limits no longer shrink during high-demand periods. Opus API rate limits were raised as well.
- Unverified: Several third-party sources report that weekly limits were also raised by ~50% around May 13, 2026, described as temporary with an expiration around July 13, 2026. The official Anthropic announcement mentions neither a weekly increase nor an expiration date. Treat both as unconfirmed.
Checking your usage: /usage vs /cost
There are two slash commands, and they are not interchangeable.
/usage is the one you want on a subscription. Run it in the Claude Code terminal and it shows:
- remaining quota,
- the reset time, and
- your current burn rate,
for both the 5-hour window and the weekly limit. This is the right way to check status before a long session. Settings > Usage on claude.ai shows the same reset info.
/cost shows token usage and running spend for the current session. It is intended for API-key (pay-as-you-go) users. For Pro/Max subscribers it is not a reliable indicator — Anthropic has acknowledged that /cost displays misleading information about Max subscription limits (Issue #1287). Do not gauge how much subscription quota you have left from /cost.
One more nuance: UI reset times in Claude Code are often rounded to tidy clock times and should be treated as approximate. For API users, the response headers carry precise timestamps (e.g. anthropic-ratelimit-requests-reset, an RFC3339 value) if you need the exact moment.
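For API users who want the exact moment rather than the rounded UI time, the header value can be parsed directly. This is a minimal sketch that assumes you already have the response headers in a plain dict with lowercase keys. The function name is hypothetical, and the example timestamp is made up for illustration.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional

RESET_HEADER = "anthropic-ratelimit-requests-reset"


def seconds_until_reset(headers: Mapping[str, str], now: datetime) -> Optional[float]:
    """Return seconds until the reset named in the header, or None if it is absent."""
    raw = headers.get(RESET_HEADER)
    if raw is None:
        return None
    # RFC 3339 allows a trailing "Z" for UTC; normalise it so that
    # datetime.fromisoformat accepts it on older Python versions too.
    reset_at = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    return max((reset_at - now).total_seconds(), 0.0)


# Hypothetical header value, for illustration only.
example_headers = {RESET_HEADER: "2026-05-20T14:42:00Z"}
now = datetime(2026, 5, 20, 14, 40, tzinfo=timezone.utc)
print(seconds_until_reset(example_headers, now))  # 120.0
```

Real HTTP header names are case-insensitive, so if your HTTP client hands you a case-sensitive dict you may need to lowercase the keys first.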
How to actually avoid hitting the limit
The caps reward disciplined usage. In rough order of impact:
1. Pick the right model
Model choice is the biggest single lever. Opus consumes meaningfully more quota than Sonnet. Anthropic's own guidance:
- Sonnet for routine coding,
- Opus for genuinely complex problems,
- Haiku for quick, simple tasks.
Routing simple work off Opus conserves the shared pool more than any other single change. If you live in Opus by default, switching the routine 80% of your work to Sonnet is the highest-leverage thing you can do.
2. Keep context small
Context length directly drives token consumption, and tokens drive quota. Every file read and every diff stays in context and is re-sent on each subsequent message. A bloated session quietly multiplies the cost of each new turn.
- Use /compact to compress the conversation history when a session gets long.
- Use /clear to reset context between unrelated tasks, so the next task does not pay for the last one.
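To see why resetting context matters, here is a rough back-of-the-envelope model. It assumes every turn re-sends the entire accumulated context and that each turn adds a fixed number of tokens. Both are simplifications, the numbers are hypothetical, and the model ignores caching and however Anthropic actually meters quota.

```python
from typing import List


def cumulative_input_tokens(tokens_added_per_turn: List[int]) -> int:
    """Total input tokens sent over a session if each turn re-sends all prior context."""
    total_sent = 0
    context_size = 0
    for added in tokens_added_per_turn:
        context_size += added       # new material joins the context
        total_sent += context_size  # the whole context goes out on this turn
    return total_sent


# Hypothetical workload: 20 turns, each adding 2,000 tokens.
one_long_session = cumulative_input_tokens([2000] * 20)

# Same 20 turns, but with a context reset after turn 10.
two_short_sessions = 2 * cumulative_input_tokens([2000] * 10)

print(one_long_session)    # 420000
print(two_short_sessions)  # 220000
```

Under these assumptions the total grows roughly with the square of the session length, which is why splitting unrelated work into separate, cleared sessions nearly halves the tokens sent in this example.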
3. Use subagents — carefully
Delegating file reads, searches, and analysis to subagents keeps that token load out of your primary agent's active context, which protects the main window. The caveat: subagents and MCP tools also consume tokens, and MCP-tool overhead is commonly underestimated. Subagents are a context-management win, not free compute. Prune MCP servers you are not using.
4. Watch the burn rate, not just the gauge
Before a big push, run /usage and look at the burn rate, not only the remaining percentage. A high burn rate on Opus with a weekly window that is already half consumed is a signal to drop to Sonnet now rather than get cut off mid-task.
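The burn-rate check can be made concrete with a small projection. This sketch assumes you read the remaining quota and burn rate off /usage yourself and express them as a percentage and percentage points per hour. Those units, the function name, and the example figures are assumptions for illustration, not a documented output format.

```python
from datetime import timedelta


def will_run_out_before_reset(
    remaining_pct: float,
    burn_pct_per_hour: float,
    time_to_reset: timedelta,
) -> bool:
    """Project whether quota hits zero before the window resets at the current pace."""
    if burn_pct_per_hour <= 0:
        return False  # not consuming anything, so nothing to exhaust
    hours_left = remaining_pct / burn_pct_per_hour
    return timedelta(hours=hours_left) < time_to_reset


# Hypothetical reading: 50% of the weekly window left, burning 2 points per hour,
# with the reset three days away.
print(will_run_out_before_reset(50.0, 2.0, timedelta(days=3)))  # True
```

A True result at your current pace is the cue to switch the routine work to Sonnet before the cutoff rather than after it.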
When you run out
If you hit a limit, new requests are blocked until the relevant window resets — the 5-hour window 5 hours after your first request, the weekly cap as your rolling 7-day window advances. Files already written to disk are not lost. Check /usage for the exact reset time before you start anything heavy again.
If you regularly slam into the caps, the structural fix is an API key instead of a subscription. Console / pay-as-you-go usage has no subscription cap: you are billed per token and monitor spend via /cost and the Console dashboard. The trade is predictability: a subscription caps your cost but also caps your usage; the API caps neither.
A third option, if you have spare hardware, is to take some inference off the cloud entirely. Because Claude Code can point its base URL and model provider at any OpenAI-compatible endpoint, a local-first AI gateway like Wide Area AI can serve requests from your own machines first and fall back to a cloud provider only when local capacity runs out — routine work that never touches the cloud doesn't draw down a quota at all.
Bottom line
- Two limits run at once: a rolling 5-hour session window and a rolling 7-day weekly cap. Neither resets on the calendar, and the weekly one does not reset on Mondays.
- Quota is pooled across Claude Code, web chat, and Desktop on Pro/Max.
- Anthropic no longer publishes exact figures, so think in relative terms (Max 5x ≈ 5x Pro, Max 20x ≈ 20x Pro) and check /usage, not old blog numbers.
- May 2026 confirmed a doubled 5-hour limit and removal of peak-hours throttling; the rumored weekly bump and its expiration are unverified.
- To stay under the line: Sonnet by default, /compact and /clear to control context, subagents for heavy reads (mind the MCP overhead), and /usage, not /cost, to know where you stand.
- For genuinely heavy, continuous work, an API key removes the caps entirely, at the cost of an uncapped bill.