If you opened your terminal one morning in mid-April and got hit with Qwen OAuth free tier was discontinued on 2026-04-15, you weren't alone. That error broke a lot of workflows overnight, and the headlines didn't help — "Free Qwen Is Dead" ran across Decrypt, Yahoo Tech, and others around mid-April. The framing made it sound like Alibaba had nuked the whole project.
It didn't. The reality is narrower, and once you understand the distinction, you can be back to coding with Qwen3-Coder — for free — in about ten minutes. Let's clear up the confusion and walk through the three free paths that still work.
What actually changed (and what didn't)
Here's the single most important thing to get straight: "Qwen Code" the CLI is still free and open source. What ended was the free hosted OAuth login that gave you free cloud inference through Alibaba's servers. Those are two different things, and conflating them is the root of nearly every "Qwen is dead" take.
The shutdown happened in two stages. First, Alibaba cut the free OAuth quota from 1,000 requests/day down to 100 requests/day. Then, on April 15, 2026, they closed the free OAuth tier entirely. The official README now states plainly that the "Qwen OAuth free tier has been discontinued on April 15, 2026," and the docs surface the same message as the auth error you're probably seeing. The stated reason, per the policy adjustment issue on GitHub, was "a product strategy adjustment to better manage the free tier usage and costs."
The CLI didn't go away. The free inference did. You keep the tool; you now bring your own model.
So your install of qwen is fine. Your config and history are fine. You just need to re-authenticate against a source of inference that isn't the dead OAuth endpoint.
Re-authenticating: the one command you need
To switch auth methods, run the CLI and trigger the auth flow:
qwen
/auth
From there, the official docs and README point you to three replacement paths:
- API Key via Alibaba Cloud Model Studio / DashScope
- The paid Coding Plan (Alibaba's hosted subscription)
- Local Inference via Ollama / vLLM — explicitly described as the "recommended free alternative"
The README and discontinuation notices also name OpenRouter and Fireworks AI as bring-your-own-key (BYOK) providers. That gives us our menu. Let's price out the paid option, then cover the three genuinely free routes.
The paid option: Qwen Coding Plan
If you want a turnkey hosted experience and don't mind paying, the Qwen Coding Plan (a.k.a. the Alibaba Cloud Coding Plan) is the official answer. The headline/Pro tier is $50/month.
According to a detailed secondary source, that tier includes:
| Limit | Allowance |
|---|---|
| Per month | ~90,000 requests |
| Per rolling 5-hour window | 6,000 requests |
It supports a wide model lineup — qwen3-coder-next, qwen3-coder-plus, qwen3-max, kimi-k2.5, glm-5, and MiniMax-M2.5 — through the endpoint https://coding.dashscope.aliyuncs.com/v1.
A word of caution on pricing you'll find elsewhere: some third-party marketplaces advertise much cheaper tiers (around $10/month "Lite," $7 "Standard," $22 "Business"). Those don't match the official $50 figure and appear to come from plan-reseller sites that may be conflating different products. Treat anything other than the $50 plan as unverified. Similarly, a claim floating around that the free developer API tier was replaced with a one-time 70M-token trial doesn't appear in the Qwen Code README — so I'd treat that as unconfirmed too.
If $50/month isn't where you want to be, here are the three free alternatives.
Free alternative #1: OpenRouter (~1,000 requests/day)
OpenRouter hosts Qwen3-Coder as a genuinely free model:
- Model ID:
qwen/qwen3-coder:free - Architecture: 480B-A35B MoE
- Context: 1M tokens
- Price: $0
Because it's an OpenAI-compatible endpoint, it drops straight into Qwen Code as a BYO key. The catch is the rate limits, which have some nuance worth understanding:
- Free models are capped at 20 requests/minute, regardless of your tier.
- You get roughly 50 requests/day by default.
- You get 1,000 requests/day once you've purchased $10 or more in credits at any point — a one-time threshold that never expires.
So the widely-cited "~1,000/day free Qwen3-Coder" access isn't free out of the box; it requires a single $10 credit purchase to unlock, after which the higher daily ceiling sticks permanently. For most solo developers, that $10 one-time spend is the best value on this list — far cheaper than the $50/month plan if your usage fits inside ~1,000 requests/day.
One more OpenRouter perk worth knowing: all users get 1,000,000 free BYOK requests/month (routing your own provider key through OpenRouter), after which a 5% routing fee applies to normal model pricing.
Free alternative #2: Run it locally (no key, no limits)
This is the path the official docs call the recommended free alternative, and it's my favorite for anyone with the hardware. Qwen models are open-weight under Apache 2.0, which means you can run them locally at zero cost, with no API key and no usage limits, then point the Qwen Code CLI at your local server.
The documented runners are Ollama, vLLM, and llama.cpp. The general workflow looks like this:
# Option A: Ollama
ollama pull <qwen3-coder-model>
ollama serve # exposes an OpenAI-compatible endpoint
# Option B: llama.cpp
# build llama.cpp, pull a quantized GGUF (e.g. an Unsloth Q4_K_XL),
# then run the OpenAI-compatible server
Then in Qwen Code, run /auth and point it at your local endpoint.
The hardware question is the real gate here. Qwen3-Coder-Next (announced February 2026) is an 80B MoE model with 3B active params and 256K context, designed for fast local agentic coding. It needs roughly:
- ~46GB RAM/VRAM/unified memory at 4-bit
- ~85GB at 8-bit
A 64GB Apple Silicon Mac is cited as a 4-bit "sweet spot." If you've got a workstation with a big GPU or a high-memory Mac, local is the cleanest answer: it's private, it's offline-capable, and there's genuinely no quota to hit.
If you'd rather not point Qwen Code directly at a single machine, a local-first AI gateway like Wide Area AI sits in front of your hardware as an OpenAI-compatible endpoint — serving requests from your own nodes at zero per-token cost and only failing over to a cloud provider when those nodes are offline. You just set Qwen Code's base URL to the gateway and keep the local-first economics without the single-point-of-failure.
Free alternative #3: BYO API key with free credits
The third route is BYOK against any OpenAI-compatible provider that gives you free starting credits or a free model. OpenRouter (above) is the obvious one, but the README also names Fireworks AI as a BYOK provider, and an Alibaba Cloud Model Studio / DashScope API key is a first-class supported option in Qwen Code.
The mechanics are identical across providers: get a key, run /auth in Qwen Code, select the API-key method, paste it in, and set the model. This is the most flexible option because it lets you mix and match — for example, a free local model for routine edits and a hosted key for the occasional heavy agentic task.
Which one should you pick?
| Path | Cost | Best for |
|---|---|---|
OpenRouter :free | $0 (or $10 once for 1,000/day) | Solo devs who want hosted convenience |
| Local (Ollama/llama.cpp) | $0 + hardware | Anyone with 46GB+ memory who wants privacy and no limits |
| BYO key (Fireworks/DashScope) | Varies / free credits | Mixing providers, flexibility |
| Coding Plan | $50/month | Teams or heavy users wanting an official hosted SLA |
Bottom line
The "Free Qwen Is Dead" headlines oversold it. The Qwen Code CLI is still free and open source — only the free hosted OAuth inference ended, on April 15, 2026. If you're staring at that auth error, run qwen then /auth and pick one of three free routes: the OpenRouter free model (best value after a one-time $10 unlock for 1,000 requests/day), fully local inference via Ollama or llama.cpp (zero cost, no limits, if your hardware can handle ~46GB at 4-bit), or a BYO API key with free credits. The $50/month Coding Plan is there if you'd rather pay for an official hosted experience — but you don't have to. Treat any sub-$50 "official" pricing or mystery free-trial claims with skepticism; they come from third-party sources that don't line up with Alibaba's own docs.