🦞 Agents are great — until the bill arrives. We fix that.

Your Agent is burning tokens?
Cut 70–90% off your AI bill, one click.

OpenClaw and AI agents are amazing — but they burn Opus-priced tokens for every little task. CodeRouter fixes that. We analyze each call in 10ms and route it to the cheapest model that can actually handle it — Flash for Q&A, Haiku for formatting, Opus only when it truly matters. One config change. Same Agent. Same quality. Typical users save 70–90% on their monthly AI bill.

request.py
# Point your AI agent at CodeRouter. That's it.
from openai import OpenAI

client = OpenAI(
    base_url="https://www.coderouter.io/api/v1",
    api_key="cr_your_key_here",
)

# Smart routing: best model for each task, automatically
response = client.chat.completions.create(
    model="auto",  # ← picks the optimal model
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)
# → Routed to the best quality/cost model. High quality, low cost.

🏎️ Your Agent is paying $75/M for 'what's the weather'

Every hour your Agent runs without smart routing, it's burning Opus prices ($15/$75 per 1M tokens) on tasks that Gemini Flash ($0.075/$0.30) handles just as well. That's a 200–250x price gap. Most routers, like OpenRouter and LiteLLM, make you pick the model yourself — which is why most Agents just default to the premium one and bleed money. We pick for you. Automatically. Per call.
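The per-call arithmetic behind that gap is easy to check yourself. A small sketch using the public list prices quoted above (output tokens only; the token count is illustrative):

```python
# Rough per-call cost comparison using the list prices quoted above.
# The 500-token response size is an illustrative assumption, not data.
OPUS_OUT_PER_M = 75.00    # $ per 1M output tokens
FLASH_OUT_PER_M = 0.30    # $ per 1M output tokens

def call_cost(tokens_out: int, price_per_m: float) -> float:
    """Dollar cost of a single response of `tokens_out` output tokens."""
    return tokens_out / 1_000_000 * price_per_m

# A 500-token answer to "what's the weather":
opus = call_cost(500, OPUS_OUT_PER_M)    # $0.0375
flash = call_cost(500, FLASH_OUT_PER_M)  # $0.00015
print(f"Opus: ${opus:.4f}  Flash: ${flash:.5f}  gap: {opus / flash:.0f}x")
# → gap: 250x
```

Fractions of a cent per call, but a busy Agent makes thousands of calls a day — that's where the monthly bill comes from.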

250x
Opus vs Flash price gap
80%
of agent calls can use cheap models
<50ms
Per-call routing decision
50+
Models we route to
Other routers give you a key. We pick the model.

Here's what your deployed Agent actually does — and what it should cost:

📝 Simple Q&A / translation → Gemini Flash. $0.30/M tokens

💻 Code completion / formatting → GPT-5 Mini or Haiku. Pennies per call

🏗️ Complex reasoning / architecture → Opus. Here Opus earns its price: $75/M tokens — worth it

OpenRouter / LiteLLM let you access many models under one key — but you still pick manually. We're built for people who've already deployed OpenClaw or their own Agent and just want the bill to stop hurting. Your Agent sends model: "auto", we decide per-call, you save.

🦞 OpenClaw-native — one config line, no code changes
$75
Opus output per 1M tokens
$0.30
Flash output per 1M tokens
80%
of calls can run on cheap models
250x
cost multiplier you're wasting
From deployed to saving 70–90% in 2 minutes.
No model picking. No SDK rewrite. Change one config line in OpenClaw (or any OpenAI-compatible agent) and your bill starts dropping immediately.
1

Plug in CodeRouter

One config line in OpenClaw — change your base_url to our endpoint. Or run our one-liner setup script. That's the whole integration.

2

Your Agent sends model: "auto"

Stop picking models manually. Your Agent tells us what it needs; we handle the rest. The classifier analyzes each request in 10ms.

3

We route to the cheapest model that works

Q&A → Flash. Formatting → Haiku. Reasoning → Opus. We pick per task — based on capability score and cost — not round-robin or manual override.

4

Bill drops. Analytics arrive.

Output quality stays the same. Typical workloads save 70–90% (up to 250× cheaper on simple calls like Q&A). You also get per-end-user attribution, auto failover, and a cost dashboard — all included.
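The steps above can be sketched as a toy rule-based dispatcher. Everything here — the model tiers, keywords, and prices — is illustrative only; CodeRouter's production classifier is not public and certainly does more than keyword matching:

```python
# Toy per-call router: classify the task, then pick the cheapest model
# whose capability tier covers it. Tiers, keywords, and prices are
# hypothetical stand-ins, not CodeRouter's actual classifier.
MODELS = [  # (name, capability tier, output $/1M), sorted cheapest first
    ("gemini-flash", 1, 0.30),
    ("claude-haiku", 2, 1.25),
    ("claude-opus",  3, 75.00),
]

def classify(prompt: str) -> int:
    """Crude tier guess from the prompt text (illustrative heuristic)."""
    p = prompt.lower()
    if any(w in p for w in ("architecture", "prove", "design a system")):
        return 3  # heavy reasoning
    if any(w in p for w in ("format", "lint", "refactor")):
        return 2  # mechanical code work
    return 1      # simple Q&A / lookup

def route(prompt: str) -> str:
    """Cheapest model whose tier meets the task's needs."""
    need = classify(prompt)
    for name, tier, _cost in MODELS:  # list is sorted by cost
        if tier >= need:
            return name
    return MODELS[-1][0]

print(route("What's the weather in Paris?"))  # → gemini-flash
print(route("Lint and format this file"))     # → claude-haiku
print(route("Design a system architecture"))  # → claude-opus
```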

Everything you need to cut Agent costs — nothing you don't.
We focused on the problems people actually have after deploying an Agent: unpredictable token bills, manual model picking, no per-customer tracking.
🧠

Task-aware routing (not just multi-model)

OpenRouter gives you access to many models. We actually pick which one runs each call. Q&A? Flash. Format code? Haiku. Reasoning? Opus. Real 10ms per-call analysis — not manual selection.

🤖

OpenClaw-native — one config line

Purpose-built for OpenClaw and any OpenAI-compatible Agent. Change your base_url or run our one-liner. No SDK rewrite, no business logic changes. Already-deployed Agents start saving in minutes.

🔑

50+ models, one API

GPT-5.2, Claude Opus 4.7, Gemini 3 Pro, DeepSeek, Kimi, Qwen — all behind one CodeRouter key. Mix and match freely; we handle provider quirks.

📊

Cost dashboard + savings analytics

See exactly what each Agent call cost, what it would've cost without smart routing, and which models are being used. Drill down by end-user, model, or task type.

Routing strategies
⚖️

Cheapest / Balanced / Best quality. Set at the account level for Starter, or per-key on Free. Pro gets enhanced balanced routing with 30% Opus boost on hard tasks.

🛡️

Auto failover

If a provider 5xxs or times out, we automatically retry on the next best model — transparently to your Agent. Zero downtime for your users.

🏪

Auto top-up + quota management

Pre-authorize automatic top-ups so your Agent never stalls over a weekend card decline. Email alerts at 80% / 100% so you see it coming.

New
🔄

Streaming support

Full SSE streaming, same shape as OpenAI. Tokens flow as they're generated — no buffering, no delays, even with smart routing in the middle.
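Consuming a routed stream looks like any other OpenAI-SDK stream — a sketch assuming, as stated above, that CodeRouter's chunks have the same shape as OpenAI's:

```python
from typing import Iterable

def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a chat-completions stream into one string.

    Works on the chunk objects the OpenAI Python SDK yields with
    stream=True -- assuming CodeRouter's SSE chunks match OpenAI's
    shape, as the page claims.
    """
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks carry only role/finish metadata
            parts.append(delta)
    return "".join(parts)

# Usage against the router (requires the openai package and a key):
#   client = OpenAI(base_url="https://www.coderouter.io/api/v1",
#                   api_key="cr_your_key_here")
#   stream = client.chat.completions.create(
#       model="auto", stream=True,
#       messages=[{"role": "user", "content": "Explain SSE briefly"}])
#   print(collect_stream(stream))
```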

🌍

Request-response size limits + safety

Built-in request size limits (1MB), per-user rate limits, key scoping, and auth validation. All the guardrails your production Agent needs — without you building them.

You're paying Opus prices for Flash-level tasks.
Here's what a typical Agent session actually contains — and what each task should cost vs. the Opus-for-everything reality.
Your Agent's Task | You're Paying (Opus) | Should Cost (Smart Routed) | Savings
Simple Q&A / Lookup | Opus — $15/$75 per 1M | Gemini Flash — $0.075/$0.30 | ~250x cheaper
Code Formatting / Lint | Opus — $15/$75 per 1M | Haiku — $0.25/$1.25 | ~60x cheaper
Translation | Opus — $15/$75 per 1M | GPT-5 Mini — $0.25/$2 | ~37x cheaper
Summarization | Opus — $15/$75 per 1M | DeepSeek — $0.28/$0.42 | ~180x cheaper
Complex Architecture | Opus — $15/$75 per 1M | Opus — $15/$75 (worth it here!) | Right model ✓
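The table's per-task prices also show where the 70–90% headline figure can come from. A rough blended estimate — the task-mix percentages below are a hypothetical session, not measured data:

```python
# Blended savings estimate. Per-1M output prices come from the table
# above; the task-mix shares are an illustrative assumption.
OPUS_OUT = 75.00
routed = {                       # task: (share of calls, routed $/1M out)
    "qa":        (0.40, 0.30),   # Gemini Flash
    "format":    (0.20, 1.25),   # Haiku
    "translate": (0.10, 2.00),   # GPT-5 Mini
    "summarize": (0.10, 0.42),   # DeepSeek
    "reasoning": (0.20, 75.00),  # Opus, kept where it matters
}

opus_everywhere = OPUS_OUT  # baseline: every call on Opus
smart = sum(share * price for share, price in routed.values())
savings = 1 - smart / opus_everywhere
print(f"blended: ${smart:.2f}/1M vs ${opus_everywhere:.2f}/1M "
      f"-> {savings:.0%} saved")
```

With this particular mix the blended rate lands around $15.6/1M, roughly a 79% saving — inside the 70–90% range the page quotes; a mix heavier on simple Q&A lands higher.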
Change one URL. That's the whole integration.
If your Agent already uses OpenAI's SDK (and OpenClaw does), you're two minutes from saving. No new SDK, no code rewrite, no refactor — just a new base_url.
Python
cURL
Node.js
from openai import OpenAI

client = OpenAI(
    base_url="https://www.coderouter.io/api/v1",
    api_key="cr_your_key_here",
)

# Auto-route: cheapest model that delivers quality
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a Python quicksort"}],
    extra_body={"strategy": "cheapest"},  # save money on AI
)

# Or specify a model directly
response = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[{"role": "user", "content": "Analyze this dataset..."}],
)
curl https://www.coderouter.io/api/v1/chat/completions \
  -H "Authorization: Bearer cr_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Explain quantum computing"}],
    "strategy": "cheapest"
  }'
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://www.coderouter.io/api/v1',
  apiKey: 'cr_your_key_here',
});

const response = await client.chat.completions.create({
  model: 'auto',
  messages: [{ role: 'user', content: 'Build a React component' }],
});
🚀 Start saving in 60 seconds
Plug in. Point your Agent at us. Watch the bill drop.
1

Pick a Plan

Starter ($29/mo) covers most deployed OpenClaw setups. Pro ($99/mo) adds Opus quota + enhanced routing for heavy reasoning workloads. No provider keys needed.

View Plans →
2

Grab your CodeRouter key

Sign up, go to dashboard → API Keys. Copy the key. One key covers all 50+ models we route to.

Sign Up →
3

Point your Agent at us

Change the base_url in OpenClaw config (or any OpenAI-compatible Agent). Your Agent keeps running exactly the same — just cheaper.

Setup Guide →
Setup Command
curl -fsSL https://www.coderouter.io/setup.sh | bash -s -- cr_YOUR_KEY_HERE
📖 Full Setup Guide
Pay less the moment you plug in.
Subscribe, change one URL, watch your bill drop. Top up only if you need more; Pro plan includes Opus for heavy reasoning.
Free (BYOK)
$0/mo
Bring your own API keys — we handle the smart routing
  • All 50+ models available
  • Explicit model selection
  • Streaming & fallback chains
  • 30 requests/min
Get Started Free
Pro
$99/mo
Enhanced routing with Opus boost
  • 20M tokens + 500K Opus tokens/month
  • Enhanced quality routing + Opus boost
  • 600 requests/min
  • $12 = 3M top-up · $40 = 500K Opus top-up
  • Priority support
Get Started

Why token costs matter for Agents

What is an LLM Router?

Complete guide to AI model routing — how it works and why every AI team needs one.

AI Token Costs in 2026

Why smart routing is no longer optional when Opus costs 250x more than Flash.

LLM API Pricing Guide 2026

The definitive comparison of every major AI model's pricing — input, output, and best use case.

Best LLM Routers in 2026

CodeRouter vs OpenRouter vs LiteLLM vs Portkey — complete comparison and guide.

Cut Cursor & Windsurf Costs by 80%

Step-by-step guide to smart routing for AI coding tools.

OpenRouter vs CodeRouter vs LiteLLM

Which AI router is best? Pricing, features, and BYOK support compared.

View all articles →

Your deployed Agent is burning tokens right now.

Plug in CodeRouter. 2-minute setup. No code changes. Your OpenClaw / custom Agent keeps doing exactly what it's doing — just 70–90% cheaper.

Plug In Free → Save Now

CodeRouter is the smart routing layer for OpenClaw agents. Need deployment & hosting? Try OneClaw.

🦞 OneClaw — Deploy & Manage OpenClaw📚 OpenClaw Docs💬 OpenClaw Community

Get weekly AI cost optimization tips

Join 2,000+ developers saving on LLM costs