Stop paying OpenAI and Anthropic for thinking your code doesn't need

Weave routes every prompt to the cheapest model that can actually do the job. Same quality, 30-60% lower spend.

$ weave route --file hero-cta.md [weave-router-model: kimi-k2-0905]
>
>
* Typing · 0.1s
WeaveClaudeOpus 4.7COST$$$$Q · 94$ · 1864ClaudeSonnet 4.7COST$$Q · 86$ · 5574GPT-5Codex 5.5COST$$$Q · 88$ · 4270GPT-5miniCOST$Q · 78$ · 8674Gemini2.5 ProCOST$$$Q · 84$ · 4870KimiK2COST$Q · 82$ · 9694DeepSeekV3COST$Q · 78$ · 8874Llama4 405BCOST$$Q · 80$ · 7074
$ weave route --file hero-cta.md [weave-router-model: kimi-k2-0905]
>
>
* Typing · 0.1s
WeaveClaudeOpus 4.7COST$$$$Q · 94$ · 1864ClaudeSonnet 4.7COST$$Q · 86$ · 5574GPT-5Codex 5.5COST$$$Q · 88$ · 4270GPT-5miniCOST$Q · 78$ · 8674Gemini2.5 ProCOST$$$Q · 84$ · 4870KimiK2COST$Q · 82$ · 9694DeepSeekV3COST$Q · 78$ · 8874Llama4 405BCOST$$Q · 80$ · 7074
Trusted by leading engineering teams
Trusted by leading engineering teams
Trusted by leading engineering teams

Scale-Ups

Startups

Enterprise

$ weave route --file hero-cta.md [weave-router-model: kimi-k2-0905]
>
>
* Typing · 0.1s
WeaveClaudeOpus 4.7COST$$$$Q · 94$ · 1864ClaudeSonnet 4.7COST$$Q · 86$ · 5574GPT-5Codex 5.5COST$$$Q · 88$ · 4270GPT-5miniCOST$Q · 78$ · 8674Gemini2.5 ProCOST$$$Q · 84$ · 4870KimiK2COST$Q · 82$ · 9694DeepSeekV3COST$Q · 78$ · 8874Llama4 405BCOST$$Q · 80$ · 7074

Frequently asked questions

Classification runs in under 50ms on a small distilled model that lives in our edge layer. The first token from the picked model still streams in the same envelope it would have without Weave, usually faster, because we route past the most-congested providers when we can.
Anthropic (Opus, Sonnet, Haiku), OpenAI (GPT-5, Codex, mini), Google (Gemini 2.5 Pro/Flash), Moonshot (Kimi K2), DeepSeek V3, Mistral, Llama 4, plus any local or self-hosted endpoint that speaks the OpenAI HTTP shape. Weave is model-agnostic by design.
We take a percentage of measured savings against your pre-Weave baseline. No flat fee, no minimum, no per-seat math. If we don't save you money, you don't pay us. The contract is one page.
Weave is a zero-retention proxy by default. Prompts and completions are not stored on our infrastructure, only routing metadata (classification label, chosen model, latency, cost). SOC 2 Type II is in flight; report available on request under NDA.
For most teams: a one-line SDK swap or proxy URL change, then ship. Typical pilots are live in under a day. The context layer takes around 2 weeks to warm up on a real codebase before per-prompt routing decisions get meaningfully sharper than the cold-start defaults.

Scale-Ups

Startups

Enterprise

Scale-Ups

Startups

Enterprise

It's open source. Start saving money on inference today.

Get started in 5 minutes or book a demo with our team.

npx @workweave/router init
Works withCursorClaude CodeZedNeovimVS Code
Open source