Weave - AI to measure engineers and agents

Stop paying OpenAI and Anthropic for thinking your code doesn't need

Weave routes every prompt to the cheapest model that can actually do the job. Same quality, 30-60% lower spend.

Book a Demo

Get Started for Free

Trusted by leading engineering teams

Scale-Ups

Startups

Enterprise

Frequently asked questions

Classification runs in under 50ms on a small distilled model that lives in our edge layer. The first token from the picked model still streams in the same envelope it would have without Weave, usually faster, because we route past the most-congested providers when we can.

Anthropic (Opus, Sonnet, Haiku), OpenAI (GPT-5, Codex, mini), Google (Gemini 2.5 Pro/Flash), Moonshot (Kimi K2), DeepSeek V3, Mistral, Llama 4, plus any local or self-hosted endpoint that speaks the OpenAI HTTP shape. Weave is model-agnostic by design.

We take a percentage of measured savings against your pre-Weave baseline. No flat fee, no minimum, no per-seat math. If we don't save you money, you don't pay us. The contract is one page.

Weave is a zero-retention proxy by default. Prompts and completions are not stored on our infrastructure, only routing metadata (classification label, chosen model, latency, cost). SOC 2 Type II is in flight; report available on request under NDA.

For most teams: a one-line SDK swap or proxy URL change, then ship. Typical pilots are live in under a day. The context layer takes around 2 weeks to warm up on a real codebase before per-prompt routing decisions get meaningfully sharper than the cold-start defaults.

Scale-Ups