Anthropic ships frontier-class coding at commodity prices

Haiku 4.5 matches May’s best at one-third the cost, pushing multi-agent systems over the economic line.

Anthropic released Claude Haiku 4.5 today, pricing it at $1 per million input tokens and $5 per million output tokens — roughly a third of Sonnet 4.5’s $3/$15. The company says Haiku 4.5 delivers coding and “computer use” performance comparable to Sonnet 4, which set the bar in May. It’s available immediately on Claude.ai, via Anthropic’s API, and through Amazon Bedrock and Google Cloud Vertex AI. For free users, Anthropic is making Haiku 4.5 the default in some cases, while keeping manual selection available to everyone.

Sonnet 4.5 shipped two weeks ago; Sonnet 4 arrived five months back. Anthropic now argues that last spring’s frontier capability has been compressed into a faster, cheaper small model. Haiku 4.5 runs more than twice as fast as Sonnet 4 on key tasks, with similar results on coding, computer use, and tool-calling. That speed changes what developers can afford to run in parallel. It also shifts where the margins live.

The delta isn’t capability; it’s cost at capability. Sonnet 4 was frontier-class in May. Five months later, Haiku 4.5 offers that intelligence band at a fraction of the price and latency. It’s also the first Haiku with hybrid reasoning and an optional “extended thinking” mode — previously a big-model perk. In Anthropic’s internal testing, Haiku 4.5 scored 73.3% on SWE-bench Verified, averaged over 50 trials with a 128K thinking budget and default sampling, plus a prompt addendum that nudges heavy tool use and test writing. Note the test conditions. They matter.
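For developers, extended thinking on Haiku is the same Messages API switch it is on the larger Claude models. Here is a minimal sketch using Anthropic's Python SDK; the model ID string and the token budgets are illustrative assumptions (the benchmark run above used a far larger 128K budget), so verify them against Anthropic's current model list:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-haiku-4-5",   # assumed model ID; check Anthropic's model list
    max_tokens=16000,           # must exceed the thinking budget below
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{
        "role": "user",
        "content": "Write a failing test that reproduces an off-by-one bug in a paginator, then fix it.",
    }],
)

# With thinking enabled, the response interleaves "thinking" blocks with the
# final "text" blocks; print only the answer.
for block in response.content:
    if block.type == "text":
        print(block.text)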

There’s a wrinkle on price. At $1/$5, Haiku 4.5 costs more than Haiku 3.5’s $0.80/$4, even as Anthropic markets it as the economical choice. For teams moving up from Haiku 3.5, this is a capability upgrade that still invites a budget talk. For teams choosing between models, Haiku 4.5 undercuts Sonnet 4 by about 67% while matching its May-era coding performance. That’s the trade.
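The arithmetic is easy to check. A minimal sketch using the prices quoted above, with an assumed token mix standing in for a real workload:

```python
# Per-million-token prices quoted in this piece (input, output), in USD.
# Sonnet 4's May-era pricing matched Sonnet 4.5's $3/$15.
PRICES = {
    "haiku-3.5": (0.80, 4.00),
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4":  (3.00, 15.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call: tokens divided by 1e6, times the per-million price."""
    price_in, price_out = PRICES[model]
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out

# An assumed agentic call shape: 20K tokens in, 4K tokens out.
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 20_000, 4_000):.3f}")
# haiku-3.5: $0.032, haiku-4.5: $0.040, sonnet-4: $0.120
# 1 - 0.040 / 0.120 = 0.67, the ~67% undercut.
```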

Anthropic’s architecture story centers on multi-agent systems: let Sonnet 4.5 plan, then spin up a pool of Haiku 4.5 workers to execute subtasks in parallel — refactors, migrations, data gathering, and more. The pattern wasn’t technically new. It was economically out of reach. Running multiple frontier instances in lockstep shreds API budgets and blows latency targets.

Haiku 4.5 changes that calculus. It is fast, inexpensive, and capable enough to make parallel execution viable for production. In practice, that unlocks customer-support agents, pair-programming loops, and real-time assistants where throughput and responsiveness dominate. For deep planning, Sonnet 4.5 remains necessary. The sweet spot is orchestration: Sonnet breaks the problem; Haiku handles the work. Anthropic is baking that pattern into Claude Code. It’s a portfolio play that sells both models without forcing a binary choice. Smart.
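A minimal sketch of that orchestration pattern with Anthropic's Python SDK: one frontier call plans, and cheap workers fan out concurrently. The model IDs and the one-subtask-per-line protocol are illustrative assumptions, not Anthropic's published implementation:

```python
import asyncio
import anthropic

client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

PLANNER = "claude-sonnet-4-5"  # assumed model IDs; verify against the model list
WORKER = "claude-haiku-4-5"

async def ask(model: str, prompt: str) -> str:
    response = await client.messages.create(
        model=model,
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

async def run(task: str) -> list[str]:
    # One frontier call decomposes the task into independent subtasks.
    plan = await ask(PLANNER, f"Break this into independent subtasks, one per line:\n{task}")
    subtasks = [line.strip() for line in plan.splitlines() if line.strip()]
    # A pool of Haiku workers executes every subtask concurrently.
    return await asyncio.gather(*(ask(WORKER, s) for s in subtasks))

results = asyncio.run(run("Migrate these three modules from unittest to pytest."))
```

The economics are what make the `asyncio.gather` line affordable: every worker call bills at Haiku rates, and only the single planning call pays frontier prices.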

Competitors are chasing the same curve. OpenAI is pushing small models toward GPT-5-class targets, and Google’s Gemini 2.5 Flash aims at low-latency, cost-sensitive jobs. The pattern holds: frontier capability migrates down to smaller models in months. The only open question is whether the frontier advances faster than the cost curve compresses. Right now, compression is winning.

Anthropic’s neat tiering — Opus for reasoning, Sonnet for balance, Haiku for speed — is blurring. Opus 4.1, launched in August, now sits in Claude.ai as a “legacy brainstorming model.” Internally, the recommendation has shifted to Sonnet 4.5 for most tasks, with Haiku 4.5 as the production workhorse. The lines aren’t just fuzzy; they’re being redrawn mid-generation.

That shift reflects product reality. Maintaining three equally prominent tiers while rivals ship quickly spreads attention thin. Consolidating around a frontier tier (Sonnet) and a production tier (Haiku) simplifies the story and positions Opus as optional rather than central. It also raises an obvious question: if an Opus 4.5 lands, where does it live in this new map?

Safety adds another twist. Haiku 4.5 is rated at AI Safety Level 2 (ASL-2), a notch less restrictive than Sonnet 4.5 and Opus 4.1 at ASL-3. Anthropic’s internal testing reports lower rates of misaligned behavior for Haiku 4.5 than for its larger siblings, and “limited risks” on CBRN-related assessments. In other words, the smallest, fastest, cheapest model is — by Anthropic’s own metrics — its safest. That’s notable.

Inference — not training — is where costs scale. Every query, every user, forever. Smaller models consume far less energy per response than giant ones, and those physics drive margins. Illustrative figures cited in coverage show a 405-billion-parameter model drawing thousands of joules per query while an eight-billion-parameter model draws a tiny fraction of that. The exact numbers vary by setup, but the direction is clear. Size multiplies cost.
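A back-of-envelope sketch shows why the direction holds regardless of the exact figures. The efficiency and output-length constants below are loud assumptions; the linear scaling with parameter count, and therefore the roughly 50x ratio, is the point:

```python
# Back-of-envelope, not a measurement. A dense transformer does roughly
# 2 * N FLOPs per generated token at inference, so at a fixed output length,
# energy per query scales roughly linearly with parameter count.
JOULES_PER_FLOP = 1e-12   # assumed effective efficiency; varies widely by hardware
TOKENS_PER_QUERY = 500    # assumed output length

def joules_per_query(params: float) -> float:
    return 2 * params * TOKENS_PER_QUERY * JOULES_PER_FLOP

large = joules_per_query(405e9)  # the 405B-parameter model cited in coverage
small = joules_per_query(8e9)    # the 8B-parameter comparison point
print(f"{large:.0f} J vs {small:.0f} J -> {large / small:.0f}x")  # 405 J vs 8 J -> 51x
```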

That’s why small models matter. AI companies are plotting vast data-center buildouts over the next two years. Those plans pencil out only if inference costs fall fast enough to support mass deployment. Haiku 4.5’s economics — and rival efforts to match it — suggest the industry expects “good enough, fast” models to carry the volume. Anthropic even markets Haiku 4.5 for free-tier experiences, where cost per user must approach zero. That’s where scale lives.
