Claude Sonnet 5 Scores 53 on Intelligence Index, Tops All 226 Models

artificialanalysis.ai · @systems_wire · 2 hours ago · Artificial Intelligence · 1 comment

Anthropic's latest Claude Sonnet 5 achieves the highest Artificial Analysis Intelligence Index score at 53, offers a 1M-token context window and $0 per 1M tokens pricing, and ranks #1 for both intelligence and input...

anthropic, claude sonnet 5, artificial analysis, benchmarks, large language models, reasoning

Claude Sonnet 5 just debuted at #1 on the Artificial Analysis Intelligence Index with a score of 53, well above the category average of 8. The comparison set covers 226 models, and Anthropic's latest reasoning variant leads them all on intelligence.

Free Pricing and a 1M-Token Context Window

Input price? $0.00 per 1M tokens. Output price? Also $0.00. I don't know how Anthropic is making that work, but the benchmarked model is listed as free on API pricing. The context window stretches to 1 million tokens, roughly 1,500 A4 pages of 12-point Arial. The model accepts text and image inputs, outputs text, and explicitly advertises reasoning capabilities.
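As a rough sanity check on the page figure, here is a minimal sketch that back-computes the tokens-per-page ratio implied by 1 million tokens filling about 1,500 pages. The function name is hypothetical, and the result is only what those two figures imply, not a published tokenizer statistic.

```python
def implied_tokens_per_page(context_tokens: int, pages: int) -> float:
    """Tokens per page implied by a context size and a page estimate.

    Hypothetical helper for illustration; both inputs are the
    figures quoted in the article, not measured values.
    """
    if pages <= 0:
        raise ValueError("pages must be positive")
    return context_tokens / pages


# 1M-token context window, roughly 1,500 A4 pages (figures from the article)
tokens_per_page = implied_tokens_per_page(1_000_000, 1_500)
print(f"{tokens_per_page:.0f} tokens per page")  # about 667
```

Actual token counts per page vary with language, formatting, and tokenizer, so treat the ratio as an order-of-magnitude guide.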

What the Benchmark Actually Measures

The Intelligence Index v4.1 runs nine evaluations: GDPval-AA v2, τ³-Banking, Terminal-Bench v2.1, SciCode, Humanity's Last Exam, GPQA Diamond, CritPt, AA-Omniscience, and AA-LCR. That covers agentic tool use, coding, terminal work, scientific reasoning, long-context reasoning, and hallucination rate. Claude Sonnet 5 didn't just win. It also generated 300 million output tokens during the benchmark, placing it at #17 out of 226 for verbosity. The average model produced 37M tokens, so Sonnet 5 emitted roughly 8x more output across the same evaluations.
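The 8x figure follows directly from the two token counts. A minimal sketch of that arithmetic, with a hypothetical function name and the article's figures as inputs:

```python
def verbosity_ratio(model_tokens: float, average_tokens: float) -> float:
    """How many times more output tokens a model produced than the average.

    Hypothetical helper for illustration only.
    """
    if average_tokens <= 0:
        raise ValueError("average_tokens must be positive")
    return model_tokens / average_tokens


# 300M output tokens for Sonnet 5 vs. a 37M average (figures from the article)
ratio = verbosity_ratio(300e6, 37e6)
print(f"{ratio:.1f}x the average")  # 8.1x the average
```

Because every model runs the same evaluation set, the ratio of totals is also the ratio of average output per task.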

Why This Matters for the Field

On this third-party benchmark, Sonnet 5 is the new intelligence leader, far above the category average, and it is listed as free. That combination will pressure every other provider to either drop prices or push scores. Speed data is still listed as N/A, so we don't know how fast it runs. But if the throughput turns out to be competitive, Anthropic just made every other reasoning model look overpriced.

Claude Sonnet 5 resets the bar for what a top-tier reasoning model can cost: a $0 listing for the top score on the index changes the economics of AI inference overnight.


Source: Claude Sonnet 5 - benchmark results
Domain: artificialanalysis.ai
