Claude

Claude Sonnet 4

Model ID: claude-sonnet-4-20250514

2025-05-22 | Proprietary Model

Claude Sonnet 4 improves on its predecessor, Sonnet 3.7, in both coding and reasoning, with better precision and controllability. It achieves state-of-the-art performance on SWE-bench (72.7%) while balancing capability against computational efficiency, which makes it suitable for applications ranging from routine coding tasks to complex software development projects. Key enhancements include better autonomous codebase navigation, lower error rates in agent-driven workflows, and more reliable handling of intricate instructions. Sonnet 4 is optimized for practical everyday use: it offers advanced reasoning while staying efficient and responsive across diverse internal and external scenarios.

API

Knowledge Cutoff

2025-01-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

200K IN / 64K OUT

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
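As a rough illustration of how these limits might be used, the sketch below checks a request against the listed 200K input and 64K output figures before sending it. The function name `fits_limits` is hypothetical, "K" is assumed to mean 1,000 tokens, and the two limits are treated as independent because that is how the listing presents them; a given provider may count input and output against a shared window instead.

```python
# Limits copied from the listing above (assumption: "K" means 1,000 tokens).
MAX_INPUT_TOKENS = 200_000   # "200K IN"
MAX_OUTPUT_TOKENS = 64_000   # "64K OUT"


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both token counts are within the listed limits.

    Hypothetical helper: it treats the input and output limits as
    independent, which is an assumption based on the listing's layout.
    """
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens <= MAX_INPUT_TOKENS
            and requested_output_tokens <= MAX_OUTPUT_TOKENS)


if __name__ == "__main__":
    print(fits_limits(150_000, 8_000))   # within both limits
    print(fits_limits(250_000, 8_000))   # input exceeds the listed limit
```

Token counts themselves have to come from the provider's tokenizer or usage report; character or word counts are only a loose proxy.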

Cost/1M Tokens

$3 IN / $15 OUT

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
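To make the pricing concrete, here is a minimal sketch of the per-request arithmetic, using the listed rates of $3 per 1 million input tokens and $15 per 1 million output tokens. The helper name `estimate_cost_usd` is hypothetical, and the sketch ignores any discounts, caching, or batch pricing a provider may offer.

```python
# Rates copied from the listing above: USD per 1 million tokens.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


if __name__ == "__main__":
    # 10,000 input tokens cost $0.03 and 2,000 output tokens cost $0.03.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints $0.0600
```

Because output tokens cost five times as much as input tokens at these rates, long generations dominate the bill much sooner than long prompts do.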

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score

1399 ±4

As of 2026-04-02

Overall Rank

No. 94

35,417 Votes
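One way to read an Arena score is as an Elo-style rating, where the gap between two scores maps to an expected head-to-head preference rate. The sketch below assumes the conventional Elo formula with a 400-point scale; the page does not state how its scores are computed, so the formula, the function name `expected_win_probability`, and the 1449 opponent score are all assumptions for illustration.

```python
def expected_win_probability(score_a: float, score_b: float) -> float:
    """Expected rate at which A is preferred over B under the Elo formula.

    Assumption: scores follow the conventional Elo scale, where a
    400-point gap corresponds to 10:1 odds.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


if __name__ == "__main__":
    sonnet_4_overall = 1399        # overall score from the listing above
    hypothetical_opponent = 1449   # made-up score, 50 points higher
    p = expected_win_probability(sonnet_4_overall, hypothetical_opponent)
    print(f"Expected preference rate vs. the opponent: {p:.1%}")
```

Under this reading, a gap of a few points (comparable to the ±4 interval shown) corresponds to an expected preference rate very close to 50%, so small rank differences should not be over-interpreted.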

Arena by Ability

Hard Prompts: 1431 ±6, No. 75
Expert Knowledge: 1435 ±14, No. 66
Instruction Following: 1414 ±7, No. 58
Conversation Memory: 1419 ±8, No. 66
Creative: 1395 ±9, No. 59
Coding: 1473 ±8, No. 50
Math: 1402 ±13, No. 88

Arena by Occupation

Creative Writing: 1397 ±7, No. 68
Social Sciences: 1418 ±8, No. 87
Media: 1388 ±8, No. 66
Business: 1385 ±8, No. 107
Healthcare: 1420 ±13, No. 95
Legal: 1409 ±13, No. 87
Software: 1443 ±6, No. 77
Mathematics: 1409 ±13, No. 91

Source: Arena Intelligence

Overall

AA Intelligence Index: 39% ↑0%
ForecastBench: 59% ↓1%

Reasoning & Math

AA Math Index: 74% ↑0%
GPQA Diamond: 78% ↓4%
HLE: 9.6% ↓8%
MMLU-Pro: 84% ↑2%
AIME 2025: 74% ↑0%
MATH-500: 99% ↑5%

Coding

AA Coding Index: 34% ↓2%
LiveCodeBench: 66% ↑0%
TAU2: 65% ↓5%
TerminalBench: 31% ↓3%
SciCode: 40% ↓2%

Language & Instructions

IFBench: 55% ↓3%
AA-LCR: 65% ↑1%
Hallucination (HHEM): 10% ↑0%
Factual (HHEM): 90% ↑0%

Output Speed

Standard Mode

45 tok/s ↓37

First Output 0.80s

Reasoning Mode

48 tok/s ↓61

First Output 8.52s
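The speed figures above can be combined into a back-of-the-envelope estimate of total response time. This sketch assumes that "First Output" is the delay before the first token arrives and that tokens then stream at the listed rate; the function name `estimate_response_seconds` and the 1,000-token response length are illustrative, and real latency varies with load, prompt length, and provider.

```python
def estimate_response_seconds(output_tokens: int,
                              first_output_s: float,
                              tokens_per_s: float) -> float:
    """Rough total time: delay to first token plus streaming time.

    Assumption: throughput stays constant after the first token.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return first_output_s + output_tokens / tokens_per_s


# Figures copied from the listing above.
standard = estimate_response_seconds(1_000, first_output_s=0.80, tokens_per_s=45)
reasoning = estimate_response_seconds(1_000, first_output_s=8.52, tokens_per_s=48)

if __name__ == "__main__":
    print(f"Standard Mode:  ~{standard:.1f} s for 1,000 tokens")
    print(f"Reasoning Mode: ~{reasoning:.1f} s for 1,000 tokens")
```

For short responses the first-output delay dominates, which is where the gap between the two modes is most noticeable; for long responses the streaming rate matters more.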

Source: Artificial Analysis, ForecastBench, Vectara HHEM
