Claude

Claude Sonnet 4

Model ID: claude-sonnet-4-20250514
Released: 2025-05-22 | Proprietary Model

Claude Sonnet 4 significantly improves on its predecessor, Sonnet 3.7, in both coding and reasoning, with greater precision and controllability. It reaches a state-of-the-art 72.7% on SWE-bench while balancing capability against computational efficiency, which makes it suitable for work ranging from routine coding tasks to complex software development projects. Key improvements include better autonomous codebase navigation, lower error rates in agent-driven workflows, and more reliable handling of intricate instructions. Sonnet 4 is optimized for practical everyday use: it provides advanced reasoning while staying efficient and responsive across internal and external use cases.

API
Knowledge Cutoff: 2025-01-31
Input → Output Format
Context Memory: 200K in / 64K out
Cost/1M Tokens: $3 in / $15 out
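
The pricing and context figures above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: it assumes the listed $3 / $15 figures are USD per one million input and output tokens, and that 200K and 64K are the input and output token limits. The function and constant names are hypothetical and not part of any official SDK.

```python
# Figures copied from the listing; their interpretation is an assumption.
INPUT_USD_PER_MILLION = 3.0    # assumed: USD per 1M input tokens
OUTPUT_USD_PER_MILLION = 15.0  # assumed: USD per 1M output tokens
MAX_INPUT_TOKENS = 200_000     # assumed: input limit ("200K in")
MAX_OUTPUT_TOKENS = 64_000     # assumed: output limit ("64K out")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if not 0 <= input_tokens <= MAX_INPUT_TOKENS:
        raise ValueError("input_tokens is outside the listed input limit")
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens is outside the listed output limit")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # A request that uses both listed limits in full.
    print(f"${estimate_cost_usd(MAX_INPUT_TOKENS, MAX_OUTPUT_TOKENS):.2f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so long generations tend to dominate the bill even when the prompt is large.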

AI Performance Evaluation

Arena Overall Score: 1399 ±4 (as of 2026-04-02)
Overall Rank: No. 94 (35,417 votes)
Arena by Ability
Hard Prompts: 1431 ±6 (No. 75)
Expert Knowledge: 1435 ±14 (No. 66)
Instruction Following: 1414 ±7 (No. 58)
Conversation Memory: 1419 ±8 (No. 66)
Creative: 1395 ±9 (No. 59)
Coding: 1473 ±8 (No. 50)
Math: 1402 ±13 (No. 88)
Arena by Occupation
Creative Writing: 1397 ±7 (No. 68)
Social Sciences: 1418 ±8 (No. 87)
Media: 1388 ±8 (No. 66)
Business: 1385 ±8 (No. 107)
Healthcare: 1420 ±13 (No. 95)
Legal: 1409 ±13 (No. 87)
Software: 1443 ±6 (No. 77)
Mathematics: 1409 ±13 (No. 91)
Overall
AA Intelligence Index: 39% (↑0%)
ForecastBench: 59% (↓1%)
Reasoning & Math
AA Math Index: 74% (↑0%)
GPQA Diamond: 78% (↓4%)
HLE: 9.6% (↓8%)
MMLU-Pro: 84% (↑2%)
AIME 2025: 74% (↑0%)
MATH-500: 99% (↑5%)
Coding
AA Coding Index: 34% (↓2%)
LiveCodeBench: 66% (↑0%)
TAU2: 65% (↓5%)
TerminalBench: 31% (↓3%)
SciCode: 40% (↓2%)
Language & Instructions
IFBench: 55% (↓3%)
AA-LCR: 65% (↑1%)
Hallucination (HHEM): 10% (↑0%)
Factual (HHEM): 90% (↑0%)
Output Speed
Standard Mode: 45 tok/s (↓37), First Output 0.80 s
Reasoning Mode: 48 tok/s (↓61), First Output 8.52 s
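
The Output Speed figures can be combined into a back-of-the-envelope estimate of how long a full response takes. This is a minimal sketch under two assumptions that the listing does not confirm: that "First Output" is the delay before the first token arrives, and that tokens then stream at the listed rate without variation. The function and constant names are hypothetical.

```python
# Figures copied from the Output Speed listing: (first output in s, tok/s).
STANDARD_MODE = (0.80, 45.0)
REASONING_MODE = (8.52, 48.0)


def estimate_response_seconds(output_tokens: int,
                              first_output_s: float,
                              tokens_per_s: float) -> float:
    """Rough wall-clock estimate: initial delay plus steady streaming time.

    Assumes a constant token rate after the first output (hypothetical model).
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return first_output_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for name, (delay, rate) in [("standard", STANDARD_MODE),
                                ("reasoning", REASONING_MODE)]:
        seconds = estimate_response_seconds(1_000, delay, rate)
        print(f"{name}: about {seconds:.1f} s for 1,000 output tokens")
```

Under this model the higher first-output delay in Reasoning Mode matters most for short replies; for long replies the streaming rate dominates the total.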