Anthropic
Claude Opus 4
Model ID: claude-opus-4-20250514
2025-05-22 · Proprietary Model
At the time of its release, Claude Opus 4 was benchmarked as the world’s best coding model, with sustained performance on complex, long-running tasks and agent workflows. It achieved leading software engineering results on SWE-bench (72.5%) and Terminal-bench (43.2%). Opus 4 supports extended agentic workflows and can work through thousands of task steps continuously for hours without degradation.
API
Knowledge Cutoff
2025-01-31
The date this AI finished learning. It may not know about things that happened after this date.
Input → Output Format
The types of content this AI can receive, and what it can produce in return.
Context Memory
200K in / 32K out
The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
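As a rough illustration of how these two limits might be applied before sending a request, the sketch below checks a token budget against them. It assumes that "200K" and "32K" mean 200,000 and 32,000 tokens and that the two limits are enforced independently; the function name `check_request` is hypothetical and not part of any API.

```python
# Assumption: "200K in / 32K out" means 200,000 input and 32,000 output tokens.
MAX_INPUT_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 32_000


def check_request(input_tokens: int, requested_output_tokens: int) -> list[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if input_tokens > MAX_INPUT_TOKENS:
        over = input_tokens - MAX_INPUT_TOKENS
        problems.append(f"input exceeds limit by {over} tokens")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        over = requested_output_tokens - MAX_OUTPUT_TOKENS
        problems.append(f"requested output exceeds limit by {over} tokens")
    return problems


if __name__ == "__main__":
    # A 150,000-token document with an 8,000-token reply budget fits.
    print(check_request(150_000, 8_000))
    # A 250,000-token document does not.
    print(check_request(250_000, 8_000))
```

Token counts depend on the tokenizer, so in practice the input figure would come from a token counter rather than a character or word count.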
AI Performance Evaluation
Arena Overall Score
1424 ±4 (as of 2026-04-07)
Overall Rank
No.57
37,201 Votes
Arena by Ability
Hard Prompts: 1456 ±6, No.44
Expert Knowledge: 1448 ±14, No.55
Instruction Following: 1442 ±7, No.28
Conversation Memory: 1437 ±8, No.48
Creative: 1431 ±9, No.26
Coding: 1498 ±8, No.30
Math: 1418 ±12, No.61
Arena by Occupation
Creative Writing: 1429 ±7, No.32
Social Sciences: 1440 ±8, No.61
Media: 1420 ±8, No.33
Business: 1412 ±8, No.71
Healthcare: 1447 ±13, No.58
Legal: 1435 ±12, No.57
Software: 1466 ±6, No.45
Mathematics: 1423 ±13, No.62
Source: Arena Intelligence
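One way to read the ± values is as an interval around each score: two models whose intervals overlap cannot be cleanly separated by score alone. The sketch below assumes the ± figure is a symmetric margin around the score, which the page does not state. The `ArenaScore` class and `intervals_overlap` helper are hypothetical names, and the comparison scores in the usage section are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArenaScore:
    score: float
    margin: float  # the ± value shown next to the score

    @property
    def low(self) -> float:
        return self.score - self.margin

    @property
    def high(self) -> float:
        return self.score + self.margin


def intervals_overlap(a: ArenaScore, b: ArenaScore) -> bool:
    """True when the two score intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


# Overall score reported on this page.
OVERALL = ArenaScore(1424, 4)

if __name__ == "__main__":
    print(OVERALL.low, OVERALL.high)
    # Hypothetical competitor scores, not taken from any leaderboard.
    print(intervals_overlap(OVERALL, ArenaScore(1430, 5)))
    print(intervals_overlap(OVERALL, ArenaScore(1440, 5)))
```

Scores from different categories come from separate rankings, so this comparison is most meaningful between models within the same category.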
Overall
AA Intelligence Index: 39% (↑1%)
ForecastBench: 61% (↑1%)
Reasoning & Math
AA Math Index: 73% (↑0%)
GPQA Diamond: 80% (↓1%)
HLE: 12% (↓5%)
MMLU-Pro: 87% (↑6%)
AIME 2025: 73% (↑0%)
MATH-500: 98% (↑5%)
Coding
AA Coding Index: 34% (↑0%)
LiveCodeBench: 64% (↓1%)
TAU2: 73% (↑2%)
TerminalBench: 31% (↑0%)
SciCode: 40% (↓1%)
Language & Instructions
IFBench: 54% (↓2%)
AA-LCR: 34% (↓28%)
Hallucination (HHEM): 12% (↑1%)
Factual (HHEM): 88% (↓1%)
Output Speed
Standard Mode: 34 tok/s (↓45), first output 1.33 s
Reasoning Mode: 36 tok/s (↓95), first output 7.18 s
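The two figures per mode can be combined into a rough response-time estimate: time to first output plus the output length divided by the output speed. The sketch below assumes that "First Output" is the delay before the first token arrives and that the tok/s rate holds steady for the rest of the response; `SpeedProfile` and `estimate_seconds` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedProfile:
    tokens_per_second: float
    first_output_seconds: float


# Figures as listed on this page.
STANDARD = SpeedProfile(tokens_per_second=34.0, first_output_seconds=1.33)
REASONING = SpeedProfile(tokens_per_second=36.0, first_output_seconds=7.18)


def estimate_seconds(profile: SpeedProfile, output_tokens: int) -> float:
    """Estimate total response time: first-output delay plus streaming time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return profile.first_output_seconds + output_tokens / profile.tokens_per_second


if __name__ == "__main__":
    for name, profile in (("standard", STANDARD), ("reasoning", REASONING)):
        print(f"{name}: {estimate_seconds(profile, 500):.1f} s for 500 tokens")
```

Under these assumptions a 500-token reply takes roughly 16 seconds in standard mode and roughly 21 seconds in reasoning mode, so the longer first-output delay dominates the difference between the two.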