Claude

Claude Opus 4.1

Model ID: claude-opus-4-1-20250805
2025-08-05
Proprietary Model
API
Overall No.12
Popularity No.31

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains in multi-file code refactoring, debugging precision, and detail-oriented reasoning. The model supports extended thinking up to 64K tokens and is optimized for tasks involving research, data analysis, and tool-assisted reasoning.

Knowledge Cutoff
2025-01-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory
200K in / 32K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
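
The limits above can be turned into a simple pre-flight check. The following is a minimal sketch, assuming "200K" and "32K" mean 200,000 and 32,000 tokens and that the two limits apply independently, as this page lists them; some APIs also count output tokens against the context window, so treat the result as optimistic. The function name `fits_limits` is hypothetical.

```python
# Limits as listed on this page; "K" is assumed to mean 1,000 tokens.
MAX_INPUT_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 32_000


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumption: input and output limits are checked independently.
    """
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    print(fits_limits(150_000, 8_000))   # within both limits
    print(fits_limits(150_000, 40_000))  # output request exceeds 32K
```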

Cost/1M Tokens
$15 in / $75 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
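
As a rough illustration of how the per-million-token prices translate into a per-request cost, here is a minimal sketch. It uses only the two prices listed above; the function name `estimate_cost_usd` is hypothetical, and any discounts or surcharges a provider may apply (caching, batching, and so on) are ignored.

```python
# Prices as listed on this page, in USD per 1 million tokens.
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 75.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens:
    # 10,000 * 15 / 1e6 = 0.15 and 2,000 * 75 / 1e6 = 0.15
    print(f"${estimate_cost_usd(10_000, 2_000):.2f}")  # $0.30
```

Because output tokens cost five times as much as input tokens, a long generated answer can dominate the bill even when the prompt is much larger.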

AI Performance Evaluation

Arena Overall Score
1449
±3
As of 2026-04-02
Overall Rank
No.31
50,176 Votes
Arena by Ability
Hard Prompts
1480 ±5, No.23
Expert Knowledge
1483 ±12, No.23
Instruction Following
1459 ±6, No.16
Conversation Memory
1473 ±7, No.19
Creative
1445 ±8, No.16
Coding
1513 ±7, No.15
Math
1443 ±11, No.33
Arena by Occupation
Creative Writing
1445 ±6, No.21
Social Sciences
1471 ±7, No.28
Media
1433 ±7, No.22
Business
1448 ±7, No.30
Healthcare
1480 ±12, No.23
Legal
1461 ±11, No.27
Software
1492 ±5, No.24
Mathematics
1450 ±12, No.31
Reasoning Ability
AA Intelligence Index
42%↑3%
MMLU-Pro
88%↑5%
GPQA Diamond
81%↓1%
HLE
12%↓5%
Math
AA Math Index
80%↑6%
AIME 2025
80%↑6%
Coding Ability
AA Coding Index
37%↑0%
LiveCodeBench
65%↑0%
SciCode
41%↓1%
TerminalBench
34%↑0%
Instruction Following
IFBench
55%↓2%
Hallucination Rate (HHEM)
12%↑1%
Factual Consistency (HHEM)
88%↓1%
Long Context
AA-LCR
66%↑3%
Agentic AI Ability
TAU2
71%↑0%
Speed
Standard Mode
34 tok/sec ↓44
First Output 1.33s
Artificial Analysis
Reasoning Mode
35 tok/sec ↓38
First Output 8.29s
Artificial Analysis
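
The speed figures above can be combined into a rough end-to-end response time. This is a minimal sketch, assuming that "First Output" is the delay before the first token arrives and that tokens then stream at the listed rate without variation; real latency depends on load, prompt length, and how much the model reasons. The names `SpeedProfile` and `estimate_seconds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedProfile:
    """Throughput and first-output delay, as listed on this page."""
    tokens_per_second: float
    first_output_seconds: float


# Values copied from the Speed section.
STANDARD = SpeedProfile(tokens_per_second=34.0, first_output_seconds=1.33)
REASONING = SpeedProfile(tokens_per_second=35.0, first_output_seconds=8.29)


def estimate_seconds(profile: SpeedProfile, output_tokens: int) -> float:
    """Estimate total time: first-output delay plus streaming time.

    Assumption: throughput is constant once output starts.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return profile.first_output_seconds + output_tokens / profile.tokens_per_second


if __name__ == "__main__":
    for name, profile in (("standard", STANDARD), ("reasoning", REASONING)):
        print(f"{name}: {estimate_seconds(profile, 500):.1f} s for 500 tokens")
```

Under these assumptions the two modes stream at nearly the same rate, so the difference in total time comes almost entirely from the longer wait before the first output in reasoning mode.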