ChatGPT

GPT-5

Model ID: gpt-5-2025-08-07

Released: 2025-08-07 · Proprietary Model

API

Overall: No.30

Popularity: No.45

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing and advanced prompt understanding, including user-specified intent such as "think hard about this." Improvements include reduced hallucination and sycophancy, and better performance in coding, writing, and health-related tasks.

Knowledge Cutoff

2024-09-30

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

400K in / 128K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
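As a rough illustration of how these limits might be used, the sketch below checks whether a planned request stays within them. It assumes the two figures above are independent caps on input and output tokens; a provider may instead count both against one shared window, so treat this as a sanity check only. The function name `fits_limits` is hypothetical.

```python
# Limits taken from the figures on this page (assumed to be independent caps).
MAX_INPUT_TOKENS = 400_000
MAX_OUTPUT_TOKENS = 128_000


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both token counts are within the listed limits."""
    return (
        0 <= input_tokens <= MAX_INPUT_TOKENS
        and 0 <= requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    print(fits_limits(350_000, 100_000))  # within both limits
    print(fits_limits(450_000, 1_000))    # input exceeds the listed limit
```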

Cost/1M Tokens

$1.25 in / $10 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
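To make the per-million pricing concrete, here is a minimal sketch that estimates the cost of a single request from the listed rates. The function name `request_cost_usd` and the example token counts are illustrative assumptions; actual billing may include discounts or extra charges that this page does not list.

```python
# Listed rates from this page, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed per-token rates."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 20,000 input tokens and 2,000 output tokens.
    cost = request_cost_usd(20_000, 2_000)
    print(f"${cost:.4f}")  # prints $0.0450
```

At these rates, output tokens cost eight times as much as input tokens, so long responses affect the bill more than long prompts.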

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score

1434 ±5

As of 2026-04-02

Overall Rank

No.45

32,245 Votes

Arena by Ability

Hard Prompts: 1446 ±6, No.52
Expert Knowledge: 1459 ±16, No.39
Instruction Following: 1409 ±7, No.63
Conversation Memory: 1420 ±9, No.64
Creative: 1375 ±10, No.86
Coding: 1468 ±8, No.57
Math: 1433 ±14, No.39

Arena by Occupation

Creative Writing: 1397 ±8, No.67
Social Sciences: 1444 ±9, No.54
Media: 1397 ±8, No.51
Business: 1414 ±9, No.65
Healthcare: 1456 ±15, No.47
Legal: 1454 ±14, No.36
Software: 1453 ±7, No.69
Mathematics: 1442 ±14, No.41

Source: Arena Intelligence

Reasoning Ability

AA Intelligence Index: 22% ↓17%
MMLU-Pro: 82% ↓1%
GPQA Diamond: 69% ↓13%
HLE: 5.8% ↓11%

Math

AA Math Index: 48% ↓26%
AIME 2025: 48% ↓26%

Coding Ability

AA Coding Index: 21% ↓15%
LiveCodeBench: 54% ↓11%
SciCode: 38% ↓4%
TerminalBench: 13% ↓21%

Instruction Following

IFBench: 45% ↓12%
Hallucination Rate (HHEM): 15% ↑4%
Factual Consistency (HHEM): 85% ↓4%

Long Context

AA-LCR: 64% ↑0%

Agentic AI Ability

TAU2: 0.0% ↓71%

Speed

Standard Mode: 77 tok/sec ↓1, first output 1.03s

Artificial Analysis

Reasoning Mode: 85 tok/sec ↑13, first output 41.72s


Source: Artificial Analysis, Vectara HHEM
