Claude Opus 4.5 Outperforms Rivals in Tetris Coding Challenge

• Claude Opus 4.5 secured the top coding rank by generating a fully functional Tetris game from a single prompt.
• GPT-5.2 Pro proved less cost-efficient: despite its premium pricing, it made unexpected errors in basic coding tasks.
• China's DeepSeek V3.2 offered extreme affordability but lacked the refinement necessary for complex, single-shot development.

As AI technology advances, developers are seeking empirical evidence to determine which models offer the best practical value. A recent experiment compared Anthropic's Claude Opus 4.5, OpenAI's GPT-5.2 Pro, and China's DeepSeek V3.2 by challenging them to program a functional Tetris game. Claude Opus 4.5 emerged as the clear winner, delivering a polished and playable game within just two minutes. Its success highlights an ability to interpret complex instructions and execute them with sophisticated visuals and fluid mechanics from a single prompt.

In contrast, OpenAI's premium GPT-5.2 Pro model delivered underwhelming results despite its significantly higher usage costs. The model struggled with basic rendering and block movement, producing code with multiple bugs that required manual fixes. While GPT-5.2 Pro remains a powerful tool for high-level logical reasoning, its cost-efficiency in general software development appears lower than that of Claude Opus 4.5. This gap suggests that a higher price does not guarantee better performance in specialized technical domains such as game development.

DeepSeek V3.2 represented the budget-friendly alternative, though its output proved unplayable due to critical errors like disappearing blocks. While its low per-token cost (tokens are the units used to calculate AI usage fees) makes it attractive for iterative tasks, it lacks the precision required for high-stakes, one-shot coding. Developers must carefully weigh model strengths against billing expenses to optimize their specific workflows. This experiment underscores that the most effective AI tool is determined by the specific project requirements rather than by brand prestige or cost alone.
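
One rough way to reason about this trade-off is to compare the expected cost of reaching one working result rather than the price of a single request. The Python sketch below is illustrative only and is not part of the experiment: the names `ModelProfile`, `cost_per_attempt`, and `expected_cost_per_working_result` are hypothetical, the prices and success rates are placeholder values rather than real figures for any model, and it assumes that attempts are independent, so that the expected number of attempts is 1 divided by the success rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical pricing and reliability figures for one model."""
    name: str
    usd_per_million_input_tokens: float
    usd_per_million_output_tokens: float
    success_rate: float  # assumed fraction of attempts that yield working code


def cost_per_attempt(model: ModelProfile, input_tokens: int, output_tokens: int) -> float:
    """Return the billed cost in USD of a single request."""
    return (
        input_tokens * model.usd_per_million_input_tokens
        + output_tokens * model.usd_per_million_output_tokens
    ) / 1_000_000


def expected_cost_per_working_result(
    model: ModelProfile, input_tokens: int, output_tokens: int
) -> float:
    """Return the expected USD spent before one attempt succeeds.

    Assumes independent attempts, so the expected number of attempts
    is 1 / success_rate (the mean of a geometric distribution).
    """
    if not 0 < model.success_rate <= 1:
        raise ValueError("success_rate must be in the interval (0, 1]")
    return cost_per_attempt(model, input_tokens, output_tokens) / model.success_rate


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not measured values.
    candidates = [
        ModelProfile("pricey-but-reliable", 10.0, 30.0, 0.9),
        ModelProfile("cheap-but-flaky", 1.0, 3.0, 0.2),
    ]
    for model in candidates:
        cost = expected_cost_per_working_result(model, 1_000, 20_000)
        print(f"{model.name}: ${cost:.4f} expected per working result")
```

The sketch counts only billed tokens. It leaves out the developer time spent diagnosing and fixing broken output, which for one-shot tasks may outweigh the token bill.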
