Google DeepMind Updates Gemini 3 Deep Think for Science

• Google DeepMind upgrades Gemini 3 Deep Think to solve complex scientific and engineering challenges.
• The model scores 84.6% on ARC-AGI-2 and reaches a 3455 Elo rating on the Codeforces competitive programming benchmark.
• The specialized reasoning mode is now available via the Gemini API for researchers and enterprise partners.

Google DeepMind has unveiled a significant upgrade to Gemini 3 Deep Think, a specialized version of its flagship model designed for high-level reasoning. Unlike general-purpose assistants, this iteration is specifically tuned for the messy reality of scientific research, where data is often incomplete and solutions aren't always binary. By prioritizing mathematical and algorithmic rigor, the model aims to bridge the gap between theoretical exploration and practical engineering applications.

The benchmark results are striking. In competitive programming, the model reached an Elo rating of 3455 on Codeforces, placing it among the world's elite human coders. It also set a new standard on Humanity’s Last Exam, a test designed to be difficult even for experts, scoring 48.4% without external tools. Perhaps most impressively, it demonstrated its utility by identifying a subtle logical flaw in a technical physics paper that had escaped human peer reviewers during the standard publication process.

Beyond academic benchmarks, the update emphasizes real-world utility in materials science and mechanical engineering. For instance, researchers at Duke University used the model to optimize semiconductor fabrication, while other engineering teams used it to convert 2D sketches into complex, 3D-printable models. This update signals a shift from AI as a conversational partner to AI as a rigorous collaborator in the lab. The model is now accessible to enterprise partners via the Gemini API and to individual Google AI Ultra subscribers.
