# Introducing Max

*Arena Team, 04 Feb 2026*

Today we are releasing Max, Arena's model router powered by our community’s 5+ million real-world votes. Max acts as an intelligent orchestrator: it routes each user prompt to the most capable model for that specific prompt.
- Arena launches Max, an intelligent router that achieves the #1 rank on the Arena Overall leaderboard
- The system leverages 5M+ human votes to direct each prompt to the most capable specialized model
- A latency-aware variant, arcstride, reduces time-to-first-token by 16 seconds while maintaining top-tier performance
Arena Intelligence has unveiled Max, a sophisticated model router designed to navigate the increasingly fragmented landscape of artificial intelligence. Instead of relying on a single large language model, Max functions as an intelligent traffic controller, or orchestrator, analyzing each incoming user request to determine which specific model, be it from Google, Anthropic, or xAI, is best suited to the task. This approach leverages the distinct strengths of different AI architectures, ensuring that a coding challenge goes to a specialized model while a creative writing prompt is handled by a more imaginative counterpart.
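The orchestrator pattern described above can be sketched in a few lines. This is a toy illustration, not Arena's implementation: the keyword heuristics and model names (`code-specialist-model` and so on) are hypothetical stand-ins for the learned routing policy trained on community preference data.

```python
# Hypothetical category-to-model table; the real routing policy is
# learned from millions of human votes, not a static lookup.
ROUTING_TABLE = {
    "coding": "code-specialist-model",
    "creative": "creative-specialist-model",
    "general": "general-purpose-model",
}


def classify(prompt: str) -> str:
    """Toy classifier: keyword heuristics stand in for a learned model."""
    lowered = prompt.lower()
    if any(k in lowered for k in ("bug", "compile", "python", "stack trace")):
        return "coding"
    if any(k in lowered for k in ("story", "poem", "lyrics")):
        return "creative"
    return "general"


def route(prompt: str) -> str:
    """Return the name of the model this prompt would be dispatched to."""
    return ROUTING_TABLE[classify(prompt)]
```

The key design point is the separation of concerns: classification decides *what kind* of prompt this is, while the routing table decides *which backend* handles that kind, so either side can be upgraded independently.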
The system’s effectiveness is rooted in over five million real-world human preferences gathered through the Arena community. In recent evaluations, the base version of Max, codenamed theta-hat, secured the top position on the Arena Overall leaderboard, outperforming industry giants like Gemini 3 Pro and Claude 4.5. By dynamically switching between models, Max creates a unified interface that effectively harnesses the collective intelligence of the entire frontier model ecosystem.
Addressing the common trade-off between power and speed, Arena also introduced a latency-aware variant named arcstride. This version is specifically optimized to reduce the delay before the first piece of text appears (time to first token, or TTFT). Impressively, arcstride shaved 16 seconds of latency off its nearest competitors while remaining on the Pareto frontier, the sweet spot where a model achieves the highest possible performance for a given level of speed. This ensures users receive expert-level answers without the frustrating waits often associated with massive thinking models.