EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

• Meituan’s EvoCUA achieves a 56.7% success rate on OSWorld, setting a new open-source record for AI agent performance
• Autonomous synthetic data generation and asynchronous sandbox rollouts overcome existing data-scaling bottlenecks
• An evolutionary learning strategy outperforms leading closed-source models while maintaining high parameter efficiency

The pursuit of agents capable of navigating a computer interface like a human has long been hindered by "data scarcity": the lack of high-quality examples showing how to solve complex, multi-step digital tasks. Meituan’s LongCat Team has introduced EvoCUA, an open-source AI agent that breaks this performance plateau by shifting from passive learning to a self-sustaining evolutionary cycle.

At the heart of EvoCUA is a verifiable synthesis engine that autonomously creates diverse digital tasks together with the validators that check them. This lets the system generate its own training ground rather than rely on limited human-labeled datasets. To process this volume of experience, the researchers built infrastructure capable of running tens of thousands of simultaneous sandbox simulations, in which the agent "practices" navigating operating systems and applications such as Excel or VSCode.
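To make the idea of pairing tasks with validators and running many attempts concurrently more concrete, here is a minimal Python sketch. It is not EvoCUA's code: the names `SyntheticTask`, `rollout`, and `run_rollouts` are hypothetical, the "sandbox" is just a dictionary, and the policy is a stand-in callable. A real system would drive actual virtual machines at far larger scale.

```python
import asyncio
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class SyntheticTask:
    """Hypothetical container: an instruction plus a programmatic validator."""
    instruction: str
    validator: Callable[[dict], bool]  # inspects the final sandbox state


async def rollout(task, policy, seed):
    """Run one attempt in a stand-in sandbox (a plain dict) and validate it."""
    state = {}
    await asyncio.sleep(0)  # placeholder for slow, real sandbox I/O
    policy(task.instruction, state, random.Random(seed))
    return task.validator(state)


async def run_rollouts(tasks, policy, attempts, max_concurrent):
    """Run `attempts` rollouts per task, at most `max_concurrent` at a time."""
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(task, seed):
        async with sem:
            return await rollout(task, policy, seed)

    jobs = [bounded(task, seed) for task in tasks for seed in range(attempts)]
    flat = await asyncio.gather(*jobs)  # results come back in submission order
    # Regroup the flat result list into one list of outcomes per task.
    return [flat[i * attempts:(i + 1) * attempts] for i in range(len(tasks))]
```

Because each outcome comes from a validator rather than a human label, results like these can be collected without manual annotation, which is the property the article attributes to the synthesis engine.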

What makes EvoCUA distinctive is its iterative, evolving learning strategy. Instead of just copying successful actions, the model analyzes its own failures through error analysis and self-correction. By identifying where its capabilities currently end, it transforms unsuccessful attempts into rich supervision. This approach propelled the 32B version of EvoCUA to a 56.7% success rate on the OSWorld benchmark, surpassing leading closed-source competitors and establishing a new frontier for open-source multimodal AI.
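The article does not spell out how the model finds "where its capabilities currently end". One possible reading, shown here purely as an assumption, is to attempt each task several times and look for tasks that succeed only some of the time. The function name `split_by_outcome` and the three bucket names are hypothetical.

```python
from collections import defaultdict


def split_by_outcome(attempts):
    """Group tasks by how often repeated attempts succeeded.

    `attempts` is an iterable of (task_id, succeeded) pairs.
    """
    stats = defaultdict(lambda: [0, 0])  # task_id -> [successes, total]
    for task_id, succeeded in attempts:
        stats[task_id][0] += int(succeeded)
        stats[task_id][1] += 1

    buckets = {"mastered": [], "boundary": [], "unsolved": []}
    for task_id, (wins, total) in stats.items():
        if wins == total:
            buckets["mastered"].append(task_id)
        elif wins == 0:
            buckets["unsolved"].append(task_id)
        else:
            # Mixed outcomes: failed attempts can be compared with successful ones.
            buckets["boundary"].append(task_id)
    return buckets
```

Under this assumption, the "boundary" tasks are the ones where a failed attempt sits next to a successful attempt at the same task, which gives error analysis something concrete to contrast.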
