Ant Group Unveils UI-Venus-1.5 GUI Agent
- Ant Group releases UI-Venus-1.5, a unified GUI agent for mobile and web automation
- Model family features dense 2B/8B versions and a 30B Mixture-of-Experts variant
- State-of-the-art performance on AndroidWorld via multi-environment Model Merging techniques
Ant Group has introduced UI-Venus-1.5, an agentic AI system designed to navigate and interact with graphical interfaces much as a human user would. Many GUI agents struggle with the "reality gap" between testing environments and day-to-day usage; UI-Venus-1.5 instead handles grounding, mobile, and web tasks within a single end-to-end framework. This consolidation removes the need for expensive multi-agent setups, with the aim of a faster and more reliable digital assistant.
The technical foundation of UI-Venus-1.5 rests on three pillars. First, a mid-training phase processed 10 billion tokens across 30 datasets to teach the model graphical semantics and icon recognition. Second, the researchers applied reinforcement learning with full-trajectory rollouts, so the model learns from the entire sequence of its actions in a complex navigation task rather than from isolated steps. Finally, the team used model merging to blend models specialized for different environments (web, mobile, and visual grounding) into one cohesive checkpoint.
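To make the model merging idea concrete, here is a minimal sketch of its simplest form: a weighted average of parameters from several specialized checkpoints. The article does not say which merging method the team used, so the averaging scheme, the `merge_checkpoints` helper, the parameter name `layer.weight`, and the toy values are all assumptions for illustration.

```python
from typing import Dict, List, Sequence

# A toy "checkpoint": parameter name -> flat list of weights (hypothetical layout).
Checkpoint = Dict[str, List[float]]


def merge_checkpoints(checkpoints: Sequence[Checkpoint],
                      weights: Sequence[float]) -> Checkpoint:
    """Blend checkpoints with identical parameter names by weighted averaging."""
    if not checkpoints or len(checkpoints) != len(weights):
        raise ValueError("need one merge weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("merge weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    names = set(checkpoints[0])
    if any(set(ckpt) != names for ckpt in checkpoints):
        raise ValueError("checkpoints must share the same parameter names")

    merged: Checkpoint = {}
    for name in checkpoints[0]:
        size = len(checkpoints[0][name])
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(norm, checkpoints))
            for i in range(size)
        ]
    return merged


# Toy specialists standing in for web, mobile, and grounding models (assumed values).
web = {"layer.weight": [1.0, 2.0]}
mobile = {"layer.weight": [3.0, 4.0]}
grounding = {"layer.weight": [5.0, 6.0]}

merged = merge_checkpoints([web, mobile, grounding], [1.0, 1.0, 1.0])
print(merged)
```

With equal weights, each merged parameter is the plain mean of the three specialists. Unequal weights would bias the result toward one environment, which is one way a team might trade off web, mobile, and grounding performance.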
The model family spans a 30B Mixture-of-Experts (MoE) variant, which activates only a subset of the network for each input to save computation, and smaller 2B and 8B dense versions. Reported benchmarks show state-of-the-art results, particularly on AndroidWorld and ScreenSpot-Pro, where it sets new performance records. It also supports more than 40 popular Chinese applications, helping bridge the gap between research and daily utility for the people who use those apps.
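The sparse activation behind a Mixture-of-Experts layer can be sketched as top-k routing: a router scores every expert, only the k highest-scoring experts run, and their outputs are mixed using softmax gate weights. The article does not describe UI-Venus-1.5's actual router, so the `moe_forward` function, the scalar experts, the logits, and `top_k=2` below are hypothetical choices for illustration.

```python
import math
from typing import Callable, List, Sequence

calls: List[str] = []  # records which experts actually ran


def make_expert(name: str, fn: Callable[[float], float]) -> Callable[[float], float]:
    """Wrap a function so each invocation is logged under the expert's name."""
    def expert(x: float) -> float:
        calls.append(name)
        return fn(x)
    return expert


def softmax(xs: Sequence[float]) -> List[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def moe_forward(x: float,
                router_logits: Sequence[float],
                experts: Sequence[Callable[[float], float]],
                top_k: int = 2) -> float:
    """Run only the top_k experts and mix their outputs by gate weight."""
    ranked = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([router_logits[i] for i in chosen])
    return sum(g * experts[i](x) for g, i in zip(gates, chosen))


experts = [
    make_expert("plus_one", lambda x: x + 1),
    make_expert("double", lambda x: 2 * x),
    make_expert("negate", lambda x: -x),
    make_expert("square", lambda x: x * x),
]

# Assumed router scores: "double" and "square" tie for the highest score.
out = moe_forward(3.0, [0.0, 2.0, -1.0, 2.0], experts, top_k=2)
print(out, calls)
```

Only two of the four experts are invoked for this input, which is the source of the compute savings: total parameter count grows with the number of experts, while per-input cost grows only with `top_k`.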