EgoPush Enables Robots to Rearrange Cluttered Objects
- EgoPush framework allows mobile robots to rearrange cluttered objects using only a single egocentric camera
- System uses object-centric latent spaces to encode spatial relationships without needing a global coordinate map
- Researchers demonstrated zero-shot sim-to-real transfer, moving the policy directly from simulation to physical robots
Rearranging objects in a messy room is a trivial task for humans but a nightmare for robots, which typically rely on complex global maps that break down in dynamic settings.
Researchers have introduced EgoPush, an end-to-end framework that skips the need for absolute coordinates. Instead, it uses an "object-centric latent space," which essentially allows the robot to understand how objects relate to each other rather than where they are in a fixed room layout. This relative understanding makes the robot much more adaptable to moving obstacles or shifting environments.
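To make that map-free, relative representation concrete, here is a minimal Python sketch of the idea: object positions seen by the egocentric camera are encoded relative to the robot and to one another rather than in a fixed world frame. The function name and hand-crafted features are illustrative assumptions only; EgoPush learns its object-centric latent space with a neural encoder rather than computing it like this.

```python
import numpy as np

# Illustrative sketch (not the authors' implementation): encode objects
# relative to the robot and to each other, with no global map.

def object_centric_features(detections):
    """detections: list of (x, y) object positions in the robot's camera frame."""
    objs = np.asarray(detections, dtype=np.float32)   # shape (N, 2), robot-centric
    # Pairwise offsets capture how objects relate to each other; they are
    # unchanged if the whole scene shifts relative to the room.
    pairwise = objs[:, None, :] - objs[None, :, :]    # shape (N, N, 2)
    return np.concatenate([objs.ravel(), pairwise.ravel()])

# Example: three clutter objects detected by the egocentric camera.
features = object_centric_features([(0.6, 0.1), (0.9, -0.2), (1.2, 0.3)])
print(features.shape)
```

Because every quantity is expressed in the robot's own frame, the encoding stays valid when obstacles move or the environment changes, which is the adaptability the relative representation buys.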
The training process involves a "teacher" model with full knowledge that distills its wisdom into a "student" model that only sees what the robot’s camera sees. To handle long, complex tasks, the team used stage-local rewards—breaking a big job into smaller, manageable goals that provide feedback as each step is completed.
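The sketch below illustrates both training ideas under stated assumptions: a privileged "teacher" supervising a camera-only "student," and a reward that only scores progress on the current stage of a longer task. The network sizes, losses, thresholds, and function names are hypothetical placeholders, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Hypothetical teacher-student setup: the teacher sees the full simulator
# state; the student sees only features derived from the egocentric camera.
teacher = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def distill_step(full_state, camera_features):
    """Student imitates the privileged teacher's action on the same scene."""
    with torch.no_grad():
        target_action = teacher(full_state)
    loss = nn.functional.mse_loss(student(camera_features), target_action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def stage_local_reward(stage_goals, current_stage, obj_position):
    """Reward progress toward the current stage's goal only, with a bonus
    when that stage completes and the task advances to the next one."""
    dist = torch.norm(obj_position - stage_goals[current_stage])
    done = bool(dist < 0.05)
    reward = -dist.item() + (1.0 if done else 0.0)
    next_stage = min(current_stage + 1, len(stage_goals) - 1) if done else current_stage
    return reward, next_stage
```

Scoring each stage separately gives the learner frequent feedback on long rearrangement tasks instead of a single sparse reward at the very end.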
Most impressively, the system achieved zero-shot sim-to-real transfer. This means the AI was trained entirely in a digital simulation and then successfully operated a physical mobile robot in the real world without any additional tuning. This leap bridges a major gap in making useful home and warehouse robots a reality.