Google DeepMind has unveiled Gemini Robotics On-Device, a vision-language-action (VLA) model designed to run locally on robots without an internet connection. Running on the robot itself avoids network latency and lets the system keep working where connectivity is intermittent or unavailable, which matters in applications ranging from industrial tasks to everyday assistance.
Gemini Robotics On-Device: A Leap Forward
Gemini Robotics On-Device builds on its predecessor, the cloud-based Gemini Robotics model that Google DeepMind released in March 2025. The new version performs tasks and controls a robot's movements directly on the device, so it does not depend on a constant internet connection. Developers can control the model with natural language prompts and fine-tune it to suit their own needs.
Key Takeaways
Local Operation: The model runs directly on robots, removing reliance on internet connectivity.
Enhanced Performance: Google says that in its benchmarks the model performs close to the cloud-based Gemini Robotics model and outperforms other on-device models.
Natural Language Control: Developers can direct and adapt robot tasks using natural language prompts.
Versatile Application: Demonstrated capabilities include unzipping bags, folding clothes, and industrial assembly.
Adaptability: Initially trained for ALOHA robots, the model has been adapted to a bi-arm Franka FR3 robot and Apptronik's Apollo humanoid robot.
Developer Support: Google DeepMind is releasing a Gemini Robotics SDK to facilitate training and development.
Real-World Demonstrations and Adaptability
Google showcased the model in demonstrations that included robots unzipping bags and folding clothes. Although the model was initially trained for ALOHA robots, Google later adapted it to control a bi-arm Franka FR3 robot and Apptronik's Apollo humanoid robot. According to Google, the bi-arm Franka FR3 handled scenarios and objects it had not seen before, such as assembly on an industrial belt.
Empowering Developers with the Gemini Robotics SDK
Alongside the model, Google DeepMind is releasing a Gemini Robotics SDK. Google says developers can use it to train robots on new tasks by showing them 50 to 100 demonstrations of the task in the MuJoCo physics simulator. Teaching by demonstration means a new behaviour can be added from a small set of examples rather than a large task-specific dataset.
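The demonstration-based workflow described above can be pictured as collecting a set of recorded episodes, each a sequence of observation and action pairs, and checking that enough of them exist before fine-tuning. The sketch below is illustrative only: it does not use the Gemini Robotics SDK or MuJoCo, and every name in it (Demonstration, DemonstrationSet, scripted_reach_demo) is a hypothetical helper invented for this example. A toy one-dimensional "reach" controller stands in for a simulated robot, and the lower bound of 50 is taken from the figure quoted above.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

# Hypothetical types for illustration; NOT part of the Gemini Robotics SDK.
Step = Tuple[Tuple[float, ...], Tuple[float, ...]]  # (observation, action)


@dataclass
class Demonstration:
    """One recorded episode of a task."""
    task: str  # natural-language description of the task
    steps: List[Step] = field(default_factory=list)

    def record(self, observation: Sequence[float], action: Sequence[float]) -> None:
        self.steps.append((tuple(observation), tuple(action)))


@dataclass
class DemonstrationSet:
    """Collects demonstrations of a single task."""
    task: str
    min_demos: int = 50  # lower end of the 50 to 100 range quoted in the article
    demos: List[Demonstration] = field(default_factory=list)

    def add(self, demo: Demonstration) -> None:
        if demo.task != self.task:
            raise ValueError("demonstration belongs to a different task")
        if not demo.steps:
            raise ValueError("demonstration contains no steps")
        self.demos.append(demo)

    def is_ready(self) -> bool:
        """True once at least min_demos demonstrations have been collected."""
        return len(self.demos) >= self.min_demos


def scripted_reach_demo(task: str, start: float, target: float,
                        n_steps: int = 10) -> Demonstration:
    """Toy stand-in for a simulated episode: move halfway to the target each step."""
    demo = Demonstration(task)
    position = start
    for _ in range(n_steps):
        action = (target - position) * 0.5
        demo.record([position, target], [action])
        position += action
    return demo


dataset = DemonstrationSet(task="move gripper to target")
for i in range(60):
    dataset.add(scripted_reach_demo(dataset.task, start=float(i), target=100.0))

print(len(dataset.demos), dataset.is_ready())
```

In a real setup the scripted controller would be replaced by episodes recorded in a simulator, and the collected set would be handed to whatever fine-tuning entry point the SDK provides.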

Broader Impact on Robotics and AI
Google's move into on-device AI for robotics reflects a growing trend within the industry. Nvidia is developing a platform for humanoid foundation models, Hugging Face is contributing open models and datasets, and Korean startup RLWRLD is working on foundation models for robots. Together these efforts point to a collective push towards robots that operate with greater independence in both daily life and industrial processes.