Google DeepMind releases Gemini Robotics-ER 1.6, improving robot planning and perception accuracy
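Source: the-decoder.com. Google DeepMind has introduced Gemini Robotics-ER 1.6, which markedly improves the accuracy of robot planning and execution. The new version strengthens environmental perception and understanding and adds the ability to recognize and read measuring instruments, letting robots make more precise operational decisions and control their actions in complex task scenarios.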
Google DeepMind's Gemini Robotics-ER 1.6 gives robots a sharper brain for planning and perception
Google DeepMind has released Gemini Robotics-ER 1.6, an upgraded model for embodied reasoning in robots. It acts as a high-level thinking layer that helps robots understand their surroundings and plan tasks on their own, tapping tools like Google Search or vision-language-action models when needed. DeepMind says the new version beats both Gemini Robotics-ER 1.5 and Gemini 3.0 Flash at pointing to objects, counting, and recognizing successful task execution.
Reading instruments like pressure gauges and sight glasses, a capability developed with Boston Dynamics, has also seen a major boost. The model pairs agentic image processing with code execution: it zooms in to catch small display details, uses pointing functions and code to calculate proportions and scale distances, then applies world knowledge to interpret the reading. Boston Dynamics' Spot robot reportedly uses the feature for system inspections.
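The proportion step described above can be illustrated with a small sketch. It is not DeepMind's implementation: it only assumes that a pointing step has already returned pixel coordinates for the dial center, the needle tip, and the minimum and maximum scale marks of a round gauge. The function names, the coordinates, and the 0 to 10 bar range are all hypothetical.

```python
import math


def clockwise_angle(center, point):
    """Angle of `point` around `center` in degrees, in the range [0, 360).

    Image coordinates have y growing downward, so atan2 on (dy, dx)
    yields an angle that increases clockwise as seen on screen.
    """
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def read_gauge(center, needle_tip, min_mark, max_mark, min_value, max_value):
    """Estimate a dial reading by linear interpolation along the scale arc.

    Assumes (hypothetically) a linear scale that runs clockwise from
    `min_mark` to `max_mark`.
    """
    start = clockwise_angle(center, min_mark)
    full_sweep = (clockwise_angle(center, max_mark) - start) % 360.0
    needle_sweep = (clockwise_angle(center, needle_tip) - start) % 360.0
    if full_sweep == 0.0:
        raise ValueError("min and max marks lie at the same angle")
    if needle_sweep > full_sweep:
        # The needle points into the gap between the max and min marks.
        raise ValueError("needle lies outside the scale arc")
    fraction = needle_sweep / full_sweep
    return min_value + fraction * (max_value - min_value)


if __name__ == "__main__":
    # Hypothetical pixel coordinates for a gauge with a 270 degree scale.
    reading = read_gauge(
        center=(100, 100),
        needle_tip=(100, 40),   # needle pointing straight up
        min_mark=(50, 150),     # lower left
        max_mark=(150, 150),    # lower right
        min_value=0.0,
        max_value=10.0,
    )
    print(f"Estimated reading: {reading:.2f} bar")
```

A real gauge may have a nonlinear scale, perspective distortion, or parallax, which is presumably where the zooming and world-knowledge steps mentioned in the article come in; this sketch covers only the geometric interpolation.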