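Meta Harnesses is an automated harness-generation technique proposed by Stanford and the DSPy authors. It automatically generates single-file Python programs (harnesses) that optimize the prompts, retrieval, and orchestration logic for a specific task, enabling continuous iteration without human intervention. Compared with Autoresearch, it works at a higher level of abstraction and suits domain-specific tasks with verifiable results (such as mathematical reasoning and programming). It can automatically classify problems and devise a different strategy for each class, but it is limited on tasks that require a unified methodology.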
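To make the "classify problems, then apply a different strategy per class" idea concrete, here is a minimal sketch of what a single-file harness could look like. Everything in it is hypothetical: the `classify` keyword router, the three strategy functions, and the prompt strings are illustrative stand-ins, not the code that Meta Harnesses generates.

```python
from typing import Callable, Dict

# Hypothetical strategies. A real harness would call a model or a retriever
# here; these stubs only build a prompt string so the sketch stays runnable.
def solve_math(problem: str) -> str:
    return f"[step-by-step reasoning prompt] {problem}"

def solve_code(problem: str) -> str:
    return f"[write-then-test prompt] {problem}"

def solve_default(problem: str) -> str:
    return f"[generic prompt] {problem}"

STRATEGIES: Dict[str, Callable[[str], str]] = {
    "math": solve_math,
    "code": solve_code,
    "default": solve_default,
}

def classify(problem: str) -> str:
    """Toy keyword router (an assumption, not the framework's classifier)."""
    text = problem.lower()
    if any(key in text for key in ("prove", "integral", "equation")):
        return "math"
    if any(key in text for key in ("function", "bug", "compile")):
        return "code"
    return "default"

def run_harness(problem: str) -> str:
    """Route the problem to the strategy chosen for its class."""
    return STRATEGIES[classify(problem)](problem)
```

In this sketch the routing table is the part an automated search would rewrite between iterations, and a verifiable task score would decide which rewrite to keep.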
We are excited to share a new paper solving three further problems due to Erdős; in each case the solution was found by ...
Our recent findings on World Action Models (WAMs): the core advantage of WAMs is not test-time "imagination" of futures,...
Introducing EgoVerse: an ecosystem for robot learning from egocentric human data. Built and tested by 4 research labs + ...
Happy to share new progress in AI for Maths @GoogleDeepMind . In extremal combinatorics, AlphaEvolve has helped establis...
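Research by Google Research, the NHS, and Imperial College, published in Nature Cancer, shows that an AI system used in breast cancer screening can detect 25% of the interval cancers missed by conventional methods. The technology significantly improves sensitivity (the true-positive detection rate) without a noticeable effect on specificity (that is, on false positives). It can also reduce the screening workload for clinical staff and speed up the return of diagnostic results to doctors and patients.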
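For readers less familiar with the two screening metrics, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The counts are made-up illustrative numbers, not data from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual cancers that the screen flags (true-positive rate)."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Share of healthy cases that the screen clears (true-negative rate)."""
    return true_neg / (true_neg + false_pos)

# Illustrative counts only (assumed, not from the paper).
baseline_sens = sensitivity(true_pos=80, false_neg=20)
assisted_sens = sensitivity(true_pos=85, false_neg=15)
baseline_spec = specificity(true_neg=900, false_pos=100)
assisted_spec = specificity(true_neg=900, false_pos=100)
```

In this toy example, sensitivity rises while specificity is unchanged, which is the pattern the summary describes: more cancers caught without more false alarms.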
Breast cancer affects one in every eight women in the UK, and early detection is crucial. ⚕️ Our latest research in @Nat...
Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, ...
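The research team proposes EgoScale, which pretrains GR00T N1.5 on 20,000 hours of egocentric human video. With only 4 hours of robot data, the model masters high-dexterity tasks such as assembling a model car and operating a syringe, a 54% improvement over training from scratch. The study finds a log-linear scaling relationship between the amount of human video and the action-prediction loss (R²=0.998). The method exploits the kinematic similarity between a 22-DoF robot hand and the human hand, so motions can be retargeted without complex transfer algorithms. The policy also transfers across hardware to the Unitree G1 (7-DoF) with a performance gain of more than 30%, and it can learn a new task from a single demonstration.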
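A log-linear scaling relationship means the loss is approximately a straight line in the logarithm of the data volume. The sketch below fits such a line by ordinary least squares and reports R². The data points are synthetic, generated from assumed coefficients purely for illustration; they are not the EgoScale measurements.

```python
import math

def fit_log_linear(hours, losses):
    """Fit loss = intercept + slope * ln(hours); return (intercept, slope, r2)."""
    xs = [math.log(h) for h in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(losses) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, losses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, losses))
    ss_tot = sum((y - mean_y) ** 2 for y in losses)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Synthetic, noise-free points from assumed coefficients (2.0 and -0.1).
hours = [1_000, 2_000, 5_000, 10_000, 20_000]
losses = [2.0 - 0.1 * math.log(h) for h in hours]
intercept, slope, r2 = fit_log_linear(hours, losses)
```

With real measurements the fit would not be exact. An R² close to 1, such as the reported 0.998, indicates that nearly all the variation in loss is explained by the logarithm of the video volume.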
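Jim Fan's (@DrJimFan) team released DreamZero, the first World Action Model (WAM) built on a world-model backbone. The model breaks from the conventional Vision-Language-Action paradigm: a pixel-level world model gives it zero-shot, open-world prompting, so it can perform new tasks it was never trained on. The study finds that WAMs depend on diverse data rather than repeated demonstrations, and that pixels serve as a universal bridge across embodiments, enabling robot2robot and human2robot knowledge transfer. Adapting to entirely new hardware takes only 55 trajectories (about 30 minutes of teleoperation), which supports the feasibility of world models as the next-generation foundation for Physical AI.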
What if your video generator could refine itself at inference time? ❌No new models. ❌No retraining. ❌No external verifie...
Last October, we introduced Representation Autoencoders (RAE), showing that training diffusion on frozen semantic repres...
‼️ Representations matter for generation! But turns out our understanding of how representations help generation was wr...
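The H* project moves beyond the limitation of conventional MLLMs, which process a single 2D image, by using panoramic images as the environment, so the model can actively observe and reason in real 360-degree space. Compared with the local visual tools of projects such as V*, H* adopts an "embodied" paradigm that gives the model a freedom of viewpoint similar to a human neck, substantially expanding its action space. It supports visual search and spatial reasoning in complex scenes such as subway stations and shopping malls, marking a paradigm shift from passive reception to active exploration.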
🤔Visual-spatial reasoning requires a shift from a disembodied, passive paradigm to an embodied, active one: 🤖Grounding...
Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL wi...
Today we describe how we leverage AlphaEvolve, a @GoogleDeepMind system for iteratively evolving code, to morph snippets...
Efficient training of neural networks is difficult. Our second Connectionism post introduces Modular Manifolds, a theore...
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is "Defeating Nondetermin...