# Connect the Dots：通过强化学习训练大语言模型实现跨域泛化的长期生命周期智能体

- 来源：HuggingFace Daily Papers（社区热门论文）
- 发布时间：2026-06-18 08:00
- AIHOT 分数：46
- AIHOT 链接：https://aihot.virxact.com/items/cmqq2kjqi060islp5mee0js28
- 原文链接：https://arxiv.org/abs/2606.20002

## AI 摘要

Connect the Dots（CoD）是一个训练大语言模型实现长期生命周期智能体的通用框架。它让LLM在部署后持续探索环境、从自身经验中学习并迭代更新上下文，从而在后续任务中表现更优。框架包括端到端强化学习训练算法与基础设施，采用GRPO风格RL和细粒度信用分配。实验表明，端到端RL训练有效，且激发的元能力具备训练域内、跨域以及从CoD到Ralph-loop设定的分布外泛化潜力。实现已开源。

## 正文

This work presents a general framework for training large language models (LLMs) to "Connect the Dots" (CoD), a meta-capability required by long-lifecycle agents: as an LLM-based AI agent gets deployed in an environment, it solves a long sequence of tasks while continuously exploring the environment, learning from its own experiences, and iteratively self-updating its context about the environment, thereby achieving progressively better performance on future tasks conditioned on the updated context. Major components of the CoD framework include: (1) algorithm design and infrastructure for end-to-end reinforcement learning (RL) with long rollout sequences interleaving solve-task and update-context episodes; (2) tasks and environments for incentivizing and eliciting the targeted meta-capability in LLMs during training, as well as for faithfully measuring progress during evaluation. We present proof-of-concept implementations of the CoD framework, including a GRPO-style RL algorithm with fine-grained credit assignment, as well as tasks and environments tailored to the targeted meta-capability (rather than domain-specific LLM capabilities or standard task-by-task RL). Empirical results validate the efficacy of end-to-end RL training in the CoD setting, and demonstrate the potential for out-of-distribution generalization -- within the training domains, across different domains, and from CoD to Ralph-loop settings -- of the elicited meta-capability. Our investigation of CoD connects several lines of prior works, and opens up new opportunities for advancing LLMs and AI agents. To facilitate further research and applications, we release our implementations at https://github.com/agentscope-ai/Trinity-RFT/tree/research/cod/examples/research_cod.