# Collaborative Parallel Thinking: A Framework for Efficient Test-Time Scaling

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-26 08:00
- AIHOT score: 59
- AIHOT link: https://aihot.virxact.com/items/cmpnqh7000zw1sl01h6nl7ntg
- Original link: https://arxiv.org/abs/2605.27030

## AI Summary

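To address the redundant exploration caused by information isolation between branches in parallel test-time scaling (TTS) for large language models, the study proposes the Collaborative Parallel Thinking (CPT) framework. The framework requires no training and shares intermediate discoveries across parallel branches at inference time: it extracts compact information from each branch, maintains a deduplicated query-level information pool, and broadcasts that information through the input context so that later branches can reuse existing discoveries. Experiments on the HMMT and AIME benchmarks show that, across budgets and model scales, CPT establishes a better accuracy-latency Pareto frontier than strong baselines, supporting search-time collaboration as an effective direction for efficient parallel TTS.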

## Main Text

Test-Time Scaling (TTS) enhances the reasoning capabilities of large language models by allocating additional inference compute to explore the solution space. However, existing parallel TTS methods typically keep branches isolated during search: intermediate discoveries remain branch-private and cannot guide other branches in time. This information isolation causes substantial redundant exploration, as branches repeatedly rediscover information already found elsewhere and require more search steps to collect complete decision information needed to reach correct answers. To bridge this gap, we propose Collaborative Parallel Thinking (CPT), a training-free inference framework that enables search-time information sharing across parallel branches. CPT extracts compact intermediate information from ongoing branches, maintains a deduplicated query-level information pool, and broadcasts pool entries through the input context, allowing each branch in subsequent search steps to reuse discoveries made by other branches rather than rediscover the same information. Empirically, experiments on HMMT and AIME benchmarks show that CPT establishes a stronger accuracy-latency Pareto frontier than strong baselines across rollout budgets and model scales, highlighting search-time collaboration as an effective direction for efficient parallel TTS.
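The extract, pool, and broadcast loop described above can be illustrated with a small toy sketch. This is not the paper's implementation: the names `InfoPool`, `build_context`, `cpt_search`, `toy_step`, and `toy_extract` are hypothetical, deduplication is assumed to be a simple case- and whitespace-insensitive string match, and the model call and the extraction step are replaced by stub functions so the example runs without an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


def normalize(text: str) -> str:
    """Canonical form for deduplication (assumption: case/whitespace-insensitive)."""
    return " ".join(text.lower().split())


@dataclass
class InfoPool:
    """Deduplicated pool of findings for a single query, shared by all branches."""
    entries: List[str] = field(default_factory=list)
    _seen: Set[str] = field(default_factory=set)

    def add(self, finding: str) -> bool:
        """Add a finding; return False if it is empty or already in the pool."""
        key = normalize(finding)
        if not key or key in self._seen:
            return False
        self._seen.add(key)
        self.entries.append(finding.strip())
        return True


def build_context(query: str, pool: InfoPool, trace: List[str]) -> str:
    """Broadcast pool entries to a branch by placing them in its input context."""
    shared = "\n".join(f"- {e}" for e in pool.entries) or "(none yet)"
    own = "\n".join(trace) or "(start)"
    return (f"Query: {query}\nShared findings:\n{shared}\n"
            f"Your reasoning so far:\n{own}\n")


def cpt_search(
    query: str,
    n_branches: int,
    n_steps: int,
    step_fn: Callable[[int, str], str],
    extract_fn: Callable[[str], List[str]],
) -> Tuple[InfoPool, List[List[str]]]:
    """Run n_branches for n_steps, sharing extracted findings between steps."""
    pool = InfoPool()
    traces: List[List[str]] = [[] for _ in range(n_branches)]
    for _ in range(n_steps):
        # Every branch sees the same pool snapshot within a step,
        # mimicking branches that run in parallel.
        contexts = [build_context(query, pool, t) for t in traces]
        outputs = [step_fn(i, ctx) for i, ctx in enumerate(contexts)]
        for trace, out in zip(traces, outputs):
            trace.append(out)
            for finding in extract_fn(out):
                pool.add(finding)  # duplicates are silently dropped
    return pool, traces


def toy_step(branch_id: int, context: str) -> str:
    """Stub for a model call: builds on a shared finding once it is visible."""
    if "n is even" in context.lower():
        return "FINDING: n = 4"
    return "FINDING: n is even" if branch_id == 0 else "FINDING: N is even"


def toy_extract(output: str) -> List[str]:
    """Stub extractor: treats text after each 'FINDING:' marker as one finding."""
    return [part.strip() for part in output.split("FINDING:")[1:]]


if __name__ == "__main__":
    pool, traces = cpt_search("Find n.", 2, 2, toy_step, toy_extract)
    print(pool.entries)
```

In this toy run, both branches report the same fact in the first step with different capitalization, so the pool keeps a single entry; in the second step each branch sees that entry in its context and moves on instead of rediscovering it. A real system would likely need a stronger notion of equivalence than string matching and a cap on how many pool entries are placed in the context.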