# Parallel-Tempering-Based Scientific Hypothesis Search with Large Language Models

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-09 16:52
- AIHOT score: 61
- AIHOT link: https://aihot.virxact.com/items/cmq9sy2cn0ezhslldsvh2ewty
- Original paper: https://arxiv.org/abs/2606.10587

## AI Summary

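Large language models are used to generate scientific hypotheses, but common evolutionary search methods over-optimize and suffer diversity collapse. This paper frames hypothesis search as a sampling problem whose goal is to efficiently produce diverse, high-quality candidate hypotheses under a fixed validation budget. Inspired by the parallel tempering algorithm, it proposes an evolutionary framework that searches at multiple temperature levels simultaneously and exchanges information across temperatures to strengthen exploration without disrupting convergence. In three domains (molecular discovery, equation discovery, and algorithm discovery), the method improves both hypothesis quality and diversity under the same validation budget, and its candidates remain robust under more expensive downstream computational validation.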

## Full Text

Large language models (LLMs) are increasingly used to accelerate scientific discovery, most recently in advanced tasks such as generating valid scientific hypotheses. Yet in many discovery settings, the goal is not to identify a single best hypothesis: validation can be noisy and expensive, and scientists benefit from a set of high-quality alternative hypotheses that hedge against downstream uncertainty. Nevertheless, commonly used evolutionary search recipes tend to prioritize optimization over exploration in hypothesis generation, and the resulting selection pressure during search leads to diversity collapse. Motivated by these limitations, we formulate hypothesis search as a sampling problem, where the objective is to efficiently produce diverse, high-quality hypotheses under a fixed validation budget. Building on this perspective, we propose an evolutionary framework inspired by the classical parallel tempering algorithm that searches hypotheses at multiple temperature levels and enables principled information exchange across temperatures to improve exploration without disrupting convergence. Across domains including molecular discovery, equation discovery, and algorithm discovery, our approach consistently improves both hypothesis quality and diversity under the same validation budget, and produces candidates that remain robust under more expensive downstream computational validations.
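The abstract names classical parallel tempering as its inspiration but gives no implementation details. As a rough illustration of that classical algorithm only (not the paper's LLM-based framework), the sketch below runs several Metropolis chains at different temperatures and occasionally swaps adjacent chains using the standard replica-exchange acceptance rule. Everything specific here is an assumption made for illustration: the toy double-well energy, the Gaussian proposal, the temperature ladder, and the function names `swap_accept_prob`, `parallel_tempering`, `double_well`, and `gaussian_step`. In a hypothesis-search setting one could imagine the energy standing in for a negated validation score and the proposal for an LLM-driven mutation, but that mapping is a guess, not something the abstract states.

```python
import math
import random


def swap_accept_prob(energy_i, energy_j, temp_i, temp_j):
    """Standard replica-exchange rule: min(1, exp((1/T_i - 1/T_j) * (E_i - E_j)))."""
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def parallel_tempering(energy, propose, init_states, temps, n_steps, rng):
    """Run one chain per temperature; return the states visited by the coldest chain.

    `temps` is assumed to be sorted from coldest (index 0) to hottest.
    `propose` is assumed to be a symmetric proposal, as plain Metropolis requires.
    """
    states = list(init_states)
    energies = [energy(s) for s in states]
    cold_samples = []
    for _ in range(n_steps):
        # Local Metropolis move inside every chain.
        for k, temp in enumerate(temps):
            cand = propose(states[k], rng)
            e_cand = energy(cand)
            accept = e_cand <= energies[k] or (
                rng.random() < math.exp(-(e_cand - energies[k]) / temp)
            )
            if accept:
                states[k], energies[k] = cand, e_cand
        # Attempt to exchange states between one random pair of adjacent temperatures.
        k = rng.randrange(len(temps) - 1)
        p_swap = swap_accept_prob(energies[k], energies[k + 1], temps[k], temps[k + 1])
        if rng.random() < p_swap:
            states[k], states[k + 1] = states[k + 1], states[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
        cold_samples.append(states[0])
    return cold_samples


def double_well(x):
    """Toy energy with two minima at x = -1 and x = +1 (illustrative only)."""
    return 4.0 * (x * x - 1.0) ** 2


def gaussian_step(x, rng):
    """Symmetric random-walk proposal (illustrative only)."""
    return x + rng.gauss(0.0, 0.3)


rng = random.Random(0)
temps = [0.1, 0.3, 1.0, 3.0]  # hypothetical temperature ladder
samples = parallel_tempering(
    double_well, gaussian_step, [-1.0] * len(temps), temps, n_steps=500, rng=rng
)
print(len(samples), min(samples), max(samples))
```

The hot chains cross the barrier between the two wells easily, and accepted swaps pass those states down to the cold chain, which on its own would tend to stay near its starting well. That is the sense in which exchange across temperatures can aid exploration while the cold chain still concentrates on low-energy states.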
