PoLar: Letting Large Language Models Skip or Loop Layers and Learn to Generate Dynamic Execution Programs

2026-06-04 08:00 · 29 days ago

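AI Summary

The study finds that the layers of a pretrained LLM can serve as modules that are flexibly skipped or looped for each input, forming a dynamic program of layers (PoLar). For most inputs, fewer layers achieve the same or higher accuracy, and incorrect predictions of the original model can be corrected by alternative programs with fewer layers. The researchers therefore propose a lightweight PoLar prediction network that generates, for each input, an execution program that dynamically skips or repeats layers. On mathematical reasoning benchmarks, PoLar consistently outperforms standard inference and prior dynamic-depth methods, often improving accuracy while using fewer layers, and remains stable under out-of-distribution evaluation. The results suggest that fixed-depth execution captures only a small part of an LLM's latent reasoning capacity.

Original abstract · untranslated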

Large language models (LLMs) perform inference by following a fixed depth and order, non-recurrent execution of all layers. We reveal the wide existence of training-free, flexible, dynamic program-of-layers (PoLar), where pretrained layers can be packed as modules and then skipped or looped to form a customized program for each input. For most inputs, substantially shorter program executions can achieve the same or better accuracy, while incorrect predictions of the original LLM can be corrected by alternative programs with fewer layers. These observations indicate that inference admits multiple valid latent computations beyond the standard forward pass. To efficiently achieve PoLar in practice, we propose a lightweight PoLar prediction network, which learns to generate execution programs that dynamically skip or repeat pretrained layers for each input. Experiments on mathematical reasoning benchmarks demonstrate that PoLar consistently improves accuracy over standard inference and prior dynamic-depth methods, often while executing fewer layers, and that these gains persist under out-of-distribution evaluation. Our results suggest that fixed-depth execution captures only a narrow subset of an LLM's latent reasoning capacity.
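To make the idea of a "program of layers" concrete, here is a minimal sketch, not the paper's implementation. It assumes a program is simply a list of layer indices that may omit an index (skip) or repeat one (loop). The names `make_toy_layer` and `run_program` are hypothetical, the "layers" are toy arithmetic functions rather than transformer blocks, and the prediction network that would choose the program is not modeled.

```python
from typing import Callable, List, Sequence

# A "layer" here is any function mapping a hidden state to a new hidden state.
Layer = Callable[[List[float]], List[float]]


def make_toy_layer(scale: float, shift: float) -> Layer:
    """Hypothetical stand-in for a pretrained block: h -> h + scale * h + shift."""
    def layer(hidden: List[float]) -> List[float]:
        return [x + scale * x + shift for x in hidden]
    return layer


def run_program(layers: Sequence[Layer], program: Sequence[int],
                hidden: List[float]) -> List[float]:
    """Apply layers in the order given by `program`.

    An index missing from `program` means that layer is skipped;
    an index appearing more than once means that layer is looped.
    """
    for idx in program:
        if not 0 <= idx < len(layers):
            raise IndexError(f"program refers to unknown layer {idx}")
        hidden = layers[idx](hidden)
    return hidden


# Three toy layers: x + 1, 2 * x, and x - 3.
layers = [
    make_toy_layer(0.0, 1.0),
    make_toy_layer(1.0, 0.0),
    make_toy_layer(0.0, -3.0),
]

standard_program = list(range(len(layers)))  # fixed depth and order: [0, 1, 2]
custom_program = [0, 1, 1]                   # skip layer 2, loop layer 1

x = [1.0, 2.0]
standard_out = run_program(layers, standard_program, x)  # [1.0, 3.0]
custom_out = run_program(layers, custom_program, x)      # [8.0, 12.0]
```

In this toy setting both programs run three layer applications but compute different functions of the same input, which is the sense in which one set of fixed weights admits several distinct latent computations.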

HuggingFace Daily Papers (community trending papers)



Read the original · arxiv.org
