# PACI: Bubble-Free Asynchronous Pipeline-Parallel Training via Bounded Weight Inconsistency

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-05 08:00
- AIHOT score: 58
- AIHOT link: https://aihot.virxact.com/items/cmq9bsmkd0ag6slldeu3a2t12
- Original link: https://arxiv.org/abs/2606.07881

## AI Summary

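In pipeline parallelism, synchronous schedules suffer from bubbles, while asynchronous schedules introduce weight-version mismatch. To address both problems, PACI proposes a bubble-free asynchronous method that uses local gradient accumulation as a version-control mechanism to bound forward/backward version drift, without weight stashing, prediction, or global synchronization. In GPT-style language-model pretraining, PACI matches the stability and final perplexity of synchronous 1F1B-flush, keeps the same peak memory, achieves fully utilized pipeline throughput, and improves training time-to-accuracy by up to 1.69×.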

## Main Text

Pipeline parallelism is essential for training large neural networks, but existing schedules trade off throughput, memory, and optimization consistency. Synchronous pipelines preserve forward/backward weight consistency but suffer from bubbles; asynchronous pipelines remove bubbles but introduce weight-version mismatch, typically requiring weight stashing, prediction, or correction mechanisms. We introduce PACI (Pipeline Asynchronous training with Controlled Inconsistency), a bubble-free asynchronous pipeline method that bounds forward/backward version drift without weight stashing, prediction, additional parameter copies, or global synchronization. The key idea is to use local gradient accumulation as a version-control mechanism: by slowing parameter-version evolution relative to pipeline delay, PACI limits the number of optimizer updates crossed by any micro-batch while preserving steady-state utilization. In GPT-style language-model pretraining, PACI matches the stability and final perplexity of synchronous 1F1B-flush, retains the same peak memory footprint, achieves fully utilized pipeline throughput, and improves training time-to-accuracy by up to 1.69× over the fastest flush baseline. These results show that forward/backward inconsistency need not be eliminated: when explicitly bounded, it can be safely traded for substantial efficiency gains.
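
The version-control idea in the abstract can be illustrated with a toy model. The sketch below is not PACI's actual schedule or implementation; it is a minimal simulation under stated assumptions: a single pipeline stage processes one micro-batch per time slot in steady state, the backward pass of a micro-batch arrives a fixed `delay` slots after its forward pass, and the stage applies one optimizer update after every `accum_steps` slots. The function name `max_version_drift` and all of its parameters are hypothetical and chosen only for illustration.

```python
def max_version_drift(delay: int, accum_steps: int, num_microbatches: int = 1000) -> int:
    """Toy model (assumption, not the paper's algorithm).

    Returns the largest number of optimizer updates that any micro-batch
    crosses between its forward pass and its backward pass on one stage.
    """
    if delay < 0 or accum_steps < 1:
        raise ValueError("delay must be >= 0 and accum_steps must be >= 1")

    worst = 0
    for t in range(num_microbatches):
        # Weight version seen at a slot = number of optimizer updates
        # already applied, i.e. one update per `accum_steps` slots.
        forward_version = t // accum_steps
        backward_version = (t + delay) // accum_steps
        worst = max(worst, backward_version - forward_version)
    return worst


if __name__ == "__main__":
    delay = 6  # hypothetical forward-to-backward delay, in micro-batch slots
    for accum_steps in (1, 3, 6, 8):
        drift = max_version_drift(delay, accum_steps)
        print(f"accum_steps={accum_steps}: max version drift = {drift}")
```

In this toy model the worst-case drift equals `ceil(delay / accum_steps)`: with a delay of 6 slots, updating after every micro-batch lets a micro-batch cross 6 updates, while accumulating over 6 or more micro-batches limits it to 1. This mirrors the abstract's claim that slowing parameter-version evolution relative to pipeline delay bounds the number of updates crossed. The real method's per-stage delays and accumulation rules are not specified in this text.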