# MergePipe: Scalable Weight-Space Model Merging via Budgeted Expert Reads

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-28 08:00
- AIHOT score: 49
- AIHOT link: https://aihot.virxact.com/items/cmpyy97hn04h9sli3ahqe9ybp
- Original link: https://arxiv.org/abs/2605.29489

## AI Summary

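MergePipe is a budget-aware execution layer that recasts weight-space merging of large language models (LLMs) as an expert access-set problem. Working in a shared weight coordinate system, it selects which expert delta blocks to read under an explicit I/O budget, builds a deterministic access plan, and executes the merge. On Qwen and Llama merging workloads, MergePipe reduces expert-read I/O by up to an order of magnitude and achieves speedups of up to 11×; parameter deviation is on the order of \(10^{-3}\), and downstream benchmarks show no monotonic degradation.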

## Main Text

Weight-space model merging is usually formulated as an algebraic operation on checkpoints, yet at LLM scale the limiting resource is often the set of expert weights that must be read. We introduce MergePipe, a budget-aware execution layer that casts LLM merging as an expert access-set problem: given a merge operator and a checkpoint family in a shared weight coordinate system, choose which expert delta blocks to access under an explicit I/O budget. MergePipe indexes parameter blocks, builds deterministic access plans, and executes the induced budgeted merge with replayable manifests. The plan is budget-sound by construction and recovers the full-read merge at full budget; for fixed-coefficient additive operators, the omitted-update error is bounded by the norm of omitted deltas. Across Qwen and Llama merging workloads, MergePipe reduces expert-read I/O by up to an order of magnitude and achieves up to 11× speedups. Representative budget sweeps show \(O(10^{-3})\) parameter deviation from full-read merges and no monotonic degradation on downstream benchmarks.
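
To make the access-set idea concrete, here is a minimal Python sketch of a budgeted merge for a fixed-coefficient additive operator (merged = base + Σ αᵢ·Δᵢ). It is an illustration under stated assumptions, not MergePipe's actual implementation: the `DeltaBlock` type, the greedy norm-per-byte selection rule, the expert and block names, and all numbers are hypothetical. The abstract only says that plans are deterministic, budget-sound, and recover the full-read merge at full budget. The sketch also computes block norms from the deltas themselves, whereas a real block index would presumably cache them so that planning needs no expert reads.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DeltaBlock:
    """Hypothetical index entry: one expert's delta for one parameter block."""
    expert: str    # expert checkpoint name (illustrative)
    block: str     # parameter-block name (illustrative)
    delta: tuple   # expert weights minus base weights for this block
    nbytes: int    # I/O cost of reading this block


def l2(values):
    """Euclidean norm of a flat sequence of floats."""
    return math.sqrt(sum(x * x for x in values))


def plan_access(blocks, coeffs, budget_bytes):
    """Greedy plan (an assumed heuristic): largest weighted norm per byte first.

    Ties are broken by (expert, block), so the plan is deterministic.
    A block is added only if it still fits, so the plan never exceeds
    the budget.
    """
    def sort_key(b):
        score = abs(coeffs[b.expert]) * l2(b.delta) / b.nbytes
        return (-score, b.expert, b.block)

    chosen, used = [], 0
    for b in sorted(blocks, key=sort_key):
        if used + b.nbytes <= budget_bytes:
            chosen.append(b)
            used += b.nbytes
    return chosen, used


def merge(base, plan, coeffs):
    """Additive merge that applies only the delta blocks listed in the plan."""
    merged = {name: list(vals) for name, vals in base.items()}
    for b in plan:
        for i, d in enumerate(b.delta):
            merged[b.block][i] += coeffs[b.expert] * d
    return merged


def omitted_bound(blocks, plan, coeffs):
    """Triangle-inequality bound on the error from skipped blocks."""
    chosen = set(plan)
    return sum(abs(coeffs[b.expert]) * l2(b.delta)
               for b in blocks if b not in chosen)


def deviation(a, b):
    """L2 distance between two merged models over all parameters."""
    return l2([x - y for name in a for x, y in zip(a[name], b[name])])


# Toy data: two experts, two blocks, 8 bytes per block (all made up).
base = {"layer0": [0.0, 0.0], "layer1": [1.0, 1.0]}
blocks = [
    DeltaBlock("math", "layer0", (0.3, -0.4), 8),
    DeltaBlock("math", "layer1", (0.0, 0.01), 8),
    DeltaBlock("code", "layer0", (0.06, 0.08), 8),
    DeltaBlock("code", "layer1", (-0.6, 0.8), 8),
]
coeffs = {"math": 0.5, "code": 0.5}

full = merge(base, blocks, coeffs)                   # full-read reference
half_plan, half_used = plan_access(blocks, coeffs, budget_bytes=16)
half = merge(base, half_plan, coeffs)                # reads half the bytes
full_plan, full_used = plan_access(blocks, coeffs, budget_bytes=32)

print("bytes read:", half_used)
print("deviation:", deviation(full, half))
print("bound:", omitted_bound(blocks, half_plan, coeffs))
```

In this toy run the half-budget plan keeps the two largest-norm blocks, and the measured deviation from the full-read merge stays below the sum of the weighted norms of the skipped deltas. With the budget set to the total size of all blocks, the plan selects every block and the result matches the full-read merge up to floating-point summation order.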