# 去中心化指令微调：冲突感知切分与权重合并

- 来源：HuggingFace Daily Papers（社区热门论文）
- 发布时间：2026-06-01 08:00
- AIHOT 分数：69
- AIHOT 链接：https://aihot.virxact.com/items/cmpxgncg903rzslckxrhbhm92
- 原文链接：https://arxiv.org/abs/2606.01717

## AI 摘要

针对多模态大模型指令微调中的梯度干扰与高带宽同步瓶颈，MERIT提出了一种去中心化、可合并的微调流水线。该方法通过估计数据集间的梯度冲突，沿主成分分析（PCA）冲突轴进行切分，使各部分独立训练无需通信，最后通过基于token频率的加权平均进行一次权重合并。在Qwen2-VL-3B模型上使用136个Vision-FLAN任务评估，MERIT将8个基准测试的平均得分从联合训练的54.3提升至57.0。该流程同样可扩展至1.6M样本、176个来源的7B模型，以最小开销匹配或超越集中式联合训练。

## 正文

Instruction tuning aligns large language models, including multimodal ones, with diverse user intents, but scaling to heterogeneous mixtures is hindered by gradient interference and bandwidth-heavy synchronization. We ask whether these two bottlenecks can be addressed jointly by training parts of the mixture independently and reconciling them once in parameter space. We develop a local quadratic theory inside a shared flat basin that yields three results: weight merging produces a curvature-weighted variance reduction; PCA-aligned conflict splitting maximizes this gain along high-curvature directions; and merging additionally acts as spectral filtering with implicit norm regularization. These results directly motivate MERIT, a decentralized merge-ready instruction-tuning pipeline that estimates dataset-level gradient conflicts, partitions the mixture along the top PCA conflict axes, fine-tunes each partition independently with no inter-partition communication, and merges once via token-weighted averaging. On Qwen2.5-VL-3B with 136 Vision-FLAN tasks, MERIT improves the 8-benchmark average from 54.3 (joint training) to 57.0. The same recipe scales to a 7B model on a 1.6M-example, 176-source mixture -- matching or exceeding centralized joint training with minimal cost overhead -- and transfers to text-only FLAN. Our code is available at https://github.com/naver-ai/merit.
