The Hidden Power of the Scaling Factor in LoRA Optimization

2026-06-11 08:00 · 22 days ago

AI Summary

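The study shows that in LoRA the scaling factor α and the learning rate play different roles, and that α is the dominant driver of effective optimization. Combining a Signal-Drift framework with empirical evidence, it identifies three mechanisms: LoRA's spectral suppression smooths the optimization landscape, which makes standard hyperparameters overly conservative; α amplifies the task signal without increasing the drift ratio, so it accelerates convergence more effectively than the learning rate does; and the optimal α follows a sublinear, square-root relationship with the rank, which means existing rank-tied heuristics scale too little. Building on these findings, the authors propose LoRA-α, a framework that restores α to its principled regime, stays compatible with standard small learning rates, consistently improves performance, and simplifies hyperparameter search.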

Original text · not translated

In Low-Rank Adaptation (LoRA), the scaling factor α is often treated as a mere complement to the learning rate, yet its role in optimization remains poorly understood. In this paper, we reveal that the scaling factor α and the learning rate function differently, with α emerging as the dominant driver of effective optimization, delivering gains that cannot be replicated by learning rate scaling alone. Through the synergy of extensive empirical analysis and a theoretical Signal-Drift framework, we uncover three findings into LoRA's scaling mechanism: First, LoRA's spectral suppression smooths the optimization landscape, rendering standard hyperparameters overly conservative and creating an optimization gap. Second, when leveraging this smoothness to accelerate convergence, α outperforms the learning rate by amplifying the task signal without increasing the drift ratio. Third, the optimal scaling factor follows a sublinear relationship with the rank, well characterized by a square-root law with an unexpectedly large coefficient, revealing the insufficient scaling of existing rank-tied heuristics. Based on these insights, we propose LoRA-α, a minimalist framework that restores α to its principled regime, making LoRA compatible with standard small learning rates. Extensive evaluations across diverse tasks demonstrate that LoRA-α consistently improves performance while streamlining hyperparameter search, unleashing the learning potential of LoRA.
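
To make the role of the scaling factor concrete, here is a minimal pure-Python sketch of a LoRA forward pass, `W x + scale * B (A x)`. It assumes the conventional LoRA parameterization in which the low-rank update is multiplied by `α / r`; the abstract does not spell out LoRA-α's exact formulation. The helper `sqrt_law_alpha` is hypothetical: it encodes one reading of the square-root law, `α = coeff * sqrt(r)`, and `coeff` is a placeholder, since the abstract does not report the coefficient's value.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, scale):
    """Return W x + scale * B (A x).

    W: frozen weight (d_out x d_in), A: (r x d_in), B: (d_out x r).
    `scale` is whatever multiplier is applied to the low-rank update.
    """
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scale * l for b, l in zip(base, low_rank)]


def rank_tied_scale(alpha, rank):
    """Conventional LoRA multiplier: alpha divided by the rank."""
    return alpha / rank


def sqrt_law_alpha(rank, coeff):
    """Hypothetical helper: alpha = coeff * sqrt(rank).

    `coeff` is a placeholder; the abstract gives no numeric value.
    """
    return coeff * math.sqrt(rank)


# Tiny rank-1 example with a 2x2 identity as the frozen weight.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # r x d_in  = 1 x 2
B = [[1.0], [0.0]]      # d_out x r = 2 x 1
x = [1.0, 2.0]

out_small = lora_forward(x, W, A, B, scale=1.0)
out_large = lora_forward(x, W, A, B, scale=2.0)
```

In this toy setup, doubling `scale` doubles the contribution of the low-rank branch to the output while leaving the frozen `W x` term unchanged. That is only the forward-pass side of the picture; the abstract's claims about signal and drift concern optimization dynamics that this sketch does not model.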

HuggingFace Daily Papers (community trending papers)

Read the original · arxiv.org
