Balancing Fidelity and Diversity in Diffusion Models via Symmetric Attention Decomposition: A Hopfield Perspective
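Source: arxiv.org. This work characterizes the attention matrix in Transformers as an associative memory matrix that encodes associations between features. The matrix is decomposed into symmetric and skew-symmetric parts: the former is interpreted as governing the structure of the energy landscape, and the latter as driving circulation on that landscape. From the symmetric part, the authors derive Hopfield-style stability measures that quantify the stability of retrieved features. These measures are observed to correlate meaningfully with the fidelity-diversity trade-off in generation. Finally, the authors propose a controllable way to modulate this trade-off by modifying the circulation of the underlying dynamics. The code is open source.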
We characterize the pre-softmax attention matrix $QK^\top$ in transformers as an associative memory matrix encoding pairwise associations between input features. By decomposing this matrix into its symmetric and skew-symmetric parts, we interpret the symmetric component as governing the structure of the energy landscape, and the skew-symmetric component as driving circulation on that landscape. Leveraging the energy formulation induced by the symmetric component, we derive Hopfield-style stability measures that quantify the stability of retrieved features. We observe meaningful correlations between Hopfield-style stability measures and the fidelity-diversity trade-offs in generation. Finally, we propose a controllable knob to modulate this trade-off by modifying the circulation of the underlying dynamics. Code is available at our GitHub (https://github.com/hyeon-cho/Attention-Symmetric-Decomposition).
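The decomposition described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It assumes a plain quadratic energy E(x) = -1/2 x^T M x as a stand-in for the Hopfield-style energy, omits the usual 1/sqrt(d) scaling of attention scores, and uses a hypothetical `scale_circulation` helper with a hypothetical parameter `lam` to stand in for the "controllable knob". The abstract does not specify the actual stability measures or the form of the knob, so none of these names come from the paper or its repository. The sketch shows only the linear-algebra fact that the skew-symmetric part contributes nothing to a quadratic form, which is why the energy depends on the symmetric part alone.

```python
import random


def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def symmetric_decomposition(M):
    """Return (S, A) with M = S + A, S symmetric and A skew-symmetric."""
    n = len(M)
    S = [[0.5 * (M[i][j] + M[j][i]) for j in range(n)] for i in range(n)]
    A = [[0.5 * (M[i][j] - M[j][i]) for j in range(n)] for i in range(n)]
    return S, A


def quadratic_form(M, x):
    """Compute x^T M x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


def energy(M, x):
    """Assumed quadratic energy E(x) = -1/2 x^T M x (illustrative only)."""
    return -0.5 * quadratic_form(M, x)


def scale_circulation(M, lam):
    """Hypothetical knob: keep S fixed and rescale the skew part by lam."""
    S, A = symmetric_decomposition(M)
    n = len(M)
    return [[S[i][j] + lam * A[i][j] for j in range(n)] for i in range(n)]


random.seed(0)
n_tokens, d_head = 5, 3
Q = [[random.gauss(0, 1) for _ in range(d_head)] for _ in range(n_tokens)]
K = [[random.gauss(0, 1) for _ in range(d_head)] for _ in range(n_tokens)]

# Pre-softmax attention scores QK^T, an n_tokens x n_tokens matrix.
M = matmul(Q, transpose(K))
S, A = symmetric_decomposition(M)

# A random probe vector: the skew part leaves its quadratic form at zero.
x = [random.gauss(0, 1) for _ in range(n_tokens)]
skew_contribution = quadratic_form(A, x)
energy_gap = energy(M, x) - energy(S, x)
```

Under these assumptions, `skew_contribution` and `energy_gap` are both zero up to floating-point error. Setting `lam = 0` in `scale_circulation` recovers the purely symmetric matrix, while `lam = 1` recovers the original scores.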