# DLA: A Dynamic Memory Modeling Framework for Multi-State Linear Attention

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-09 08:00
- AIHOT score: 64
- AIHOT link: https://aihot.virxact.com/items/cmq7h8hbj033qsl5wisvt7p7p
- Original link: https://arxiv.org/abs/2606.10650

## AI Summary

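Long-context scaling of large language models is limited by the quadratic complexity of standard attention. Existing multi-state linear attention methods use fixed merging policies that cannot adapt to the dynamic importance of tokens, so critical tokens are lost. DLA proposes information-aware dynamic state merging, which adaptively determines state boundaries from token-level information variation, and introduces capacity-bounded memory modeling, which maintains a fixed-size cache by selectively merging adjacent low-information states. DLA is pre-trained on two linear attention models and outperforms existing state-of-the-art methods on 16 datasets.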

## Main Text

The scalability of Large Language Models (LLMs) to long contexts is fundamentally constrained by the quadratic complexity of standard attention, motivating the adoption of linear attention mechanisms with sub-quadratic cost. To improve representation capacity under long contexts, recent approaches organize memory in a multi-state manner. However, existing multi-state linear attention methods rely on fixed state merging policies that cannot adapt to dynamically varying token importance, irreversibly obscuring critical tokens and causing severe error accumulation over long sequences. To address this limitation, we propose DLA, a dynamic memory modeling framework for multi-state linear attention. DLA introduces (i) Information-Aware Dynamic State Merging, which adaptively determines state boundaries based on token-level information variation, preserving high-resolution representations around semantic transitions while aggressively summarizing stable regions, and (ii) Capacity-Bounded Memory Modeling, which maintains a fixed-size, chronologically ordered state cache by selectively merging adjacent low-information states to control memory growth with minimal information loss. We pre-train DLA on two different linear attention models and evaluate on 16 datasets across three categories. Experimental results show that DLA outperforms state-of-the-art methods.
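The two mechanisms can be illustrated with a toy sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the information score (cosine distance between a token's key and the mean key of the open state), the boundary `threshold`, the rule of merging the adjacent pair with the lowest summed score, and all names (`State`, `build_cache`, `max_states`). It is not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class State:
    """One memory state: an additive linear-attention summary of a token span."""
    kv: List[List[float]]  # sum of outer products k v^T over the span
    key_sum: List[float]   # sum of keys in the span (direction of the mean key)
    count: int             # number of tokens summarized
    info: float            # accumulated information score (hypothetical measure)


def outer(k: List[float], v: List[float]) -> List[List[float]]:
    return [[ki * vj for vj in v] for ki in k]


def cosine_distance(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return 1.0 - dot / (na * nb)


def merge(a: State, b: State) -> State:
    """Merging is plain addition, so the total summary is preserved."""
    kv = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a.kv, b.kv)]
    key_sum = [x + y for x, y in zip(a.key_sum, b.key_sum)]
    return State(kv, key_sum, a.count + b.count, a.info + b.info)


def build_cache(keys, values, threshold=0.5, max_states=4) -> List[State]:
    states: List[State] = []
    for k, v in zip(keys, values):
        token = State(outer(k, v), list(k), 1, 0.0)
        if not states:
            states.append(token)
            continue
        # Assumed information score: how far this key departs from the open state.
        token.info = cosine_distance(k, states[-1].key_sum)
        if token.info > threshold:
            states.append(token)                    # large change: open a new state
        else:
            states[-1] = merge(states[-1], token)   # stable region: summarize
        # Capacity bound: merge the adjacent pair with the lowest summed score.
        while len(states) > max_states:
            i = min(range(len(states) - 1),
                    key=lambda j: states[j].info + states[j + 1].info)
            states[i:i + 2] = [merge(states[i], states[i + 1])]
    return states


keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0], [5.0]]
cache = build_cache(keys, values, threshold=0.5, max_states=2)
print([s.count for s in cache])  # tokens summarized by each cached state
```

Because only adjacent states are merged, the cache stays in chronological order, and because merging is additive, the sum of all `kv` matrices always equals the single-state linear attention summary of the whole sequence.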