FadeMem: Distance-Aware Memory Consolidation for Autoregressive Video Generation

2026-06-09 08:00

AI Summary

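The historical KV cache of an autoregressive video generator grows with video length. FadeMem proposes a distance-aware memory consolidation mechanism that organizes historical KV blocks into a temporal hierarchy under a fixed cache budget. It exploits frequency-dependent temporal decay: fine-grained details decorrelate quickly, while coarse-grained scene structure persists longer. During generation, new history is inserted as fine-grained entries, and older adjacent entries are progressively merged under a power-law schedule, yielding a dense-near, sparse-far memory. Without architectural changes, the method preserves recent context and provides compact long-range anchors for identity and scene coherence. Experiments show that it outperforms existing bounded-cache strategies on subject consistency, background stability, and temporal coherence.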

Original abstract · untranslated

Autoregressive video generators synthesize long videos by generating successive temporal segments, but their historical KV cache grows with video length. Existing bounded-cache methods reduce this cost with local windows, sink tokens, or compressed memory states, yet they usually assign fixed roles to different parts of the history. We propose FadeMem, a distance-aware KV memory consolidation mechanism that organizes historical KV blocks into a temporal hierarchy under a fixed cache budget. This design is motivated by frequency-dependent temporal decay: fine details decorrelate quickly, while coarse scene structure and identity remain useful over longer horizons. During generation, new history is inserted as fine-grained entries, while older adjacent entries are progressively merged under a power-law temporal allocation schedule, yielding a dense-near, sparse-far memory within one cache. Without architectural changes, FadeMem preserves recent context for short-term dynamics and compact long-range anchors for identity and scene coherence. Experiments show improved subject consistency, background stability, and temporal coherence over existing bounded-cache strategies.
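The insert-then-merge loop described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's implementation: the abstract does not specify the merge operator or the exact schedule, so the code below assumes a span-weighted mean as the merge rule and lets the allowed span of an entry grow as `(1 + distance) ** alpha`. The names `Entry`, `FadingMemory`, `budget`, and `alpha` are hypothetical, and each KV block is replaced by a plain list of floats.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Entry:
    value: List[float]  # hypothetical stand-in for one KV block (a flat vector)
    span: int           # how many original blocks this entry summarizes


def merge(older: Entry, newer: Entry) -> Entry:
    """Span-weighted mean of two adjacent entries (an assumed merge rule)."""
    total = older.span + newer.span
    value = [(older.span * a + newer.span * b) / total
             for a, b in zip(older.value, newer.value)]
    return Entry(value, total)


class FadingMemory:
    """Fixed-budget memory; entries are stored oldest first."""

    def __init__(self, budget: int, alpha: float = 1.0):
        if budget < 1:
            raise ValueError("budget must be at least 1")
        self.budget = budget
        self.alpha = alpha  # assumed power-law exponent
        self.entries: List[Entry] = []

    def insert(self, value: Sequence[float]) -> None:
        # New history always enters as a fine-grained entry (span 1).
        self.entries.append(Entry(list(value), 1))
        while len(self.entries) > self.budget:
            self._merge_one_pair()

    def _merge_one_pair(self) -> None:
        spans = [e.span for e in self.entries]
        best_i, best_score = 0, float("inf")
        for i in range(len(spans) - 1):
            distance = sum(spans[i + 2:])  # original blocks newer than this pair
            # Assumed schedule: allowed span grows as (1 + distance) ** alpha,
            # so the pair that is smallest relative to its allowance is merged.
            score = (spans[i] + spans[i + 1]) / (1 + distance) ** self.alpha
            if score < best_score:  # strict '<' sends ties to the older pair
                best_i, best_score = i, score
        merged = merge(self.entries[best_i], self.entries[best_i + 1])
        self.entries[best_i:best_i + 2] = [merged]

    def spans(self) -> List[int]:
        return [e.span for e in self.entries]


mem = FadingMemory(budget=4, alpha=1.0)
for t in range(10):
    mem.insert([float(t)])
print(mem.spans())  # [4, 3, 2, 1]
```

With a budget of 4 and `alpha = 1.0`, ten inserts leave entries that cover 4, 3, 2, and 1 original blocks from oldest to newest, which is the dense-near, sparse-far shape the abstract describes. A larger `alpha` would push more of the budget toward recent history in this toy model. How the actual method merges keys and values inside attention is not stated in the abstract and is not modeled here.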

HuggingFace Daily Papers (community trending papers)

Read the original · arxiv.org
