# LiveEdit: Streaming Video Editing for Real-Time Diffusion

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-25 08:00
- AIHOT score: 51
- AIHOT link: https://aihot.virxact.com/items/cmr059nvo03zyslkic1aksa0h
- Original link: https://arxiv.org/abs/2606.26740

## AI Summary

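Streaming video editing faces two major bottlenecks: background preservation and low latency. LiveEdit proposes a causal, frame-by-frame editing framework that uses three-stage distillation to transfer the editing capability of a bidirectional foundation model to a unidirectional streaming editor, enabling stable long-horizon editing. It introduces an AR-oriented mask cache that reuses region-related computation across frames, raising inference speed to 12.66 FPS and achieving the best visual quality among streaming baselines, which makes it suitable for interactive and augmented reality scenarios.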

## Full Text

Streaming video editing has made rapid progress, yet practical deployment is still limited by two core issues: maintaining stable backgrounds and non-edited regions over time, and achieving the low latency required for real-time interactive scenarios. Meanwhile, recent streaming video generation methods are mostly developed for synthesis and cannot be directly applied to editing due to the strict preservation requirement and region-specific control. In this work, we present a novel streaming video editing framework that performs causal, frame-by-frame editing with strong content preservation and real-time responsiveness. Our key design is a three-stage distillation pipeline that progressively transfers editing capability from a powerful bidirectional foundation model to an efficient unidirectional streaming editor, enabling stable long-horizon edits without sacrificing visual fidelity. To further support real-time deployment, we introduce an AR-oriented mask cache that reuses region-related computation across frames, substantially reducing redundant processing and accelerating inference. Finally, we establish a dedicated benchmark for streaming video editing. Extensive evaluations demonstrate that our method achieves state-of-the-art visual quality among streaming baselines while drastically boosting inference speed to 12.66 FPS, making it suitable for interactive and augmented reality applications.
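
The abstract does not say how the mask cache works internally, so the following is only a toy sketch of the general idea it names: edit frames causally, touch only the masked region, and reuse mask-derived computation across frames. Every name here (`MaskCache`, `toy_edit`, `edit_stream`) is hypothetical, and `toy_edit` is a stand-in for an expensive model call, not the LiveEdit editor.

```python
import numpy as np


class MaskCache:
    """Toy cache (hypothetical): keeps mask-derived indices while the mask is unchanged."""

    def __init__(self):
        self._key = None
        self._idx = None
        self.misses = 0  # how many times the indices had to be recomputed

    def indices(self, mask):
        # Identify the mask by its shape and raw bytes; recompute only on change.
        key = (mask.shape, mask.tobytes())
        if key != self._key:
            self._key = key
            self._idx = np.nonzero(mask)
            self.misses += 1
        return self._idx


def toy_edit(pixels):
    # Placeholder for the expensive per-region edit: invert intensities in [0, 1].
    return 1.0 - pixels


def edit_stream(frames, mask, cache):
    """Causal, frame-by-frame editing: each output depends only on the current frame."""
    for frame in frames:
        rows, cols = cache.indices(mask)  # reused across frames on a cache hit
        out = frame.copy()                # pixels outside the mask are preserved exactly
        out[rows, cols] = toy_edit(frame[rows, cols])
        yield out
```

With a fixed mask over a stream of frames, the indices are computed once and every later frame is a cache hit, while pixels outside the mask pass through unchanged. For scale, the reported 12.66 FPS corresponds to roughly 79 ms per frame on average (1000 / 12.66).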
