ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis

2026-04-21 08:00

AI Summary

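The research team proposes ReImagine, which takes an image-first approach to the difficulty of jointly modeling appearance, motion, and viewpoint in human video generation. The method decouples appearance modeling from temporal consistency: a pretrained image backbone learns high-quality appearance that serves as a prior for video synthesis, and this is combined with SMPL-X motion guidance and a training-free temporal refinement stage to generate high-quality video with controllable pose and viewpoint. The team also releases a canonical human dataset and an auxiliary model for compositional human image synthesis; code and data are publicly available.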

Original abstract

Human video generation remains challenging due to the difficulty of jointly modeling human appearance, motion, and camera viewpoint under limited multi-view data. Existing methods often address these factors separately, resulting in limited controllability or reduced visual quality. We revisit this problem from an image-first perspective, where high-quality human appearance is learned via image generation and used as a prior for video synthesis, decoupling appearance modeling from temporal consistency. We propose a pose- and viewpoint-controllable pipeline that combines a pretrained image backbone with SMPL-X-based motion guidance, together with a training-free temporal refinement stage based on a pretrained video diffusion model. Our method produces high-quality, temporally consistent videos under diverse poses and viewpoints. We also release a canonical human dataset and an auxiliary model for compositional human image synthesis. Code and data are publicly available at https://github.com/Taited/ReImagine.

Source: HuggingFace Daily Papers (community trending papers)

Original paper: arxiv.org
