# ReImagine：通过图像优先合成重新思考可控高质量人体视频生成

- 来源：HuggingFace Daily Papers（社区热门论文）
- 发布时间：2026-04-21 08:00
- AIHOT 链接：https://aihot.virxact.com/items/cmob5e9hb061lsl1yxm31p4vf
- 原文链接：https://arxiv.org/abs/2604.19720

## AI 摘要

研究团队提出ReImagine方法，采用图像优先策略解决人体视频生成中外观、运动与视角联合建模的难题。该方法将外观建模与时间一致性解耦，通过预训练图像主干学习高质量外观作为视频合成先验，结合SMPL-X运动引导与免训练的时间细化阶段，实现姿态和视角可控的高质量视频生成。团队同时发布了规范人体数据集与组合式人体图像合成辅助模型，代码与数据均已开源。

## 正文

Human video generation remains challenging due to the difficulty of jointly modeling human appearance, motion, and camera viewpoint under limited multi-view data. Existing methods often address these factors separately, resulting in limited controllability or reduced visual quality. We revisit this problem from an image-first perspective, where high-quality human appearance is learned via image generation and used as a prior for video synthesis, decoupling appearance modeling from temporal consistency. We propose a pose- and viewpoint-controllable pipeline that combines a pretrained image backbone with SMPL-X-based motion guidance, together with a training-free temporal refinement stage based on a pretrained video diffusion model. Our method produces high-quality, temporally consistent videos under diverse poses and viewpoints. We also release a canonical human dataset and an auxiliary model for compositional human image synthesis. Code and data are publicly available at https://github.com/Taited/ReImagine.
