Holo-World:面向视频世界模型的统一相机、物体与天气控制
Holo-World 是一种视频世界模型,从单张图像出发,根据显式相机控制、物体控制和可选天气指令,生成保留原场景或转换到目标天气的视频。其 Unified Scene Adapter 将世界保留与天气迁移分解为独立参数子空间,利用渲染背景、几何缓冲和物体控制维持场景结构,并建模天气依赖的外观与粒子效果。Scene-Weather Decomposed CFG 分别引导场景与天气残差,增强目标天气效果而不过度放大全条件。该模型在保持精确相机与物体控制及场景结构一致性的前提下,天气状态生成优于视频到视频的天气编辑基线。
Video world models are moving toward preserving an observed world under controllable camera and object motion while allowing its environmental state to change. Yet these controls remain isolated, and weather generation typically relies on a source video or reconstructed scene that already specifies future structure. We study a first-frame-anchored source-to-state setting, where the model starts from a single image and follows explicit camera and object controls and an optional weather instruction, then generates a video that either preserves the source world or transfers it to a target weather state. To address these challenges, we first build HoloStateData, a state video dataset that turns diverse videos into unified control samples for camera, object, and weather supervision. Second, we introduce Holo-World, a unified controllable video world model that jointly controls scene from a single image. Its Unified Scene Adapter factorizes world preservation and weather transfer into distinct parameter subspaces, using rendered background, geometry buffers, and object controls to maintain controlled scene structure while modeling weather-dependent appearance and particle effects. Additionally, Scene-Weather Decomposed CFG guides scene and weather residuals separately, strengthening target weather effects without over-amplifying the full condition. Quantitative and qualitative experiments demonstrate that Holo-World maintains precise camera and object control with consistent scene structure while transferring scenes into diverse target weather state, outperforming video-to-video weather editing baselines on weather-state generation. Our project page is available at https://xiangchenyin.github.io/Holo-World/.