Repurposing 3D Generative Models for Autoregressive Layout Generation
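Source: arxiv.org
A research team has introduced LaviGen, a framework that repurposes 3D generative models for 3D layout generation. Rather than inferring layouts from text, the method works directly in native 3D space and uses an autoregressive process to explicitly model geometric relations and physical constraints among objects, producing coherent, physically plausible 3D scenes. The team also proposes an adapted 3D diffusion model that integrates scene, object, and instruction information, and uses a dual-guidance self-rollout distillation mechanism to improve efficiency and spatial accuracy. On the LayoutVLM benchmark, LaviGen achieves 19% higher physical plausibility than the previous state of the art and runs 65% faster.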
We introduce LaviGen, a framework that repurposes 3D generative models for 3D layout generation. Unlike previous methods that infer object layouts from textual descriptions, LaviGen operates directly in the native 3D space, formulating layout generation as an autoregressive process that explicitly models geometric relations and physical constraints among objects, producing coherent and physically plausible 3D scenes. To further enhance this process, we propose an adapted 3D diffusion model that integrates scene, object, and instruction information and employs a dual-guidance self-rollout distillation mechanism to improve efficiency and spatial accuracy. Extensive experiments on the LayoutVLM benchmark show LaviGen achieves superior 3D layout generation performance, with 19% higher physical plausibility than the state of the art and 65% faster computation. Our code is publicly available at https://github.com/fenghora/LaviGen.
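To make the idea of autoregressive layout generation concrete, here is a minimal toy sketch, not LaviGen's actual method. It places objects one at a time, with each placement conditioned on the objects already in the scene and rejected if it violates a simple physical constraint (no overlapping footprints, no leaving the room). The uniform random proposal is a stand-in for the learned 3D diffusion model, and every name here (`Box`, `overlaps`, `inside_room`, `generate_layout`) is hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned floor footprint: center (x, y), width w, depth d."""
    name: str
    x: float
    y: float
    w: float
    d: float


def overlaps(a: Box, b: Box) -> bool:
    """Return True if the footprints of a and b intersect."""
    return abs(a.x - b.x) < (a.w + b.w) / 2 and abs(a.y - b.y) < (a.d + b.d) / 2


def inside_room(b: Box, room_w: float, room_d: float) -> bool:
    """Return True if the footprint lies entirely within the room."""
    return (b.w / 2 <= b.x <= room_w - b.w / 2
            and b.d / 2 <= b.y <= room_d - b.d / 2)


def generate_layout(objects, room_w, room_d, seed=0, max_tries=1000):
    """Place objects sequentially; each step sees all earlier placements.

    objects is a list of (name, width, depth) tuples. The random proposal
    below is an assumption standing in for a learned generative model.
    """
    rng = random.Random(seed)
    placed = []
    for name, w, d in objects:
        for _ in range(max_tries):
            cand = Box(name,
                       rng.uniform(w / 2, room_w - w / 2),
                       rng.uniform(d / 2, room_d - d / 2),
                       w, d)
            # Accept only candidates that satisfy the constraints
            # with respect to the scene built so far.
            if inside_room(cand, room_w, room_d) and not any(
                    overlaps(cand, p) for p in placed):
                placed.append(cand)
                break
        else:
            raise RuntimeError(f"could not place {name}")
    return placed


layout = generate_layout(
    [("bed", 2.0, 1.5), ("desk", 1.2, 0.6), ("chair", 0.5, 0.5)],
    room_w=5.0, room_d=4.0)
for box in layout:
    print(box)
```

The sequential structure is what this sketch shares with the approach described above: each object is generated given the current scene state rather than all positions being predicted at once. The rejection test is a deliberately crude substitute for the geometric and physical modeling that the abstract attributes to the adapted diffusion model.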