UniMesh: Unifying 3D Mesh Understanding and Generation
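UniMesh is a new framework that unifies 3D mesh understanding and generation, learning both tasks jointly within a single architecture. It introduces a Mesh Head that connects diffusion-based image generation with implicit shape decoders; proposes Chain of Mesh (CoM), a geometric iterative-reasoning mechanism that closes the loop for user-driven semantic mesh editing; and builds an Actor-Evaluator-Self-reflection mechanism that can diagnose and correct errors in high-level tasks such as 3D captioning. Experiments show that UniMesh not only performs well but also achieves mutual enhancement between generation and understanding, along with iterative editing capability.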
Recent advances in 3D vision have led to specialized models for either 3D understanding (e.g., shape classification, segmentation, reconstruction) or 3D generation (e.g., synthesis, completion, and editing). However, these tasks are often tackled in isolation, resulting in fragmented architectures and representations that hinder knowledge transfer and holistic scene modeling. To address these challenges, we propose UniMesh, a unified framework that jointly learns 3D generation and understanding within a single architecture. First, we introduce a novel Mesh Head that acts as a cross-model interface, bridging diffusion-based image generation with implicit shape decoders. Second, we develop Chain of Mesh (CoM), a geometric instantiation of iterative reasoning that enables user-driven semantic mesh editing through a closed-loop latent, prompting, and re-generation cycle. Third, we incorporate a self-reflection mechanism based on an Actor-Evaluator-Self-reflection triad to diagnose and correct failures in high-level tasks like 3D captioning. Experimental results demonstrate that UniMesh not only achieves competitive performance on standard benchmarks but also unlocks novel capabilities in iterative editing and mutual enhancement between generation and understanding. Code: https://github.com/AIGeeksGroup/UniMesh. Website: https://aigeeksgroup.github.io/UniMesh.
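The abstract names the Actor-Evaluator-Self-reflection triad but does not spell out its control flow. The following is a minimal sketch of one plausible reading, not the authors' implementation: an actor proposes an output, an evaluator judges it, and a reflector turns the failure into a note that conditions the next attempt. The function `reflect_loop`, its parameters, and the toy stand-ins are all hypothetical names introduced for illustration; in UniMesh the three roles would be played by learned models rather than hand-written rules.

```python
from typing import Callable, List, Tuple

# Hypothetical role signatures (assumptions, not UniMesh's API).
Actor = Callable[[str, List[str]], str]           # (task, notes) -> output
Evaluator = Callable[[str, str], Tuple[bool, str]]  # (task, output) -> (ok, feedback)
Reflector = Callable[[str, str, str], str]        # (task, output, feedback) -> note


def reflect_loop(actor: Actor, evaluator: Evaluator, reflector: Reflector,
                 task: str, max_rounds: int = 3) -> Tuple[str, List[str]]:
    """Retry the actor until the evaluator accepts or the round budget runs out."""
    notes: List[str] = []
    output = actor(task, notes)
    for _ in range(max_rounds):
        ok, feedback = evaluator(task, output)
        if ok:
            break
        # Turn the evaluator's feedback into a note for the next attempt.
        notes.append(reflector(task, output, feedback))
        output = actor(task, notes)
    return output, notes


# Toy stand-ins for a 3D captioning task (purely illustrative).
def toy_actor(task: str, notes: List[str]) -> str:
    return "a chair with four legs" if notes else "a chair"


def toy_evaluator(task: str, output: str) -> Tuple[bool, str]:
    ok = "legs" in output
    return ok, "" if ok else "caption omits the legs"


def toy_reflector(task: str, output: str, feedback: str) -> str:
    return "mention the legs"


caption, notes = reflect_loop(toy_actor, toy_evaluator, toy_reflector,
                              "caption this mesh")
```

In this toy run the first caption is rejected, one reflection note is recorded, and the second caption is accepted. Bounding the loop with `max_rounds` keeps the cost predictable when the evaluator never accepts.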