# P3D-Bench: A Benchmark for Multimodal Large Language Models on Parametric 3D Generation and Structural Reasoning

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-09 08:00
- AIHOT score: 52
- AIHOT link: https://aihot.virxact.com/items/cmqexmgef00ypslwaxpwa0ooc
- Original link: https://arxiv.org/abs/2606.11152

## AI Summary

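P3D-Bench is a benchmark for evaluating multimodal large language models (MLLMs) on parametric 3D generation and structural reasoning. It covers three task families (Text-to-3D, Image-to-3D and Assembly-3D) and scores outputs along six dimensions: executability, geometric fidelity, topology, text-grounded constraints, multiview semantic alignment and part-level structure. An evaluation of frontier MLLMs and text-only LLMs on 400 text cases, 400 image cases and 203 annotated assemblies yields three findings. Assembly tasks are the hardest, with models failing to compose multiple parts into a coherent structure. Models can recover the global shape and semantic identity of the target object, but cannot precisely reproduce the parametric geometry specified by the input. Part-level modeling is generally weak: models recover neither the geometry of each part nor the correct number of parts.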

## Main Text

Multimodal large language models (MLLMs) can write complex programs, and programs can drive 3D modeling, which opens a new avenue for 3D generation powered by the models' priors, world knowledge and reasoning. Yet existing benchmarks rarely evaluate 3D modeling through code. Such modeling demands more than runnable code: from a text or visual specification, a model must generate a parametric 3D program that is geometrically precise, semantically aligned and assembly-consistent. We introduce P3D-Bench, a benchmark for parametric 3D generation. Unlike a 3D mesh, a parametric 3D program exposes explicit dimensions, construction operations and part relations, revealing whether a model recovers a design's structure, not just its appearance. Under a unified protocol, P3D-Bench covers three task families (Text-to-3D, Image-to-3D and Assembly-3D) and scores each output for executability, geometric fidelity, topology, text-grounded constraints, multiview semantic alignment and part-level structure. We evaluate frontier MLLMs and text-only LLMs on 400 text cases, 400 image cases and 203 annotated assemblies, with domain-specific models as reference points. Our evaluation yields three findings. First, assemblies are the hardest setting: models still fail to compose multiple parts into a coherent structure. Second, models can often recover the global shape and semantic identity of the target object, yet fail to reproduce the precise parametric geometry specified by the input. Third, part-level modeling remains weak on assemblies, where models recover neither the geometry of each part nor the right number of parts. These results position P3D-Bench as a benchmark for evaluating precise parametric geometry and part-level structure in parametric 3D generation.
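
To make the distinction between appearance and structure concrete, the following is a minimal Python sketch of how a parametric description exposes dimensions and part counts that can be compared directly. It is an illustration only and not P3D-Bench's scoring code: the `Box` type, the `part_count_error` and `dimension_error` helpers, and the tabletop example are all hypothetical, and the paper's actual metrics may be defined quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical parametric part: an axis-aligned box with explicit dimensions."""
    name: str
    size: tuple[float, float, float]                    # width, depth, height
    origin: tuple[float, float, float] = (0.0, 0.0, 0.0)  # placement in the assembly


def part_count_error(predicted: list[Box], reference: list[Box]) -> int:
    """Absolute difference in the number of parts (illustrative metric)."""
    return abs(len(predicted) - len(reference))


def dimension_error(predicted: Box, reference: Box) -> float:
    """Largest relative deviation across the three dimensions (illustrative metric)."""
    return max(abs(p - r) / r for p, r in zip(predicted.size, reference.size))


# Reference assembly: a tabletop plus one leg (made-up numbers).
reference = [
    Box("top", (100.0, 60.0, 4.0), origin=(0.0, 0.0, 40.0)),
    Box("leg", (4.0, 4.0, 40.0)),
]

# Hypothetical model output: roughly the right tabletop, but the thickness is
# off and the leg is missing.
predicted = [
    Box("top", (100.0, 60.0, 5.0), origin=(0.0, 0.0, 40.0)),
]

count_err = part_count_error(predicted, reference)
dim_err = dimension_error(predicted[0], reference[0])
print(f"part count error: {count_err}, worst dimension error: {dim_err:.2f}")
```

In this toy case the predicted object would still look like a tabletop from most viewpoints, yet the explicit parameters show a wrong thickness and a missing part, which is the kind of gap that a mesh-only comparison can hide.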
