Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective

2026-04-15 08:00 · 79 days ago

AI Summary

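For feed-forward 3D reconstruction, this survey proposes a taxonomy of model design that is independent of the output representation. Setting aside the differences between geometric representations such as implicit fields and explicit primitives, it reorganizes existing methods around five core problems: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling. The survey also reviews the field's benchmark datasets and evaluation standards, categorizes real-world applications, and identifies future challenges including scalability, unified evaluation standards, and world modeling.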

Original text · untranslated

Reconstructing 3D representations from 2D inputs is a fundamental task in computer vision and graphics, serving as a cornerstone for understanding and interacting with the physical world. While traditional methods achieve high fidelity, they are limited by slow per-scene optimization or category-specific training, which hinders their practical deployment and scalability. Hence, generalizable feed-forward 3D reconstruction has witnessed rapid development in recent years. By learning a model that maps images directly to 3D representations in a single forward pass, these methods enable efficient reconstruction and robust cross-scene generalization. Our survey is motivated by a critical observation: despite the diverse geometric output representations, ranging from implicit fields to explicit primitives, existing feed-forward approaches share similar high-level architectural patterns, such as image feature extraction backbones, multi-view information fusion mechanisms, and geometry-aware design principles. Consequently, we abstract away from these representation differences and instead focus on model design, proposing a novel taxonomy centered on model design strategies that are agnostic to the output format. Our proposed taxonomy organizes the research directions into five key problems that drive recent research development: feature enhancement, geometry awareness, model efficiency, augmentation strategies and temporal-aware models. To support this taxonomy with empirical grounding and standardized evaluation, we further comprehensively review related benchmarks and datasets, and extensively discuss and categorize real-world applications based on feed-forward 3D models. Finally, we outline future directions to address open challenges such as scalability, evaluation standards, and world modeling.

HuggingFace Daily Papers (community trending papers)

Read the original · arxiv.org
