# SpaceDG: A Spatial Intelligence Benchmark under Visual Degradation

- Source: HuggingFace Daily Papers (trending community papers)
- Published: 2026-05-21 08:00
- AIHOT score: 69
- AIHOT link: https://aihot.virxact.com/items/cmpggte670fbdsljwgfkad0lj
- Original link: https://arxiv.org/abs/2605.22536

## AI Summary

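SpaceDG is the first large-scale dataset for degradation-aware spatial understanding, containing approximately 1M QA pairs drawn from nearly 1,000 indoor scenes. At its core is a physically grounded degradation synthesis engine that embeds the degradation process into 3D Gaussian Splatting rendering, realistically simulating nine degradation types such as motion blur and low light. The accompanying SpaceDG-Bench benchmark contains 1,102 human-verified questions covering 11 categories of reasoning tasks. An evaluation of 25 models reveals that visual degradation severely impairs spatial reasoning. The study shows that finetuning on SpaceDG markedly improves model robustness in degraded scenes, to the point where it can even surpass human performance, without hurting results on clean images.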

## Main Text

Multimodal Large Language Models (MLLMs) have made rapid progress in spatial intelligence, yet existing spatial reasoning benchmarks largely assume pristine visual inputs and overlook the degradations that commonly occur in real-world deployment, such as motion blur, low light, adverse weather, lens distortion, and compression artifacts. This raises a fundamental question: how robust is the spatial intelligence of current MLLMs when visual observations are imperfect? To answer this question, we introduce SpaceDG, the first large-scale dataset for degradation-aware spatial understanding. It is constructed with a physically grounded degradation synthesis engine that embeds the degradation formation process into 3D Gaussian Splatting (3DGS) rendering, enabling realistic simulation of nine degradation types. The resulting dataset contains approximately 1M QA pairs from nearly 1,000 indoor scenes. We further introduce SpaceDG-Bench, a human-verified benchmark with 1,102 questions spanning 11 reasoning categories and 9 visual degradation types, yielding over 10K VQA instances. Evaluating 25 open- and closed-source MLLMs reveals that visual degradations consistently and substantially impair spatial reasoning, exposing a critical robustness gap. Finally, we show that finetuning on SpaceDG markedly improves degradation robustness and can even surpass human performance under degraded conditions without any performance drop on clean images, highlighting the promise of degradation-aware training for robust spatial intelligence.
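
As an illustration of what a "robustness gap" could mean in practice, the following minimal sketch computes per-condition accuracy and the drop relative to clean images. It is not the paper's evaluation code: the function names, the record format, the `clean` condition label, and the toy records are all assumptions made for this example, and the abstract does not specify the metric actually used.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Return accuracy per visual condition.

    `records` is an iterable of (condition, is_correct) pairs.
    This record format is hypothetical, chosen only for the example.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for condition, is_correct in records:
        totals[condition] += 1
        correct[condition] += int(is_correct)
    return {c: correct[c] / totals[c] for c in totals}


def robustness_gap(acc, clean_key="clean"):
    """Return clean accuracy minus degraded accuracy for each degraded condition."""
    clean = acc[clean_key]
    return {c: clean - a for c, a in acc.items() if c != clean_key}


# Made-up toy records, not results from the paper.
records = [
    ("clean", True), ("clean", True), ("clean", True), ("clean", False),
    ("motion_blur", True), ("motion_blur", False),
    ("motion_blur", False), ("motion_blur", True),
    ("low_light", True), ("low_light", False),
    ("low_light", False), ("low_light", False),
]

acc = accuracy_by_condition(records)
gaps = robustness_gap(acc)
print(acc)
print(gaps)
```

A positive gap means the model answers fewer questions correctly under that degradation than on clean inputs; degradation-aware finetuning, as described above, would aim to shrink these gaps without lowering the clean accuracy.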
