# GGT-100K: Generative Ground Truth for General Real-World Image Restoration

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-29 08:00
- AIHOT score: 62
- AIHOT link: https://aihot.virxact.com/items/cmpulkg6204v7slagegc25c76
- Original link: https://arxiv.org/abs/2605.31039

## AI Summary

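To address the shortage of high-quality paired data that bottlenecks real-world image restoration, this work proposes "Generative Ground Truth", which uses generative multimodal foundation models to synthesize high-quality targets from real low-quality images. A systematic evaluation of nine state-of-the-art models finds that Nano-Banana-2 combined with VLM-based adaptive prompting is the most capable at synthesizing perceptually realistic, content-faithful targets. Building on this, the authors construct the GGT-100K dataset, with 103,707 training pairs and 500 test pairs covering diverse scenes and complex degradations. Experiments show that the dataset consistently improves the real-world generalization of a range of image restoration models, with especially strong gains when fine-tuning generative restoration models.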

## Full Text

Real-world image restoration (IR) is bottlenecked by the scarcity of high-quality paired training data. Synthetic datasets are abundant but often fail to model real-world degradations, while real-world paired datasets are expensive and difficult to capture. As a result, IR models trained on these datasets show limited generalization in real-world scenarios. In this work, we propose Generative Ground Truth (GGT) by using generative multimodal foundation models (MFMs) to produce high-quality (HQ) targets from real-world low-quality (LQ) images. We first conduct a systematic evaluation of nine state-of-the-art MFMs, including Nano-Banana-2 and GPT-Image-2, on images of various scenes and degradation types. The results demonstrate that Nano-Banana-2 with VLM-based adaptive prompting shows the highest capability to synthesize perceptually realistic and content-faithful HQ targets, which can serve as the GGT for the LQ input. We then employ Nano-Banana-2 to build a GGT synthesis pipeline, which involves multi-stage quality control to ensure data reliability, and construct GGT-100K, an LQ-HQ paired dataset comprising 103,707 training pairs and covering diverse scenes and complex real-world degradations. A test set of 500 image pairs is also established. Extensive experiments show that GGT-100K consistently improves the real-world generalization of a wide range of IR models, with particularly strong benefits for finetuning generative models for IR tasks. Our results suggest that MFMs can serve as practical tools for restoration-oriented data generation, and GGT-100K is a useful resource to expand the generalization boundaries of real-world IR models.
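
The abstract describes the GGT pipeline only at a high level: VLM-based adaptive prompting, HQ synthesis with an MFM, and multi-stage quality control. As a rough illustration of that control flow, and not the authors' implementation, the Python sketch below wires the three stages together. Every name in it (`Pair`, `build_ggt_pairs`, `describe`, `synthesize`, `checks`) is hypothetical, and the stages are passed in as plain callables because the abstract does not specify any model API, prompt format, or quality criteria. Images are represented as string identifiers purely for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass
class Pair:
    """One LQ-HQ training pair (hypothetical container, not from the paper)."""
    lq: str  # identifier or path of the real-world low-quality image
    hq: str  # identifier or path of the synthesized high-quality target


def build_ggt_pairs(
    lq_images: Iterable[str],
    describe: Callable[[str], str],
    synthesize: Callable[[str, str], str],
    checks: Sequence[Callable[[str, str], bool]],
) -> List[Pair]:
    """Assemble LQ-HQ pairs from real LQ images.

    describe   -- stand-in for VLM-based adaptive prompting: LQ image -> prompt
    synthesize -- stand-in for the MFM: (LQ image, prompt) -> HQ target
    checks     -- stand-ins for quality-control stages; a pair is kept only
                  if every check accepts it (the actual criteria are not
                  given in the abstract)
    """
    pairs: List[Pair] = []
    for lq in lq_images:
        prompt = describe(lq)        # per-image, content-aware prompt
        hq = synthesize(lq, prompt)  # candidate generative ground truth
        if all(check(lq, hq) for check in checks):
            pairs.append(Pair(lq=lq, hq=hq))
    return pairs
```

Passing the stages as callables keeps the sketch independent of any particular model or service; in practice each stand-in would wrap a real model call, and rejected pairs might be logged or regenerated rather than silently dropped.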
