RefGC-SR^2: Reference-Guided Generated Content Super-Resolution and Refinement
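Source: arxiv.org
Current reference-guided generation pipelines downsample the high-resolution reference image (HRRI) to a fixed low resolution, losing fine-grained detail, and the generation step introduces artifacts such as identity distortion. Existing refinement methods still operate in the low-resolution domain, while super-resolution methods ignore the artifact distribution of generative pipelines. The paper proposes the RefGC-SR^2 task, which reuses the original HRRI at the post-processing stage to recover lost details, refine artifacts, and increase resolution at the same time. It builds the first real-world triplet data generation pipeline, training a diptych-conditioned generator to synthesize paired low-quality anchors, and proposes a frequency-aware diffusion transformer that selectively injects fine details from the reference image while removing artifacts. Experiments show that it outperforms RefGCR and RefSR baselines.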
Reference-guided generation (e.g., object compositing, customization) has progressed rapidly, yet current pipelines share a fundamental limitation: the object-centric high-resolution reference image (HRRI) provided by the user is downsampled to a fixed low resolution (LR) before being fed into the model, so its fine-grained details are discarded before the output is even produced. The generation step then introduces its own artifacts (e.g., identity distortion) on top of this loss. Existing reference-guided generated content refinement (RefGCR) methods can correct some of these artifacts but still operate in the LR domain; reference-guided super-resolution (RefSR) methods recover resolution but assume natural-image degradations and ignore the artifact distribution of generative pipelines. To address both gaps in a single formulation, we introduce a new task: reference-guided generated content super-resolution-refinement (RefGC-SR^2), in which the original HRRI is reused at the post-processing stage to recover lost details, refine generative artifacts, and upscale the output simultaneously. We construct the first real-world triplet data generation pipeline for the RefGC-SR^2 task, training a diptych-conditioned generator to synthesize the paired low-quality anchors that public pretrained models cannot provide. We further present a frequency-aware diffusion transformer for RefGC-SR^2 that selectively injects fine details from the HRRI while removing generative artifacts. Extensive experiments demonstrate that our RefGC-SR^2 model (i) refines the object identity faithfully with respect to the reference and (ii) recovers high-resolution details, so that the final result is of significantly higher quality and more usable in practice than the outputs of existing RefGCR and RefSR baselines.
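The first limitation named in the abstract, that downsampling the HRRI discards fine-grained detail before generation starts, can be illustrated with a toy sketch. This is not the paper's method or data pipeline: the box-filter downsampling, the nearest-neighbor upsampling, and every function name below are assumptions chosen only to keep the example small and self-contained. It measures the energy of the residual between an image and its downsampled-then-upsampled copy, which is the information a post-processing stage could only regain by going back to the original reference.

```python
def box_downsample(img, factor):
    """Average non-overlapping factor x factor blocks (assumed stand-in for a resize)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [img[y + dy][x + dx] for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def nearest_upsample(img, factor):
    """Repeat every pixel factor times along both axes."""
    return [[v for v in row for _ in range(factor)] for row in img for _ in range(factor)]


def detail_residual(img, factor):
    """Original minus its low-resolution round trip: the detail lost by downsampling."""
    low = nearest_upsample(box_downsample(img, factor), factor)
    return [[a - b for a, b in zip(r0, r1)] for r0, r1 in zip(img, low)]


def energy(img):
    """Sum of squared pixel values."""
    return sum(v * v for row in img for v in row)


# Hypothetical 8x8 grayscale "references": a one-pixel checkerboard (fine texture)
# and a pattern that is constant over 4x4 blocks (coarse structure only).
fine = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
coarse = [[float((x // 4 + y // 4) % 2) for x in range(8)] for y in range(8)]

lost_fine = energy(detail_residual(fine, 2))
lost_coarse = energy(detail_residual(coarse, 2))
print(lost_fine, lost_coarse)
```

In this toy setting the fine texture is flattened to a uniform gray by a 2x reduction, while the coarse pattern survives unchanged, which mirrors why only the high-frequency content of the reference needs to be reinjected.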