# PhotoQuilt: Training-Free Arbitrary-Resolution Photomosaic Generation via Bootstrapped Tiled Denoising

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-29 08:00
- AIHOT score: 51
- AIHOT link: https://aihot.virxact.com/items/cmr1krpxv03rjslnlnamjti7i
- Original link: https://arxiv.org/abs/2606.30968

## AI Summary

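PhotoQuilt proposes a training-free framework for generating photomosaics at arbitrary resolution. It uses bootstrapped tiled denoising to address the difficulty of reconciling local detail with global structure in high-resolution generation. The method first generates the global composition at low resolution, then upscales it and re-injects noise to restore generative capacity, and finally denoises independently within fixed tiles, so that each tile forms a standalone image while the overall layout stays consistent. This avoids quadratic attention overhead and scales to large canvases. Experiments show that PhotoQuilt outperforms existing baselines on both global structure and local realism.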

## Main Text

Photomosaics are large images whose local regions are seen as independent tiles while their overall arrangement forms a coherent scene. Generating them at high resolution, with every tile convincing in its own right, is computationally expensive, since the canvas must hold many detailed tiles at once. We present PhotoQuilt, a training-free framework that generates photomosaics at arbitrary resolution. Diffusion models struggle to satisfy both scales at once, as direct high-resolution generation is costly and tends toward one smooth image rather than a mosaic, while patch-based tiling keeps local detail but loses global structure. PhotoQuilt resolves this with a bootstrapped tiled denoising procedure. We first produce a global composition at low resolution to fix the layout, then upscale it in latent space and re-inject noise to restore generative capacity. Denoising proceeds within fixed tiles, so each forms its own image while the shared global structure holds them in one layout. Because tile generation is handled separately, PhotoQuilt scales to large canvases without quadratic attention cost. Experiments show that PhotoQuilt outperforms current baselines on both global structure and local realism.
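
The three stages described in the abstract (low-resolution layout, latent upscaling with re-injected noise, per-tile denoising) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the nearest-neighbour upscaling, the noise-blending formula, and the toy denoiser are all hypothetical stand-ins, and a real system would run a pretrained diffusion model on each tile instead.

```python
import numpy as np

def upscale_latent(latent, factor):
    """Nearest-neighbour upscaling of a (C, H, W) latent (assumed interpolation)."""
    return latent.repeat(factor, axis=1).repeat(factor, axis=2)

def renoise(latent, noise_strength, rng):
    """Blend Gaussian noise back in; noise_strength in [0, 1] (assumed formula)."""
    noise = rng.standard_normal(latent.shape)
    return np.sqrt(1.0 - noise_strength**2) * latent + noise_strength * noise

def toy_tile_denoiser(tile):
    """Hypothetical stand-in for a diffusion denoiser: pull values toward the tile mean."""
    return 0.5 * tile + 0.5 * tile.mean(axis=(1, 2), keepdims=True)

def bootstrapped_tiled_denoise(low_res, factor, tile, noise_strength, denoise_tile, rng):
    """Upscale a low-res layout, re-inject noise, then denoise each fixed tile on its own."""
    canvas = renoise(upscale_latent(low_res, factor), noise_strength, rng)
    _, height, width = canvas.shape
    if height % tile or width % tile:
        raise ValueError("canvas size must be a multiple of the tile size")
    out = np.empty_like(canvas)
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Each tile only ever sees its own pixels, so cost grows linearly in tile count.
            out[:, y:y + tile, x:x + tile] = denoise_tile(canvas[:, y:y + tile, x:x + tile])
    return out

def attention_pairs(num_tokens, num_tiles=1):
    """Token pairs scored by self-attention when tokens are split into equal tiles."""
    per_tile = num_tokens // num_tiles
    return num_tiles * per_tile**2

rng = np.random.default_rng(0)
layout = rng.standard_normal((4, 8, 8))   # stands in for the low-resolution global composition
mosaic = bootstrapped_tiled_denoise(layout, factor=4, tile=8, noise_strength=0.6,
                                    denoise_tile=toy_tile_denoiser, rng=rng)
print(mosaic.shape)
```

The `attention_pairs` helper illustrates the scaling claim under the assumption that attention is confined to each tile: splitting N tokens into T equal tiles scores T·(N/T)² = N²/T pairs instead of N², so for a fixed tile size the total grows linearly with canvas area.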