# GenEvolve: A Self-Evolving Image Generation Agent Based on Tool-Orchestrated Visual Experience Distillation

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-20 08:00
- AIHOT score: 67
- AIHOT link: https://aihot.virxact.com/items/cmpgl46l60ge6sljw0udggsrw
- Original link: https://arxiv.org/abs/2605.21605

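## AI Summary

GenEvolve is a framework designed to let image generation agents self-evolve. It models each generation attempt as a tool-orchestrated trajectory in which the agent gathers evidence, selects resources, and composes generation skills to complete the task. Unlike methods that rely mainly on image-level rewards, GenEvolve compares multiple trajectories for the same request, distills the differences between the better and worse ones into structured visual experience, and provides that experience only to a privileged teacher branch. Drawing on on-policy self-distillation, this experience gives the student agent dense token-level supervision, helping it internalize better search and construction abilities. The authors also built an accompanying dataset and evaluation benchmark, and experiments show that the method achieves state-of-the-art performance.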

## Main Text

Open-ended image generation is no longer a simple prompt-to-image problem. High-quality generation often requires an agent to combine a model's internal generative ability with external resources. As requests become more diverse and demanding, we aim to develop a general image-generation agent that can self-evolve through trajectories and use tools more effectively across varied generation challenges. To this end, we propose GenEvolve, a self-evolving framework based on Tool-Orchestrated Visual Experience Distillation. In GenEvolve, each generation attempt is modeled as a tool-orchestrated trajectory, where the agent gathers evidence, selects references, invokes generation skills, and composes them into a prompt-reference program. Unlike existing agentic generation methods that mainly rely on image-level scalar rewards, GenEvolve compares multiple trajectories for the same request and abstracts best-worst differences into structured visual experience, provided only to a privileged teacher branch. Inspired by on-policy self-distillation, Visual Experience Distillation provides dense token-level supervision, helping the student internalize better search, knowledge activation, reference selection, and prompt construction. We further construct GenEvolve-Data and GenEvolve-Bench. Experiments on public benchmarks and GenEvolve-Bench show substantial gains over strong baselines, achieving state-of-the-art performance among current image-generation frameworks. Project page: https://ephemeral182.github.io/GenEvolve/
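
The abstract does not spell out the training objective, so the following is only a minimal sketch of the general idea it describes: rank several trajectories for one request, turn the best-worst difference into an "experience" record, and score the student against a teacher that sees that record, one token at a time. Everything here is an assumption made for illustration: the function names (`pick_best_and_worst`, `summarize_experience`, `token_kl`, `distillation_loss`), the trajectory dictionary layout, the set-difference stand-in for experience abstraction, and the choice of a per-token KL divergence. None of it is taken from the GenEvolve code.

```python
import math


def pick_best_and_worst(trajectories):
    """Return the highest- and lowest-scored trajectory for one request.

    Each trajectory is a hypothetical dict: {"steps": [...], "score": float}.
    """
    best = max(trajectories, key=lambda t: t["score"])
    worst = min(trajectories, key=lambda t: t["score"])
    return best, worst


def summarize_experience(best, worst):
    """Toy stand-in for experience abstraction: a plain step difference.

    A real system would likely produce a richer, structured description.
    """
    keep = [s for s in best["steps"] if s not in worst["steps"]]
    avoid = [s for s in worst["steps"] if s not in best["steps"]]
    return {"keep": keep, "avoid": avoid}


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_kl(teacher_logits, student_logits):
    """KL(teacher || student) at a single token position."""
    log_p = log_softmax(teacher_logits)
    log_q = log_softmax(student_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def distillation_loss(teacher_seq, student_seq):
    """Mean per-token KL over one trajectory, plus the per-token values.

    teacher_seq: logits from the branch that was given the experience.
    student_seq: logits from the branch that was not, at the same positions.
    """
    if len(teacher_seq) != len(student_seq):
        raise ValueError("teacher and student must cover the same tokens")
    per_token = [token_kl(t, s) for t, s in zip(teacher_seq, student_seq)]
    return sum(per_token) / len(per_token), per_token


# Hypothetical usage with made-up trajectories and logits.
trajectories = [
    {"steps": ["search", "pick_reference", "compose_prompt"], "score": 0.9},
    {"steps": ["compose_prompt"], "score": 0.3},
]
best, worst = pick_best_and_worst(trajectories)
experience = summarize_experience(best, worst)
loss, per_token = distillation_loss(
    teacher_seq=[[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    student_seq=[[0.5, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
print(experience, loss, per_token)
```

The point of the per-token list is the contrast the abstract draws with image-level scalar rewards: a single score says only that a trajectory was good or bad, while a per-token signal indicates which positions in the trajectory the student should change.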
