Saining Xie (@sainingxie)

2026-05-22 06:52 · 42 days ago

AI summary

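RAEv2 greatly simplifies the architecture and improves generality, achieving more than 10x faster convergence on tasks such as text-to-image (T2I) and world models while also improving reconstruction and generation quality. Through extensive experiments, the team found that strong representation encoders are crucial for pixel decoders. Traditional evaluation metrics such as FID are no longer sufficient to fully measure model performance; new metrics such as ep@fid-k/fdr^k reveal that there is still a lot of room for research in generative models.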

check out RAEv2 led by Jas. through extensive exps, we found some really intriguing behaviors showing why strong representation encoders are key for pixel decoders. spoiler: it's not about hillclimbing fid; new metrics like ep@fid-k/fdr^k show there's a lot more left to explore!

Jaskirat Singh: In Oct last year, Representation Autoencoders provided an elegant solution to unified tokenization for understanding and generation. Today we make them a bit mo...

Image generation papers/research
