# Surflo: A Consistent 3D Surface Flow Model with Global State

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-06-11 08:00
- AIHOT score: 59
- AIHOT link: https://aihot.virxact.com/items/cmqac9xug0k4yslld5s48ug28
- Original link: https://arxiv.org/abs/2606.13644

## AI Summary

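Surflo compresses a variable number of unposed RGB views into K latent tokens (a global state) and decodes oriented 3D surface points by independently transporting noise points onto the surface via flow matching. The output is not bound to a fixed grid or token budget: the same latent state can yield from a few thousand to a million points in a single forward pass. At inference time, a photometric gradient is injected during ODE integration, correlating nearby points to suppress local inconsistencies. Surflo matches or surpasses feed-forward baselines on surface metrics, runs an order of magnitude faster than optimization-based methods that require hundreds of views, and is the only feed-forward method that combines a global latent with arbitrary-resolution decoding.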

## Main Text

Geometry is invariant to viewpoint, which makes any collection of images a redundant encoding of a single 3D state. Existing feed-forward reconstruction models fail to exploit this: per-view methods emit overlapping, unaligned pointmaps that grow linearly with input count, while global-latent methods commit to a fixed, low-resolution output. We introduce Surflo, which compresses a variable number of unposed RGB views into K latent tokens (one global state) and decodes oriented 3D surface points by independently transporting them from noise onto the surface via flow matching. This frees the output from any fixed grid or token budget: the same latent yields from a few thousand to a million points in a single forward pass. To suppress the local inconsistencies inherent to independent per-point decoding, an inference-time guidance term correlates nearby points by injecting a photometric gradient during ODE integration. Surflo matches or surpasses feed-forward baselines on surface metrics, runs an order of magnitude faster than optimization-based methods that require hundreds of views, and is the only feed-forward approach to combine a global latent with arbitrary-resolution decoding.
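
The decoding idea in the abstract (independent per-point transport from noise to the surface by integrating an ODE, with an optional guidance gradient added at each step) can be illustrated with a toy sketch. Nothing below is taken from the paper's implementation: `toy_velocity` is a hypothetical closed-form stand-in for the learned velocity field, the "latent" is just a sphere's center and radius rather than K learned tokens, and `guidance_grad` is a placeholder hook, not the photometric term itself.

```python
import numpy as np

def toy_velocity(x, t, latent):
    """Hypothetical stand-in for a learned field v(x, t | latent).

    Assumption: the surface is a sphere, so the transport target of a
    point is its radial projection onto that sphere.
    """
    center, radius = latent
    d = x - center
    target = center + radius * d / np.linalg.norm(d, axis=1, keepdims=True)
    # Straight-line flow: remaining displacement divided by remaining time.
    return (target - x) / (1.0 - t)

def integrate(x0, latent, n_steps=16, guidance_grad=None, guidance_scale=0.0):
    """Euler-integrate each row of x0 from t=0 to t=1, independently."""
    x = np.array(x0, dtype=np.float64)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # largest t is (n_steps - 1) / n_steps, so 1 - t > 0
        v = toy_velocity(x, t, latent)
        if guidance_grad is not None:
            # Placeholder guidance: descend a user-supplied energy gradient.
            v = v - guidance_scale * guidance_grad(x)
        x = x + dt * v
    return x

def decode_points(latent, n_points, seed=0, **kwargs):
    """Sample noise, transport it, and return points with unit normals."""
    rng = np.random.default_rng(seed)
    points = integrate(rng.standard_normal((n_points, 3)), latent, **kwargs)
    center, _ = latent
    d = points - center
    normals = d / np.linalg.norm(d, axis=1, keepdims=True)
    return points, normals

if __name__ == "__main__":
    latent = (np.zeros(3), 2.0)  # toy "global state": center and radius
    for n in (1_000, 100_000):   # same latent, different output resolution
        pts, nrm = decode_points(latent, n)
        print(n, pts.shape, nrm.shape)
```

Because each point is integrated on its own, the point count is a free parameter of the call rather than a property of the latent, which is the sense in which the output is not tied to a grid or token budget. The same independence is why a guidance term that couples nearby points is useful: in this sketch nothing else relates one point's trajectory to another's.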
