# TideGS: Scalable Training of Over One Billion 3D Gaussian Splatting Primitives via Out-of-Core Optimization

- Source: HuggingFace Daily Papers (community trending papers)
- Published: 2026-05-19 08:00
- AIHOT score: 67
- AIHOT link: https://aihot.virxact.com/items/cmpdwf32l07jsslk15q6u3u2h
- Original link: https://arxiv.org/abs/2605.20150

## AI Summary

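TideGS addresses the problem that the parameter footprint of 3D Gaussian Splatting training far exceeds GPU memory by proposing an out-of-core training scheme. It exploits the inherent sparsity of the training process, treats GPU memory as a working-set cache, and manages parameters cooperatively across an SSD-CPU-GPU hierarchy. Its key techniques are block-virtualized geometry to improve I/O locality, a hierarchical asynchronous pipeline that overlaps computation with I/O, and trajectory-adaptive differential streaming that transfers only incremental data. Experiments show that TideGS can train more than one billion Gaussians on a single 24 GB GPU and achieves the best quality among the evaluated single-GPU baselines on large-scale scenes, an order-of-magnitude increase in scale over prior methods.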
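To make the memory gap concrete, here is a rough back-of-the-envelope estimate. It is only a sketch: the per-Gaussian layout (59 float32 values, as in the commonly used 3DGS parameterization with degree-3 spherical harmonics) and the four-copy training footprint (parameters, gradients, and two Adam moment buffers) are assumptions, not figures taken from the paper.

```python
# Assumption: standard 3DGS attribute layout with degree-3 spherical harmonics.
# position (3) + scale (3) + rotation quaternion (4) + opacity (1) + SH color (48)
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48
BYTES_PER_FLOAT = 4  # float32
GB = 10**9


def param_bytes(num_gaussians: int, copies: int = 1) -> int:
    """Bytes needed for `copies` full-size float32 tensors over all Gaussians."""
    return num_gaussians * FLOATS_PER_GAUSSIAN * BYTES_PER_FLOAT * copies


# Scales mentioned in the abstract: ~11M (in-memory), ~100M (prior out-of-core), 1B.
for n in (11_000_000, 100_000_000, 1_000_000_000):
    params_gb = param_bytes(n) / GB
    # Assumption: parameters + gradients + two Adam moment buffers = 4 copies.
    training_gb = param_bytes(n, copies=4) / GB
    print(f"{n:>13,} Gaussians: {params_gb:6.1f} GB params, "
          f"{training_gb:6.1f} GB with gradients and optimizer state")
```

Under these assumptions, one billion Gaussians need about 236 GB for the parameters alone, roughly ten times the capacity of a 24 GB GPU before any gradients, optimizer state, or activations are counted.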

## Full Text

Training 3D Gaussian Splatting (3DGS) at billion-primitive scale is fundamentally memory-bound: each Gaussian primitive carries a large attribute vector, and the aggregate parameter table quickly exceeds GPU capacity, limiting prior systems to tens of millions of Gaussians on commodity single-GPU hardware. We observe that 3DGS training is inherently sparse and trajectory-conditioned: each iteration activates only the Gaussians visible from the current camera batch, so GPU memory can serve as a working-set cache rather than a persistent parameter store. Building on this insight, we introduce TideGS, an out-of-core training framework that manages parameters across an SSD-CPU-GPU hierarchy via three synergistic techniques: block-virtualized geometry for SSD-aligned spatial locality, a hierarchical asynchronous pipeline to overlap I/O with computation, and trajectory-adaptive differential streaming that transfers only incremental working-set deltas between iterations. Experiments show that TideGS enables training with over one billion Gaussians on a single 24 GB GPU while achieving the best reconstruction quality among evaluated single-GPU baselines on large-scale scenes, scaling beyond prior out-of-core baselines (e.g., approximately 100M Gaussians) and standard in-memory training (e.g., approximately 11M Gaussians).
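The idea of transferring only working-set deltas between iterations can be illustrated with plain set arithmetic. The sketch below is not the TideGS implementation: the `WorkingSetCache` class, the integer block IDs, and the example trajectory are all hypothetical, and a real system would also write updated parameters back to CPU or SSD on eviction and bound the cache by GPU capacity.

```python
from __future__ import annotations


def working_set_delta(resident: set[int], visible: set[int]) -> tuple[set[int], set[int]]:
    """Return (blocks_to_load, blocks_to_evict) when moving to the next iteration."""
    to_load = visible - resident    # needed now but not yet on the GPU
    to_evict = resident - visible   # on the GPU but no longer visible
    return to_load, to_evict


class WorkingSetCache:
    """Hypothetical GPU-side cache that tracks which blocks are resident."""

    def __init__(self) -> None:
        self.resident: set[int] = set()
        self.blocks_transferred = 0  # host-to-GPU block loads so far

    def step(self, visible: set[int]) -> tuple[set[int], set[int]]:
        to_load, to_evict = working_set_delta(self.resident, visible)
        # A real system would write evicted blocks back before dropping them.
        self.resident -= to_evict
        self.resident |= to_load
        self.blocks_transferred += len(to_load)
        return to_load, to_evict


# Hypothetical camera trajectory: consecutive views share most visible blocks.
trajectory = [{0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}]

cache = WorkingSetCache()
for visible_blocks in trajectory:
    cache.step(visible_blocks)

naive_transfers = sum(len(v) for v in trajectory)  # reload everything each step
print(f"delta streaming: {cache.blocks_transferred} block loads, "
      f"naive reload: {naive_transfers} block loads")
```

In this toy trajectory each view overlaps the previous one by three of four blocks, so delta streaming loads 6 blocks where a full reload per iteration would load 12. The saving depends entirely on how much consecutive camera batches overlap, which is why ordering batches along the camera trajectory matters.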
