Related papers: GS-Scale: Unlocking Large-Scale 3D Gaussian Splatting Training via Host Offloading

URL: http://arxiv.org/abs/2509.15645v1
Date: Fri, 19 Sep 2025 06:13:28 GMT
Title: GS-Scale: Unlocking Large-Scale 3D Gaussian Splatting Training via Host Offloading
Authors: Donghyun Lee, Dawoon Jeong, Jae W. Lee, Hongil Yoon
Abstract summary: 3D Gaussian Splatting has revolutionized graphics rendering by delivering high visual quality and fast rendering speeds. Training large-scale scenes at high quality remains challenging due to substantial memory demands. We propose GS-Scale, a fast and memory-efficient training system for 3D Gaussian Splatting.
Score: 9.776813771006358
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The advent of 3D Gaussian Splatting has revolutionized graphics rendering by delivering high visual quality and fast rendering speeds. However, training large-scale scenes at high quality remains challenging due to the substantial memory demands required to store parameters, gradients, and optimizer states, which can quickly overwhelm GPU memory. To address these limitations, we propose GS-Scale, a fast and memory-efficient training system for 3D Gaussian Splatting. GS-Scale stores all Gaussians in host memory, transferring only a subset to the GPU on demand for each forward and backward pass. While this dramatically reduces GPU memory usage, it requires frustum culling and optimizer updates to be executed on the CPU, introducing slowdowns due to the CPU's limited compute and memory bandwidth. To mitigate this, GS-Scale employs three system-level optimizations: (1) selective offloading of geometric parameters for fast frustum culling, (2) parameter forwarding to pipeline CPU optimizer updates with GPU computation, and (3) deferred optimizer update to minimize unnecessary memory accesses for Gaussians with zero gradients. Our extensive evaluations on large-scale datasets demonstrate that GS-Scale significantly lowers GPU memory demands by 3.3-5.6x, while achieving training speeds comparable to GPU-only training without host offloading. This enables large-scale 3D Gaussian Splatting training on consumer-grade GPUs; for instance, GS-Scale can scale the number of Gaussians from 4 million to 18 million on an RTX 4070 Mobile GPU, leading to 23-35% LPIPS (learned perceptual image patch similarity) improvement.
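
The abstract describes the deferred optimizer update only at a high level. As a rough illustration of the general idea, and not GS-Scale's actual implementation, the sketch below assumes an Adam optimizer over scalar parameters and defers the zero-gradient steps of untouched entries until the entry is next needed, at which point the skipped steps are replayed. The class name `LazyAdam`, its methods, and the replay strategy are all hypothetical choices made for this example.

```python
import math


class LazyAdam:
    """Hypothetical sketch: Adam that defers zero-gradient steps per entry.

    Entries that receive no gradient in a step are not touched at all.
    Their pending zero-gradient steps are replayed later, when the entry
    is fetched, receives a gradient, or when flush() is called.
    """

    def __init__(self, params, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
        self.p = list(params)              # parameter values (host memory)
        self.m = [0.0] * len(self.p)       # first-moment estimates
        self.v = [0.0] * len(self.p)       # second-moment estimates
        self.last = [0] * len(self.p)      # last step applied to each entry
        self.t = 0                         # global step counter
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps

    def _apply(self, i, t, g):
        """Apply one standard Adam step (step index t, gradient g) to entry i."""
        self.m[i] = self.b1 * self.m[i] + (1 - self.b1) * g
        self.v[i] = self.b2 * self.v[i] + (1 - self.b2) * g * g
        m_hat = self.m[i] / (1 - self.b1 ** t)
        v_hat = self.v[i] / (1 - self.b2 ** t)
        self.p[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)

    def _catch_up(self, i, upto):
        """Replay the deferred zero-gradient steps of entry i through step `upto`."""
        for t in range(self.last[i] + 1, upto + 1):
            self._apply(i, t, 0.0)
        self.last[i] = max(self.last[i], upto)

    def fetch(self, indices):
        """Return up-to-date values for the entries a forward pass needs."""
        for i in indices:
            self._catch_up(i, self.t)
        return [self.p[i] for i in indices]

    def step(self, sparse_grads):
        """Advance one step; `sparse_grads` maps entry index -> non-zero gradient."""
        self.t += 1
        for i, g in sparse_grads.items():
            self._catch_up(i, self.t - 1)  # settle older deferred steps first
            self._apply(i, self.t, g)
            self.last[i] = self.t

    def flush(self):
        """Bring every entry up to the current step (e.g. before saving)."""
        for i in range(len(self.p)):
            self._catch_up(i, self.t)
```

In this sketch, a training loop would call `fetch` on the indices that survive frustum culling, run the forward and backward pass, then call `step` with only the non-zero gradients. After `flush`, the parameters match what a dense Adam loop would produce, while each step only reads and writes the entries that were actually visible.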

Related papers

CLM: Removing the GPU Memory Barrier for 3D Gaussian Splatting [34.933663925174635]
CLM is a system that allows 3DGS to render large scenes using a single consumer-grade GPU. It does so by offloading Gaussians to CPU memory, and loading them into GPU memory only when necessary. To reduce performance and communication overheads, CLM uses a novel offloading strategy.
arXiv Detail & Related papers (2025-11-07T03:30:28Z)
ContraGS: Codebook-Condensed and Trainable Gaussian Splatting for Fast, Memory-Efficient Reconstruction [9.155819295255212]
3D Gaussian Splatting (3DGS) is a technique to model real-world scenes with high quality and real-time rendering. We introduce ContraGS, a method to enable training directly on compressed 3DGS representations without reducing the Gaussian count. We show that ContraGS significantly reduces the peak memory during training (on average 3.49X) and accelerates training and rendering (1.36X and 1.88X on average, respectively) while retaining close to state-of-the-art quality.
arXiv Detail & Related papers (2025-09-03T23:40:17Z)
Speedy Deformable 3D Gaussian Splatting: Fast Rendering and Compression of Dynamic Scenes [57.69608119350651]
Recent extensions of 3D Gaussian Splatting (3DGS) to dynamic scenes achieve high-quality novel view synthesis by using neural networks to predict the time-varying deformation of each Gaussian. However, performing per-Gaussian neural inference at every frame poses a significant bottleneck, limiting rendering speed and increasing memory and compute requirements. We present Speedy Deformable 3D Gaussian Splatting (SpeeDe3DGS), a general pipeline for accelerating the rendering speed of dynamic 3DGS and 4DGS representations by reducing neural inference through two complementary techniques.
arXiv Detail & Related papers (2025-06-09T16:30:48Z)
FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting [57.97160965244424]
3D Gaussian splatting (3DGS) has enabled various applications in 3D scene representation and novel view synthesis. Previous approaches have focused on pruning less important Gaussians, effectively compressing 3DGS. We present an elastic inference method for 3DGS, achieving substantial rendering performance without additional fine-tuning.
arXiv Detail & Related papers (2025-06-04T17:17:57Z)
GSta: Efficient Training Scheme with Siestaed Gaussians for Monocular 3D Scene Reconstruction [4.865050337780373]
Gaussian Splatting (GS) is a popular approach for 3D reconstruction. It suffers from large storage and memory requirements. We propose GSta, which identifies Gaussians that have converged well during training.
arXiv Detail & Related papers (2025-04-09T09:17:56Z)
APOLLO: SGD-like Memory, AdamW-level Performance [61.53444035835778]
Large language models (LLMs) are notoriously memory-intensive during training. Various memory-efficient optimizers have been proposed to reduce memory usage. They face critical challenges: (i) costly SVD operations; (ii) significant performance trade-offs compared to AdamW; and (iii) still substantial memory overhead to maintain competitive performance.
arXiv Detail & Related papers (2024-12-06T18:55:34Z)
Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives [60.217580865237835]
3D Gaussian Splatting (3D-GS) is a recent 3D scene reconstruction technique that enables real-time rendering of novel views by modeling scenes as parametric point clouds of differentiable 3D Gaussians. We identify and address two key inefficiencies in 3D-GS to substantially improve rendering speed. Our Speedy-Splat approach combines these techniques to accelerate average rendering speed by a drastic $6.71\times$ across scenes from the Mip-NeRF 360, Tanks & Temples, and Deep Blending datasets.
arXiv Detail & Related papers (2024-11-30T20:25:56Z)
On Scaling Up 3D Gaussian Splatting Training [25.143831267916422]
3DGS is increasingly popular for 3D reconstruction due to its superior visual quality and rendering speed. Currently, 3DGS training occurs on a single GPU, limiting its ability to handle high-resolution and large-scale 3D reconstruction tasks. We introduce Grendel, a distributed system designed to partition 3DGS parameters and parallelize computation across multiple GPUs.
arXiv Detail & Related papers (2024-06-26T17:59:28Z)
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering [112.16239342037714]
GES (Generalized Exponential Splatting) is a novel representation that employs the Generalized Exponential Function (GEF) to model 3D scenes. With the aid of a frequency-modulated loss, GES achieves competitive performance in novel-view synthesis benchmarks.
arXiv Detail & Related papers (2024-02-15T17:32:50Z)
EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS [40.94643885302646]
3D Gaussian splatting (3D-GS) has gained popularity in novel-view scene synthesis. It addresses the challenges of lengthy training times and slow rendering speeds associated with Neural Radiance Fields (NeRFs). We present a technique utilizing quantized embeddings to significantly reduce per-point memory storage requirements.
arXiv Detail & Related papers (2023-12-07T18:59:55Z)
Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous Multi-GPU Servers [65.60007071024629]
We show experimentally that Adaptive SGD outperforms four state-of-the-art solutions in time-to-accuracy.
arXiv Detail & Related papers (2021-10-13T20:58:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.