Adaptive 3D Gaussian Splatting Video Streaming
- URL: http://arxiv.org/abs/2507.14432v1
- Date: Sat, 19 Jul 2025 01:45:24 GMT
- Title: Adaptive 3D Gaussian Splatting Video Streaming
- Authors: Han Gong, Qiyue Li, Zhi Liu, Hao Zhou, Peng Yuan Zhou, Zhu Li, Jie Li
- Abstract summary: We introduce an innovative framework for 3DGS volumetric video streaming. By employing hybrid saliency tiling and differentiated quality modeling, we achieve efficient data compression and adaptation to bandwidth fluctuations. Our method demonstrated superiority over existing approaches in various aspects, including video quality, compression effectiveness, and transmission rate.
- Score: 28.283254336752602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of 3D Gaussian splatting (3DGS) has significantly enhanced the quality of volumetric video representation. Meanwhile, in contrast to conventional volumetric video, 3DGS video poses significant challenges for streaming due to its substantially larger data volume and the heightened complexity involved in compression and transmission. To address these issues, we introduce an innovative framework for 3DGS volumetric video streaming. Specifically, we design a 3DGS video construction method based on the Gaussian deformation field. By employing hybrid saliency tiling and differentiated quality modeling of 3DGS video, we achieve efficient data compression and adaptation to bandwidth fluctuations while ensuring high transmission quality. Then we build a complete 3DGS video streaming system and validate the transmission performance. Through experimental evaluation, our method demonstrated superiority over existing approaches in various aspects, including video quality, compression effectiveness, and transmission rate.
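The abstract describes allocating quality across saliency-weighted tiles under a fluctuating bandwidth budget. The sketch below is a hypothetical illustration of that general idea, not the paper's actual optimization: the quality levels, bitrates, and the greedy upgrade rule are all assumed for demonstration.

```python
# Hypothetical sketch: saliency-weighted quality selection for tiled
# volumetric video streaming. Bitrates and the greedy rule are assumptions
# for illustration; the paper's actual adaptation scheme is not reproduced.

BITRATES = [2.0, 5.0, 12.0]  # Mbps per tile at quality levels 0..2 (assumed)

def select_qualities(saliency, budget_mbps):
    """Greedily upgrade the most salient tiles first, within the bandwidth budget."""
    n = len(saliency)
    levels = [0] * n                 # start every tile at the lowest quality
    used = BITRATES[0] * n           # baseline cost of sending all tiles
    # Visit tiles from most to least salient and raise their quality
    # as long as the incremental bitrate still fits in the budget.
    for i in sorted(range(n), key=lambda i: -saliency[i]):
        while levels[i] + 1 < len(BITRATES):
            step = BITRATES[levels[i] + 1] - BITRATES[levels[i]]
            if used + step > budget_mbps:
                break
            levels[i] += 1
            used += step
    return levels, used

levels, used = select_qualities([0.7, 0.2, 0.1], budget_mbps=20.0)
print(levels, used)  # -> [2, 1, 0] 19.0
```

With a 20 Mbps budget, the most salient tile reaches the top quality, the second tile gets one upgrade, and the least salient tile stays at the base level; a real system would re-run such a selection as bandwidth estimates change.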
Related papers
- 3DGabSplat: 3D Gabor Splatting for Frequency-adaptive Radiance Field Rendering [50.04967868036964]
3D Gaussian Splatting (3DGS) has enabled real-time rendering while maintaining high-fidelity novel view synthesis. We propose 3D Gabor Splatting (3DGabSplat), which incorporates a novel 3D Gabor-based primitive with multiple directional 3D frequency responses. We achieve a 1.35 dB gain over 3DGS with a simultaneously reduced number of primitives and lower memory consumption.
arXiv Detail & Related papers (2025-08-07T12:49:44Z) - RobustGS: Unified Boosting of Feedforward 3D Gaussian Splatting under Low-Quality Conditions [67.48495052903534]
We propose a general and efficient multi-view feature enhancement module, RobustGS. It substantially improves the robustness of feedforward 3DGS methods under various adverse imaging conditions. The RobustGS module can be seamlessly integrated into existing pretrained pipelines in a plug-and-play manner.
arXiv Detail & Related papers (2025-08-05T04:50:29Z) - Adaptive 3D Gaussian Splatting Video Streaming: Visual Saliency-Aware Tiling and Meta-Learning-Based Bitrate Adaptation [9.779419462403144]
3D Gaussian splatting (3DGS) video streaming has emerged as a research hotspot in both academia and industry. We propose an adaptive 3DGS tiling technique guided by saliency analysis, which integrates both spatial and temporal features. We also introduce a novel quality assessment framework for 3DGS video that jointly evaluates spatial-domain degradation in 3DGS representations during streaming and the quality of the resulting 2D rendered images.
arXiv Detail & Related papers (2025-07-19T03:00:36Z) - EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z) - TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models [69.0220314849478]
TripoSG is a new streamlined shape diffusion paradigm capable of generating high-fidelity 3D meshes with precise correspondence to input images. The resulting 3D shapes exhibit enhanced detail due to high-resolution capabilities and demonstrate exceptional fidelity to input images. To foster progress and innovation in the field of 3D generation, we will make our model publicly available.
arXiv Detail & Related papers (2025-02-10T16:07:54Z) - BVI-CR: A Multi-View Human Dataset for Volumetric Video Compression [14.109939177281069]
BVI-CR contains 18 multi-view RGB-D captures and their corresponding textured polygonal meshes.
Each video sequence contains 10 views at 1080p resolution, with durations of 10-15 seconds at 30 FPS.
Results show the great potential of neural representation based methods in volumetric video compression.
arXiv Detail & Related papers (2024-11-17T23:22:48Z) - SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length [2.4844080708094745]
This paper introduces SwinGS, a novel framework for training, delivering, and rendering volumetric video in a real-time streaming fashion. We implement a prototype of SwinGS and demonstrate its streamability across various datasets and scenes. We also develop an interactive WebGL viewer enabling real-time volumetric video playback on most devices with modern browsers.
arXiv Detail & Related papers (2024-09-12T05:33:15Z) - LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive Streaming [4.209963145038135]
XR requires efficient streaming of 3D online worlds, challenging current 3DGS representations to adapt to bandwidth-constrained environments. This paper proposes LapisGS, a layered 3DGS that supports adaptive streaming and progressive rendering.
arXiv Detail & Related papers (2024-08-27T07:06:49Z) - CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer [51.805505207941934]
We present CogVideoX, a large-scale text-to-video generation model based on a diffusion transformer. It can generate 10-second continuous videos aligned with a text prompt, at a frame rate of 16 fps and a resolution of 768x1360 pixels.
arXiv Detail & Related papers (2024-08-12T11:47:11Z) - Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation [35.52770785430601]
We propose a novel hybrid video diffusion model (HVDM), which can capture intricate dependencies more effectively.
The HVDM is trained with a hybrid video autoencoder that extracts a disentangled representation of the video.
Our hybrid autoencoder provides a more comprehensive video latent, enriching the generated videos with fine structures and details.
arXiv Detail & Related papers (2024-02-21T11:46:16Z) - Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework with a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves superior performance compared with recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.