Adaptive 3D Gaussian Splatting Video Streaming: Visual Saliency-Aware Tiling and Meta-Learning-Based Bitrate Adaptation
- URL: http://arxiv.org/abs/2507.14454v1
- Date: Sat, 19 Jul 2025 03:00:36 GMT
- Title: Adaptive 3D Gaussian Splatting Video Streaming: Visual Saliency-Aware Tiling and Meta-Learning-Based Bitrate Adaptation
- Authors: Han Gong, Qiyue Li, Jie Li, Zhi Liu
- Abstract summary: 3D Gaussian splatting (3DGS) video streaming has emerged as a research hotspot in both academia and industry. We propose an adaptive 3DGS tiling technique guided by saliency analysis, which integrates both spatial and temporal features. We also introduce a novel quality assessment framework for 3DGS video that jointly evaluates spatial-domain degradation in 3DGS representations during streaming and the quality of the resulting 2D rendered images.
- Score: 9.779419462403144
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D Gaussian splatting (3DGS) video streaming has recently emerged as a research hotspot in both academia and industry, owing to its impressive ability to deliver immersive 3D video experiences. However, research in this area is still in its early stages, and several fundamental challenges, such as tiling, quality assessment, and bitrate adaptation, require further investigation. In this paper, we tackle these challenges by proposing a comprehensive set of solutions. Specifically, we propose an adaptive 3DGS tiling technique guided by saliency analysis, which integrates both spatial and temporal features. Each tile is encoded into versions possessing dedicated deformation fields and multiple quality levels for adaptive selection. We also introduce a novel quality assessment framework for 3DGS video that jointly evaluates spatial-domain degradation in 3DGS representations during streaming and the quality of the resulting 2D rendered images. Additionally, we develop a meta-learning-based adaptive bitrate algorithm specifically tailored for 3DGS video streaming, achieving optimal performance across varying network conditions. Extensive experiments demonstrate that our proposed approaches significantly outperform state-of-the-art methods.
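The abstract describes these components at a high level only. As a rough illustration of the decision the bitrate-adaptation stage has to make, the sketch below greedily assigns per-tile quality levels under a bandwidth budget, always upgrading the tile with the best saliency gain per extra megabit. The `Tile` structure, the saliency-per-megabit utility, and the greedy policy are illustrative assumptions for this sketch, not the paper's meta-learning-based algorithm.

```python
# Hypothetical sketch: saliency-weighted quality selection for 3DGS tiles.
# The tile structure, utility model, and greedy policy are assumptions made
# for illustration; the paper's actual ABR algorithm is meta-learning-based.
from dataclasses import dataclass

@dataclass
class Tile:
    name: str
    saliency: float        # normalized spatio-temporal saliency in [0, 1]
    level_bitrates: list   # bitrate (Mbps) per encoded quality level, ascending

def allocate_quality(tiles, budget_mbps):
    """Start every tile at its lowest level, then spend the remaining budget
    on the upgrade with the best saliency gain per extra megabit."""
    choice = {t.name: 0 for t in tiles}
    spent = sum(t.level_bitrates[0] for t in tiles)
    while True:
        best, best_gain, best_extra = None, 0.0, 0.0
        for t in tiles:
            lvl = choice[t.name]
            if lvl + 1 >= len(t.level_bitrates):
                continue  # tile already at its highest quality level
            extra = t.level_bitrates[lvl + 1] - t.level_bitrates[lvl]
            if spent + extra > budget_mbps:
                continue  # upgrade would exceed the bandwidth budget
            gain = t.saliency / extra  # assumed utility: saliency per megabit
            if gain > best_gain:
                best, best_gain, best_extra = t, gain, extra
        if best is None:
            return choice
        choice[best.name] += 1
        spent += best_extra

# A salient foreground tile is upgraded before a low-saliency background tile.
tiles = [Tile("fg", 0.9, [2.0, 4.0, 8.0]), Tile("bg", 0.2, [1.0, 2.0, 4.0])]
print(allocate_quality(tiles, budget_mbps=9.0))  # {'fg': 2, 'bg': 0}
```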
Related papers
- Adaptive 3D Gaussian Splatting Video Streaming [28.283254336752602]
We introduce an innovative framework for 3DGS volumetric video streaming. By employing hybrid saliency tiling and differentiated quality modeling, we achieve efficient data compression and adaptation to bandwidth fluctuations. Our method demonstrates superiority over existing approaches in various aspects, including video quality, compression effectiveness, and transmission rate.
arXiv Detail & Related papers (2025-07-19T01:45:24Z)
- Learning from Streaming Video with Orthogonal Gradients [62.51504086522027]
We address the challenge of representation learning from a continuous stream of video as input, in a self-supervised manner. This differs from standard approaches to video learning, where videos are chopped and shuffled during training to create a non-redundant batch. We demonstrate the drop in performance when moving from shuffled to sequential learning on three tasks.
arXiv Detail & Related papers (2025-04-02T17:59:57Z)
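The mechanism behind the title above is not spelled out in this summary. One common way to realize "orthogonal gradients" is to project each new gradient onto the component orthogonal to the previous one, so that highly correlated consecutive frames do not keep reinforcing the same update direction. The sketch below is a minimal, assumed illustration of that idea, not necessarily the paper's exact procedure.

```python
# Assumed illustration: strip from the current gradient its component along
# the previous gradient, decorrelating updates computed on consecutive,
# highly redundant video frames. Not necessarily the paper's exact algorithm.
import torch

def orthogonalize(grad: torch.Tensor, prev: torch.Tensor, eps: float = 1e-12):
    """Project `grad` onto the subspace orthogonal to `prev`."""
    coef = torch.dot(grad.flatten(), prev.flatten()) / (prev.norm() ** 2 + eps)
    return grad - coef * prev

g_prev = torch.tensor([1.0, 0.0])
g_curr = torch.tensor([1.0, 1.0])
print(orthogonalize(g_curr, g_prev))  # tensor([0., 1.]): shared direction removed
```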
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF- and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
- HuGDiffusion: Generalizable Single-Image Human Rendering via 3D Gaussian Diffusion [50.02316409061741]
HuGDiffusion is a learning pipeline for novel view synthesis (NVS) of human characters from single-view input images. We aim to generate the set of 3DGS attributes via a diffusion-based framework conditioned on human priors extracted from a single image. Our HuGDiffusion shows significant performance improvements over state-of-the-art methods.
arXiv Detail & Related papers (2025-01-25T01:00:33Z)
- MVS-GS: High-Quality 3D Gaussian Splatting Mapping via Online Multi-View Stereo [9.740087094317735]
We propose a novel framework for high-quality 3DGS modeling using an online multi-view stereo approach. Our method estimates MVS depth using sequential frames from a local time window and applies comprehensive depth refinement techniques. Experimental results demonstrate that our method outperforms state-of-the-art dense SLAM methods.
arXiv Detail & Related papers (2024-12-26T09:20:04Z)
- SGSST: Scaling Gaussian Splatting Style Transfer [10.994929376259007]
This work introduces SGSST: Scaling Gaussian Splatting Style Transfer, an optimization-based method for applying style transfer to pretrained 3DGS scenes. We demonstrate that a new multiscale loss based on global neural statistics, which we name SOS for Simultaneously Optimized Scales, enables style transfer to ultra-high-resolution 3D scenes.
arXiv Detail & Related papers (2024-12-04T14:59:05Z)
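SOS is described above only as a multiscale loss on global neural statistics. A generic loss of that shape evaluates a Gram-matrix style discrepancy at several image resolutions and sums the terms; in the sketch below, the feature extractor, the scale set, and the uniform weighting are all assumptions rather than SGSST's actual formulation.

```python
# Assumed sketch of a multiscale style loss: the same global-statistics
# (Gram matrix) discrepancy evaluated at several resolutions and summed.
import torch
import torch.nn.functional as F

def gram(feat):                        # feat: (B, C, H, W)
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def multiscale_style_loss(render, style, extract, scales=(1.0, 0.5, 0.25)):
    """Sum Gram-matrix distances between rendered and style features across
    downsampled versions of both images (uniform weights assumed)."""
    loss = 0.0
    for s in scales:
        r = render if s == 1.0 else F.interpolate(render, scale_factor=s, mode="bilinear")
        t = style if s == 1.0 else F.interpolate(style, scale_factor=s, mode="bilinear")
        loss = loss + F.mse_loss(gram(extract(r)), gram(extract(t)))
    return loss

extract = torch.nn.Conv2d(3, 8, 3, padding=1)  # stand-in for a pretrained VGG slice
a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
print(multiscale_style_loss(a, b, extract).item())
```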
- SplatFormer: Point Transformer for Robust 3D Gaussian Splatting [18.911307036504827]
3D Gaussian Splatting (3DGS) has recently transformed photorealistic reconstruction, achieving high visual fidelity and real-time performance. However, rendering quality deteriorates significantly when test views deviate from the camera angles used during training, posing a major challenge for immersive free-viewpoint rendering and navigation. We introduce SplatFormer, the first point transformer model designed specifically to operate on Gaussian splats. Our model significantly improves rendering quality under extreme novel views, achieving state-of-the-art performance in these challenging scenarios and outperforming various 3DGS regularization techniques, multi-scene models tailored for sparse view synthesis, and diffusion-based approaches.
arXiv Detail & Related papers (2024-11-10T08:23:27Z)
- LP-3DGS: Learning to Prune 3D Gaussian Splatting [71.97762528812187]
We propose learning-to-prune 3DGS (LP-3DGS), in which a trainable binary mask applied to the importance score finds the optimal pruning ratio automatically.
Experiments show that LP-3DGS consistently achieves a good balance between efficiency and rendering quality.
arXiv Detail & Related papers (2024-05-29T05:58:34Z)
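A binary mask is not differentiable by itself; the standard trick for training one is a straight-through estimator (STE), which uses the hard mask in the forward pass and the soft probabilities in the backward pass. The sketch below applies that generic trick to per-Gaussian importance scores; the exact parameterization LP-3DGS uses may differ.

```python
# Assumed sketch: a trainable binary mask over per-Gaussian importance scores,
# made differentiable with a straight-through estimator (STE). The exact
# parameterization in LP-3DGS may differ.
import torch

class PruneMask(torch.nn.Module):
    def __init__(self, num_gaussians):
        super().__init__()
        # Initialized toward keeping Gaussians; training pushes logits down
        # for Gaussians whose contribution does not justify their cost.
        self.logits = torch.nn.Parameter(torch.full((num_gaussians,), 3.0))

    def forward(self, importance):
        p = torch.sigmoid(self.logits) * importance  # soft keep-probability
        hard = (p > 0.5).float()                     # binary mask, forward pass
        return hard + p - p.detach()                 # STE: soft gradient, backward

mask = PruneMask(num_gaussians=100_000)
importance = torch.rand(100_000)   # e.g., accumulated rendering contribution
keep = mask(importance)            # multiply per-Gaussian opacity by `keep`
print(f"kept {int(keep.sum())} of {keep.numel()} Gaussians")
```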
- Improving 3D Object Detection with Channel-wise Transformer [58.668922561622466]
We propose a two-stage 3D object detection framework (CT3D) with minimal hand-crafted design.
CT3D simultaneously performs proposal-aware embedding and channel-wise context aggregation.
It achieves an AP of 81.77% on the moderate car category of the KITTI test 3D detection benchmark.
arXiv Detail & Related papers (2021-08-23T02:03:40Z)
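Read literally, channel-wise context aggregation gives every feature channel its own normalized weights over the points in a proposal, instead of a single attention weight per point. The sketch below shows that minimal form; CT3D's actual decoder is more elaborate, and the details here are assumptions.

```python
# Assumed minimal form of channel-wise aggregation: each channel gets its own
# softmax weights over the proposal's points. CT3D's decoder is more elaborate.
import torch

def channelwise_aggregate(point_feats):        # point_feats: (N_points, C)
    w = torch.softmax(point_feats, dim=0)      # per-channel weights over points
    return (w * point_feats).sum(dim=0)        # (C,) aggregated proposal feature

feats = torch.rand(128, 256)                   # 128 points, 256 channels
print(channelwise_aggregate(feats).shape)      # torch.Size([256])
```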
- 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition [84.697097472401]
We introduce Ada3D, a conditional computation framework that learns instance-specific 3D usage policies to determine which frames and which convolution layers to use in a 3D network.
We demonstrate that our method achieves similar accuracies to state-of-the-art 3D models while requiring 20%-50% less computation across different datasets.
arXiv Detail & Related papers (2020-12-29T21:40:38Z)
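An instance-specific usage policy of the kind described above can be realized as a small head that maps a cheap clip summary to binary gates over input frames and 3D-convolution layers. The sketch below shows inference-time gating only; the summary features, head architecture, and gate training scheme (e.g., a Gumbel-softmax relaxation) are assumptions rather than Ada3D's published design.

```python
# Assumed sketch in the spirit of Ada3D: a tiny policy head emits binary gates
# deciding which frames to keep and which 3D-conv layers to run per instance.
# Gate training (e.g., Gumbel-softmax) is omitted; this shows inference only.
import torch

class UsagePolicy(torch.nn.Module):
    def __init__(self, feat_dim, num_frames, num_layers):
        super().__init__()
        self.head = torch.nn.Linear(feat_dim, num_frames + num_layers)
        self.num_frames = num_frames

    def forward(self, clip_summary):                   # (B, feat_dim)
        logits = self.head(clip_summary)
        gates = (torch.sigmoid(logits) > 0.5).float()  # hard gates at inference
        return gates[:, :self.num_frames], gates[:, self.num_frames:]

policy = UsagePolicy(feat_dim=64, num_frames=16, num_layers=8)
frame_gates, layer_gates = policy(torch.rand(2, 64))
print(frame_gates.shape, layer_gates.shape)  # (2, 16) frames; (2, 8) conv layers
```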