Related papers: Implicit Gaussian Splatting with Efficient Multi-Level Tri-Plane Representation

URL: http://arxiv.org/abs/2408.10041v2
Date: Sat, 09 Nov 2024 09:33:54 GMT
Title: Implicit Gaussian Splatting with Efficient Multi-Level Tri-Plane Representation
Authors: Minye Wu, Tinne Tuytelaars
Abstract summary: Implicit Gaussian Splatting (IGS) is an innovative hybrid model that integrates explicit point clouds with implicit feature embeddings. We introduce a level-based progressive training scheme, which incorporates explicit spatial regularization. Our algorithm can deliver high-quality rendering using only a few MBs, effectively balancing storage efficiency and rendering fidelity.
Score: 45.582869951581785
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advancements in photo-realistic novel view synthesis have been significantly driven by Gaussian Splatting (3DGS). Nevertheless, the explicit nature of 3DGS data entails considerable storage requirements, highlighting a pressing need for more efficient data representations. To address this, we present Implicit Gaussian Splatting (IGS), an innovative hybrid model that integrates explicit point clouds with implicit feature embeddings through a multi-level tri-plane architecture. This architecture features 2D feature grids at various resolutions across different levels, facilitating continuous spatial domain representation and enhancing spatial correlations among Gaussian primitives. Building upon this foundation, we introduce a level-based progressive training scheme, which incorporates explicit spatial regularization. This method capitalizes on spatial correlations to enhance both the rendering quality and the compactness of the IGS representation. Furthermore, we propose a novel compression pipeline tailored for both point clouds and 2D feature grids, considering the entropy variations across different levels. Extensive experimental evaluations demonstrate that our algorithm can deliver high-quality rendering using only a few MBs, effectively balancing storage efficiency and rendering fidelity, and yielding results that are competitive with the state-of-the-art.
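As a rough illustration of the multi-level tri-plane idea described in the abstract, the sketch below stores three axis-aligned 2D feature grids (xy, xz, yz) per level at increasing resolutions and bilinearly interpolates them at a 3D point. It is a minimal sketch, not the authors' implementation: the function names, the resolutions, the summation of the three plane features within a level, and the concatenation across levels are all assumptions made for illustration.

```python
import random

def make_plane(res, channels, rng):
    """Hypothetical helper: a res x res grid of random feature vectors."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(channels)]
             for _ in range(res)] for _ in range(res)]

def build_levels(resolutions, channels, seed=0):
    """One tri-plane (xy, xz, yz) per level; resolutions are assumed values."""
    rng = random.Random(seed)
    return [{name: make_plane(res, channels, rng) for name in ("xy", "xz", "yz")}
            for res in resolutions]

def bilinear(plane, u, v):
    """Bilinearly interpolate a feature vector at (u, v) in [0, 1]^2."""
    res = len(plane)
    x = min(max(u, 0.0), 1.0) * (res - 1)
    y = min(max(v, 0.0), 1.0) * (res - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, res - 1), min(y0 + 1, res - 1)
    fx, fy = x - x0, y - y0
    out = []
    for c in range(len(plane[0][0])):
        top = plane[y0][x0][c] * (1 - fx) + plane[y0][x1][c] * fx
        bottom = plane[y1][x0][c] * (1 - fx) + plane[y1][x1][c] * fx
        out.append(top * (1 - fy) + bottom * fy)
    return out

def query_triplanes(levels, point):
    """Project a point in [0, 1]^3 onto each plane and gather features.

    Assumption: plane features are summed within a level and the
    per-level results are concatenated across levels.
    """
    x, y, z = point
    features = []
    for planes in levels:
        f_xy = bilinear(planes["xy"], x, y)
        f_xz = bilinear(planes["xz"], x, z)
        f_yz = bilinear(planes["yz"], y, z)
        features.extend(a + b + c for a, b, c in zip(f_xy, f_xz, f_yz))
    return features

levels = build_levels([4, 8, 16], channels=2)
feat = query_triplanes(levels, (0.25, 0.5, 0.75))
print(len(feat))  # 3 levels x 2 channels = 6
```

In a setup like this, the position of each Gaussian primitive would act as the query point, and the returned feature vector would be decoded into that primitive's attributes by a separate network, which is omitted here.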

Related papers

H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction [39.22287224290769]
H3R is a hybrid framework that integrates latent fusion with attention-based feature aggregation. By integrating both paradigms, our approach enhances generalization while converging 2$\times$ faster than existing methods. Our method supports variable-number and high-resolution input views while demonstrating robust cross-dataset generalization.
arXiv Detail & Related papers (2025-08-05T05:56:30Z)
FHGS: Feature-Homogenized Gaussian Splatting [7.238124816235862]
FHGS is a novel 3D feature fusion framework inspired by physical models. It can achieve high-precision mapping of arbitrary 2D features from pre-trained models to 3D scenes while preserving the real-time rendering efficiency of 3DGS.
arXiv Detail & Related papers (2025-05-25T14:08:49Z)
Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis [22.767866875051013]
We propose GS-Diff, a novel 3DGS framework guided by a multi-view diffusion model to address limitations of current methods. By generating pseudo-observations conditioned on multi-view inputs, our method transforms under-constrained 3D reconstruction problems into well-posed ones. Experiments on four benchmarks demonstrate that GS-Diff consistently outperforms state-of-the-art baselines by significant margins.
arXiv Detail & Related papers (2025-04-02T17:59:46Z)
GP-GS: Gaussian Processes for Enhanced Gaussian Splatting [10.45038376276218]
This paper proposes a novel 3D reconstruction framework that achieves adaptive and uncertainty-guided densification of sparse SfM point clouds. The pipeline utilizes uncertainty estimates to guide the pruning of high-variance predictions. Experiments conducted on synthetic and real-world datasets validate the effectiveness and practicality of the proposed framework.
arXiv Detail & Related papers (2025-02-04T12:50:16Z)
TSGaussian: Semantic and Depth-Guided Target-Specific Gaussian Splatting from Sparse Views [18.050257821756148]
TSGaussian is a novel framework that combines semantic constraints with depth priors to avoid geometry degradation in novel view synthesis tasks. Our approach prioritizes computational resources on designated targets while minimizing background allocation. Extensive experiments demonstrate that TSGaussian outperforms state-of-the-art methods on three standard datasets.
arXiv Detail & Related papers (2024-12-13T11:26:38Z)
G2SDF: Surface Reconstruction from Explicit Gaussians with Implicit SDFs [84.07233691641193]
We introduce G2SDF, a novel approach that integrates a neural implicit Signed Distance Field into the Gaussian Splatting framework. G2SDF achieves higher quality than prior works while maintaining the efficiency of 3DGS.
arXiv Detail & Related papers (2024-11-25T20:07:07Z)
Geometric Algebra Planes: Convex Implicit Neural Volumes [70.12234371845445]
We show that GA-Planes is equivalent to a sparse low-rank factor plus low-resolution matrix. We also show that GA-Planes can be adapted for many existing representations.
arXiv Detail & Related papers (2024-11-20T18:21:58Z)
DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes [71.61083731844282]
We present DeSiRe-GS, a self-supervised Gaussian splatting representation. It enables effective static-dynamic decomposition and high-fidelity surface reconstruction in complex driving scenarios.
arXiv Detail & Related papers (2024-11-18T05:49:16Z)
PEP-GS: Perceptually-Enhanced Precise Structured 3D Gaussians for View-Adaptive Rendering [3.285531771049763]
Recent advances in structured 3D Gaussians for view-adaptive rendering have demonstrated promising results in neural scene representation. We present PEP-GS, a novel framework that enhances structured 3D Gaussians through three key innovations. Our comprehensive evaluation across multiple datasets indicates that, compared to the current state-of-the-art methods, these improvements are particularly evident in challenging scenarios.
arXiv Detail & Related papers (2024-11-08T17:42:02Z)
PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices. Our framework capitalizes on fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z)
SuperGS: Super-Resolution 3D Gaussian Splatting via Latent Feature Field and Gradient-guided Splitting [3.5757604402398697]
Super-Resolution 3DGS (SuperGS) is an expansion of 3DGS designed with a two-stage coarse-to-fine training framework. SuperGS surpasses state-of-the-art HRNVS methods on challenging real-world datasets using only low-resolution inputs.
arXiv Detail & Related papers (2024-10-03T15:18:28Z)
GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
We propose a two-stage procedure that integrates dense and robust keypoint descriptors from the lightweight XFeat feature extractor into 3DGS. In the second stage, the initial pose estimate is refined by minimizing the rendering-based photometric warp loss. Benchmarking on widely used indoor and outdoor datasets demonstrates improvements over recent neural rendering-based localization methods.
arXiv Detail & Related papers (2024-09-24T23:18:32Z)
Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation [36.93661496405653]
We take a global approach to exploit spatio-temporal information with a concise Graph and Skipped Transformer architecture. Specifically, in the 3D pose stage, coarse-grained body parts are deployed to construct a fully data-driven adaptive model. Experiments are conducted on the Human3.6M, MPI-INF-3DHP and HumanEva benchmarks.
arXiv Detail & Related papers (2024-07-03T10:42:09Z)
S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR). Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection. In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
Learning transformer-based heterogeneously salient graph representation for multimodal remote sensing image classification [42.15709954199397]
A transformer-based heterogeneously salient graph representation (THSGR) approach is proposed in this paper. First, a multimodal heterogeneous graph encoder is presented to encode distinctively non-Euclidean structural features from heterogeneous data. A self-attention-free multi-convolutional modulator is designed for effective and efficient long-term dependency modeling.
arXiv Detail & Related papers (2023-11-17T04:06:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.