Related papers: Occam's LGS: A Simple Approach for Language Gaussian Splatting

Occam's LGS: A Simple Approach for Language Gaussian Splatting

URL: http://arxiv.org/abs/2412.01807v1
Date: Mon, 02 Dec 2024 18:50:37 GMT
Title: Occam's LGS: A Simple Approach for Language Gaussian Splatting
Authors: Jiahuan Cheng, Jan-Nico Zaech, Luc Van Gool, Danda Pani Paudel,
Abstract summary: We show that sophisticated techniques for language-grounded 3D Gaussian Splatting are simply unnecessary.<n>We apply Occam's razor to the task at hand and perform weighted multi-view feature aggregation.<n>Our results offer us state-of-the-art results with a speed-up of two orders of magnitude.
Score: 57.00354758206751
License: http://creativecommons.org/licenses/by/4.0/
Abstract: TL;DR: Gaussian Splatting is a widely adopted approach for 3D scene representation that offers efficient, high-quality 3D reconstruction and rendering. A major reason for the success of 3DGS is its simplicity of representing a scene with a set of Gaussians, which makes it easy to interpret and adapt. To enhance scene understanding beyond the visual representation, approaches have been developed that extend 3D Gaussian Splatting with semantic vision-language features, especially allowing for open-set tasks. In this setting, the language features of 3D Gaussian Splatting are often aggregated from multiple 2D views. Existing works address this aggregation problem using cumbersome techniques that lead to high computational cost and training time. In this work, we show that the sophisticated techniques for language-grounded 3D Gaussian Splatting are simply unnecessary. Instead, we apply Occam's razor to the task at hand and perform weighted multi-view feature aggregation using the weights derived from the standard rendering process, followed by a simple heuristic-based noisy Gaussian filtration. Doing so offers us state-of-the-art results with a speed-up of two orders of magnitude. We showcase our results in two commonly used benchmark datasets: LERF and 3D-OVS. Our simple approach allows us to perform reasoning directly in the language features, without any compression whatsoever. Such modeling in turn offers easy scene manipulation, unlike the existing methods -- which we illustrate using an application of object insertion in the scene. Furthermore, we provide a thorough discussion regarding the significance of our contributions within the context of the current literature. Project Page: https://insait-institute.github.io/OccamLGS/

Related papers

OpenSplat3D: Open-Vocabulary 3D Instance Segmentation using Gaussian Splatting [52.40697058096931]
3D Gaussian Splatting (3DGS) has emerged as a powerful representation for neural scene reconstruction.<n>We introduce an approach for open-vocabulary 3D instance segmentation without requiring manual labeling, termed OpenSplat3D.<n>We show results on LERF-mask and LERF-OVS as well as the full ScanNet++ validation set, demonstrating the effectiveness of our approach.
arXiv Detail & Related papers (2025-06-09T12:37:15Z)
Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs [16.153129392697885]
We introduce a training-free framework that constructs a superpoint graph directly from Gaussian primitives. The superpoint graph partitions the scene into spatially compact and semantically coherent regions, forming view-consistent 3D entities. Our method achieves state-of-the-art open-vocabulary segmentation performance, with semantic field reconstruction completed over $30times$ faster.
arXiv Detail & Related papers (2025-04-17T17:56:07Z)
SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians [77.77265204740037]
3D Gaussian Splatting has recently gained traction for its efficient training and real-time rendering. We introduce SuperGSeg, a novel approach that fosters cohesive, context-aware scene representation. SuperGSeg outperforms prior works on both open-vocabulary object localization and semantic segmentation tasks.
arXiv Detail & Related papers (2024-12-13T16:01:19Z)
SmileSplat: Generalizable Gaussian Splats for Unconstrained Sparse Images [91.28365943547703]
A novel generalizable Gaussian Splatting method, SmileSplat, is proposed to reconstruct pixel-aligned Gaussian surfels for diverse scenarios. The proposed method achieves state-of-the-art performance in various 3D vision tasks.
arXiv Detail & Related papers (2024-11-27T05:52:28Z)
GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views [67.34073368933814]
We propose a generalizable Gaussian Splatting approach for high-resolution image rendering under a sparse-view camera setting. We train our Gaussian parameter regression module on human-only data or human-scene data, jointly with a depth estimation module to lift 2D parameter maps to 3D space. Experiments on several datasets demonstrate that our method outperforms state-of-the-art methods while achieving an exceeding rendering speed.
arXiv Detail & Related papers (2024-11-18T08:18:44Z)
HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction [46.269350101349715]
HiSplat is a novel framework for generalizable 3D Gaussian Splatting. It generates hierarchical 3D Gaussians via a coarse-to-fine strategy. It significantly enhances reconstruction quality and cross-dataset generalization.
arXiv Detail & Related papers (2024-10-08T17:59:32Z)
SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality [50.179377002092416]
We propose an efficient visual localization method capable of high-quality rendering with fewer parameters. Our method achieves superior or comparable rendering and localization performance to state-of-the-art implicit-based visual localization approaches.
arXiv Detail & Related papers (2024-09-21T08:46:16Z)
Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting [27.974762304763694]
We introduce Semantic Gaussians, a novel open-vocabulary scene understanding approach based on 3D Gaussian Splatting. Unlike existing methods, we design a versatile projection approach that maps various 2D semantic features into a novel semantic component of 3D Gaussians. We build a 3D semantic network that directly predicts the semantic component from raw 3D Gaussians for fast inference.
arXiv Detail & Related papers (2024-03-22T21:28:19Z)
Recent Advances in 3D Gaussian Splatting [31.3820273122585]
3D Gaussian Splatting has greatly accelerated rendering speed of novel view synthesis. The explicit representation of 3D Gaussian Splatting facilitates editing tasks like dynamic reconstruction, geometry editing, and physical simulation. We present a literature review of recent 3D Gaussian Splatting methods, which can be roughly classified into 3D reconstruction, 3D editing, and other downstream applications.
arXiv Detail & Related papers (2024-03-17T07:57:08Z)
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering [112.16239342037714]
GES (Generalized Exponential Splatting) is a novel representation that employs Generalized Exponential Function (GEF) to model 3D scenes. With the aid of a frequency-modulated loss, GES achieves competitive performance in novel-view synthesis benchmarks.
arXiv Detail & Related papers (2024-02-15T17:32:50Z)
SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition [66.80822249039235]
3D Gaussian Splatting has emerged as an alternative 3D representation for novel view synthesis. We propose SAGD, a conceptually simple yet effective boundary-enhanced segmentation pipeline for 3D-GS. Our approach achieves high-quality 3D segmentation without rough boundary issues, which can be easily applied to other scene editing tasks.
arXiv Detail & Related papers (2024-01-31T14:19:03Z)
Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields [54.482261428543985]
Methods that use Neural Radiance fields are versatile for traditional tasks such as novel view synthesis. 3D Gaussian splatting has shown state-of-the-art performance on real-time radiance field rendering. We propose architectural and training changes to efficiently avert this problem.
arXiv Detail & Related papers (2023-12-06T00:46:30Z)
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding [2.517953665531978]
We introduce Language Embedded 3D Gaussians, a novel scene representation for open-vocabulary query tasks. Our representation achieves the best visual quality and language querying accuracy across current language-embedded representations.
arXiv Detail & Related papers (2023-11-30T11:50:07Z)
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce textbfGS-SLAM that first utilizes 3D Gaussian representation in the Simultaneous Localization and Mapping system. Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering. Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica, TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.