ADGaussian: Generalizable Gaussian Splatting for Autonomous Driving with Multi-modal Inputs
- URL: http://arxiv.org/abs/2504.00437v1
- Date: Tue, 01 Apr 2025 05:40:23 GMT
- Title: ADGaussian: Generalizable Gaussian Splatting for Autonomous Driving with Multi-modal Inputs
- Authors: Qi Song, Chenghong Li, Haotong Lin, Sida Peng, Rui Huang,
- Abstract summary: We present a novel approach, termed ADGaussian, for generalizable street scene reconstruction. The proposed method enables high-quality rendering from single-view input.
- Score: 32.896888952578806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel approach, termed ADGaussian, for generalizable street scene reconstruction. The proposed method enables high-quality rendering from single-view input. Unlike prior Gaussian Splatting methods that primarily focus on geometry refinement, we emphasize the importance of joint optimization of image and depth features for accurate Gaussian prediction. To this end, we first incorporate sparse LiDAR depth as an additional input modality, formulating the Gaussian prediction process as a joint learning framework of visual information and geometric cues. Furthermore, we propose a multi-modal feature matching strategy coupled with a multi-scale Gaussian decoding model to enhance the joint refinement of multi-modal features, thereby enabling efficient multi-modal Gaussian learning. Extensive experiments on two large-scale autonomous driving datasets, Waymo and KITTI, demonstrate that ADGaussian achieves state-of-the-art performance and exhibits superior zero-shot generalization capabilities in novel-view shifting.
Related papers
- Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images [12.274418254425019]
3D Gaussian Splatting (3DGS) has demonstrated impressive novel view synthesis performance. We propose Gaussian Graph Network (GGN) to generate efficient and generalizable Gaussian representations. We conduct experiments on the large-scale RealEstate10K and ACID datasets to demonstrate the efficiency and generalization of our method.
arXiv Detail & Related papers (2025-03-20T16:56:13Z)
- Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction [6.273357335397336]
We propose Generative Densification, an efficient and generalizable method to densify Gaussians generated by feed-forward models. We show that our method outperforms state-of-the-art approaches with comparable or smaller model sizes.
arXiv Detail & Related papers (2024-12-09T06:20:51Z)
- SmileSplat: Generalizable Gaussian Splats for Unconstrained Sparse Images [91.28365943547703]
A novel generalizable Gaussian Splatting method, SmileSplat, is proposed to reconstruct pixel-aligned Gaussian surfels for diverse scenarios. The proposed method achieves state-of-the-art performance in various 3D vision tasks.
arXiv Detail & Related papers (2024-11-27T05:52:28Z)
- PixelGaussian: Generalizable 3D Gaussian Reconstruction from Arbitrary Views [116.10577967146762]
PixelGaussian is an efficient framework for learning generalizable 3D Gaussian reconstruction from arbitrary views.
Our method achieves state-of-the-art performance with good generalization to various numbers of views.
arXiv Detail & Related papers (2024-10-24T17:59:58Z)
- MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields [73.49548565633123]
Radiance fields represented by 3D Gaussians excel at synthesizing novel views, offering both high training efficiency and fast rendering.
Existing methods often incorporate depth priors from dense estimation networks but overlook the inherent multi-view consistency in input images.
We propose a view synthesis framework based on 3D Gaussian Splatting, named MCGS, enabling scene reconstruction from sparse input views.
arXiv Detail & Related papers (2024-10-15T08:39:05Z)
- Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians [4.733612131945549]
In this study, we explore the challenge of efficiently representing scenes with a constrained number of Gaussians.
We introduce strategies for densification including blur split and depth reinitialization, and simplification through intersection preserving and sampling.
Our Mini-Splatting integrates seamlessly with the original rasterization pipeline, providing a strong baseline for future research in Gaussian-Splatting-based works.
arXiv Detail & Related papers (2024-03-21T06:34:46Z)
- GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis [70.24111297192057]
We present a new approach, termed GPS-Gaussian, for synthesizing novel views of a character in a real-time manner.
The proposed method enables 2K-resolution rendering under a sparse-view camera setting.
arXiv Detail & Related papers (2023-12-04T18:59:55Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- A Generalized EigenGame with Extensions to Multiview Representation Learning [0.28647133890966997]
Generalized Eigenvalue Problems (GEPs) encompass a range of interesting dimensionality reduction methods.
We develop an approach to solving GEPs in which all constraints are softly enforced by Lagrange multipliers.
We show that our approaches share much of the theoretical grounding of the previous Hebbian and game theoretic approaches for the linear case.
We demonstrate the effectiveness of our method for solving GEPs in the setting of canonical multiview datasets.
arXiv Detail & Related papers (2022-11-21T10:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.