WaveletGaussian: Wavelet-domain Diffusion for Sparse-view 3D Gaussian Object Reconstruction
- URL: http://arxiv.org/abs/2509.19073v1
- Date: Tue, 23 Sep 2025 14:34:10 GMT
- Title: WaveletGaussian: Wavelet-domain Diffusion for Sparse-view 3D Gaussian Object Reconstruction
- Authors: Hung Nguyen, Runfa Li, An Le, Truong Nguyen,
- Abstract summary: We present WaveletGaussian, a framework for more efficient sparse-view 3D Gaussian object reconstruction.<n>Our key idea is to shift diffusion into the wavelet domain, while high-frequency subbands are refined with a lightweight network.<n> Experiments across two benchmark datasets, Mip-NeRF 360 and Omni3D, show WaveletGaussian achieves competitive rendering quality.
- Score: 5.93524554224854
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: 3D Gaussian Splatting (3DGS) has become a powerful representation for image-based object reconstruction, yet its performance drops sharply in sparse-view settings. Prior works address this limitation by employing diffusion models to repair corrupted renders, subsequently using them as pseudo ground truths for later optimization. While effective, such approaches incur heavy computation from the diffusion fine-tuning and repair steps. We present WaveletGaussian, a framework for more efficient sparse-view 3D Gaussian object reconstruction. Our key idea is to shift diffusion into the wavelet domain: diffusion is applied only to the low-resolution LL subband, while high-frequency subbands are refined with a lightweight network. We further propose an efficient online random masking strategy to curate training pairs for diffusion fine-tuning, replacing the commonly used, but inefficient, leave-one-out strategy. Experiments across two benchmark datasets, Mip-NeRF 360 and OmniObject3D, show WaveletGaussian achieves competitive rendering quality while substantially reducing training time.
Related papers
- iGaussian: Real-Time Camera Pose Estimation via Feed-Forward 3D Gaussian Splatting Inversion [62.09575122593993]
iGaussian is a two-stage feed-forward framework that achieves real-time camera pose estimation through direct 3D Gaussian inversion.<n> Experimental results on the NeRF Synthetic, Mip-NeRF 360, and T&T+DB datasets demonstrate a significant performance improvement over previous methods.
arXiv Detail & Related papers (2025-11-18T05:22:22Z) - From Coarse to Fine: Learnable Discrete Wavelet Transforms for Efficient 3D Gaussian Splatting [5.026688852582894]
AutoOpti3DGS is a training-time framework that automatically restrains Gaussian proliferation without sacrificing visual fidelity.<n>Wavelet-driven, coarse-to-fine process delays the formation of redundant fine Gaussians.
arXiv Detail & Related papers (2025-06-29T00:27:17Z) - RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS [79.15416002879239]
3D Gaussian Splatting has gained significant attention for its real-time, photo-realistic rendering in novel-view synthesis and 3D modeling.<n>Existing methods struggle with accurately modeling scenes affected by transient objects, leading to artifacts in the rendered images.<n>We propose RobustSplat, a robust solution based on two critical designs.
arXiv Detail & Related papers (2025-06-03T11:13:48Z) - 3DGEER: Exact and Efficient Volumetric Rendering with 3D Gaussians [15.776720879897345]
We introduce 3DGEER, an Exact and Efficient Volumetric Gaussian Rendering method.<n>Our method consistently outperforms prior methods, establishing a new state-of-the-art in real-time neural rendering.
arXiv Detail & Related papers (2025-05-29T22:52:51Z) - 3D Wavelet Convolutions with Extended Receptive Fields for Hyperspectral Image Classification [12.168520751389622]
Deep neural networks face numerous challenges in hyperspectral image classification.<n>This paper proposes WCNet, an improved 3D-DenseNet model integrated with wavelet transforms.<n> Experimental results demonstrate superior performance on the IN, UP, and KSC datasets.
arXiv Detail & Related papers (2025-04-15T01:39:42Z) - 3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting [15.124165321341646]
We propose 3D Gaussian Unscented Transform (3DGUT), replacing the EWA splatting splatting with the Unscented Transform.<n>This enables the support of distorted time dependent effects such as rolling shutter, while retaining the efficiency of trivialization.
arXiv Detail & Related papers (2024-12-17T03:21:25Z) - CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs [65.80187860906115]
We propose a novel approach to improve NeRF's performance with sparse inputs.
We first adopt a voxel-based ray sampling strategy to ensure that the sampled rays intersect with a certain voxel in 3D space.
We then randomly sample additional points within the voxel and apply a Transformer to infer the properties of other points on each ray, which are then incorporated into the volume rendering.
arXiv Detail & Related papers (2024-03-25T15:56:17Z) - GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering [112.16239342037714]
GES (Generalized Exponential Splatting) is a novel representation that employs Generalized Exponential Function (GEF) to model 3D scenes.
With the aid of a frequency-modulated loss, GES achieves competitive performance in novel-view synthesis benchmarks.
arXiv Detail & Related papers (2024-02-15T17:32:50Z) - LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS [55.85673901231235]
We introduce LightGaussian, a method for transforming 3D Gaussians into a more compact format.
Inspired by Network Pruning, LightGaussian identifies Gaussians with minimal global significance on scene reconstruction.
LightGaussian achieves an average 15x compression rate while boosting FPS from 144 to 237 within the 3D-GS framework.
arXiv Detail & Related papers (2023-11-28T21:39:20Z) - GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce textbfGS-SLAM that first utilizes 3D Gaussian representation in the Simultaneous Localization and Mapping system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica, TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z) - Frequency Compensated Diffusion Model for Real-scene Dehazing [6.105813272271171]
We consider a dehazing framework based on conditional diffusion models for improved generalization to real haze.
The proposed dehazing diffusion model significantly outperforms state-of-the-art methods on real-world images.
arXiv Detail & Related papers (2023-08-21T06:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.