Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting
- URL: http://arxiv.org/abs/2506.03538v1
- Date: Wed, 04 Jun 2025 03:40:33 GMT
- Title: Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting
- Authors: Chengqi Li, Zhihao Shi, Yangdi Lu, Wenbo He, Xiangyu Xu
- Abstract summary: 3D reconstruction from in-the-wild images remains a challenging task due to inconsistent lighting conditions and transient distractors. Existing methods typically rely on heuristic strategies to handle the low-quality training data, which often struggle to produce stable and consistent reconstructions. We propose Asymmetric Dual 3DGS, a novel framework that leverages the stochastic nature of these artifacts.
- Score: 30.566082713819117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D reconstruction from in-the-wild images remains a challenging task due to inconsistent lighting conditions and transient distractors. Existing methods typically rely on heuristic strategies to handle the low-quality training data, which often struggle to produce stable and consistent reconstructions, frequently resulting in visual artifacts. In this work, we propose Asymmetric Dual 3DGS, a novel framework that leverages the stochastic nature of these artifacts: they tend to vary across different training runs due to minor randomness. Specifically, our method trains two 3D Gaussian Splatting (3DGS) models in parallel, enforcing a consistency constraint that encourages convergence on reliable scene geometry while suppressing inconsistent artifacts. To prevent the two models from collapsing into similar failure modes due to confirmation bias, we introduce a divergent masking strategy that applies two complementary masks: a multi-cue adaptive mask and a self-supervised soft mask, which leads to an asymmetric training process of the two models, reducing shared error modes. In addition, to improve the efficiency of model training, we introduce a lightweight variant called Dynamic EMA Proxy, which replaces one of the two models with a dynamically updated Exponential Moving Average (EMA) proxy, and employs an alternating masking strategy to preserve divergence. Extensive experiments on challenging real-world datasets demonstrate that our method consistently outperforms existing approaches while achieving high efficiency. Codes and trained models will be released.
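As a rough illustration only (this is not the authors' released code, and all function and variable names here are hypothetical), the two core ideas in the abstract — a mutual consistency loss computed under two complementary masks, and the Dynamic EMA Proxy update of the lightweight variant — can be sketched as:

```python
import numpy as np

def masked_consistency_loss(render_a, render_b, mask_a, mask_b):
    # Mutual consistency between the two renders; each model is supervised
    # only at pixels its own (complementary) mask keeps, so the two models
    # receive different error signals and are less likely to share failure modes.
    diff = (render_a - render_b) ** 2
    return float((mask_a * diff).mean() + (mask_b * diff).mean())

def ema_proxy_update(proxy_params, model_params, decay=0.99):
    # Dynamic EMA Proxy variant: the second model is replaced by an
    # exponential moving average of the first, updated after each step.
    return decay * proxy_params + (1.0 - decay) * model_params

# Toy usage on per-pixel color maps (stand-ins for full 3DGS renders).
h, w = 4, 4
render_a = np.full((h, w), 0.5)
render_b = np.full((h, w), 0.7)
mask_a = np.zeros((h, w))
mask_a[:, : w // 2] = 1.0   # stand-in for the multi-cue adaptive mask
mask_b = 1.0 - mask_a       # stand-in for the self-supervised soft mask
loss = masked_consistency_loss(render_a, render_b, mask_a, mask_b)
```

In the paper the two sides are full 3DGS models rendered per training view; here scalar image arrays stand in so the loss and EMA arithmetic can be checked in isolation.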
Related papers
- ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models [27.324967736816337]
Per-scene optimization methods such as 3D Gaussian Splatting provide state-of-the-art novel view synthesis quality but extrapolate poorly to under-observed areas. We propose a two-stage pipeline that leverages two key insights. First, we train a powerful bidirectional generative model with a novel opacity mixing strategy. Second, we distill it into a causal auto-regressive model that generates hundreds of frames in a single pass.
arXiv Detail & Related papers (2026-02-28T06:22:40Z) - RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS [85.90134051583368]
3D Gaussian Splatting (3DGS) has gained significant attention for its real-time, photo-realistic rendering in novel-view synthesis and 3D modeling. Existing methods struggle with accurately modeling in-the-wild scenes affected by transient objects and illuminations. We propose RobustSplat++, a robust solution based on several critical designs.
arXiv Detail & Related papers (2025-12-04T14:05:09Z) - OracleGS: Grounding Generative Priors for Sparse-View Gaussian Splatting [78.70702961852119]
OracleGS reconciles generative completeness with regressive fidelity for sparse view Gaussian Splatting. Our approach conditions the powerful generative prior on multi-view geometric evidence, filtering hallucinatory artifacts while preserving plausible completions in under-constrained regions.
arXiv Detail & Related papers (2025-09-27T11:19:32Z) - SPFSplatV2: Efficient Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views [18.814209805277503]
SPFSplatV2, an efficient feed-forward framework for 3D Gaussian splatting from sparse multi-view images, is presented. The method achieves state-of-the-art performance in both in-domain and out-of-domain novel view synthesis.
arXiv Detail & Related papers (2025-09-21T21:37:56Z) - A Controllable 3D Deepfake Generation Framework with Gaussian Splatting [6.969908558294805]
We propose a novel 3D deepfake generation framework based on 3D Gaussian Splatting. It enables realistic, identity-preserving face swapping and reenactment in a fully controllable 3D space. Our approach bridges the gap between 3D modeling and deepfake synthesis, enabling new directions for scene-aware, controllable, and immersive visual forgeries.
arXiv Detail & Related papers (2025-09-15T06:34:17Z) - RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS [79.15416002879239]
3D Gaussian Splatting has gained significant attention for its real-time, photo-realistic rendering in novel-view synthesis and 3D modeling. Existing methods struggle with accurately modeling scenes affected by transient objects, leading to artifacts in the rendered images. We propose RobustSplat, a robust solution based on two critical designs.
arXiv Detail & Related papers (2025-06-03T11:13:48Z) - Learning to Align and Refine: A Foundation-to-Diffusion Framework for Occlusion-Robust Two-Hand Reconstruction [50.952228546326516]
Two-hand reconstruction from monocular images faces persistent challenges due to complex and dynamic hand postures. Existing approaches struggle with such alignment issues, often resulting in misalignment and penetration artifacts. We propose a dual-stage Foundation-to-Diffusion framework that precisely aligns 2D prior guidance from vision foundation models.
arXiv Detail & Related papers (2025-03-22T14:42:27Z) - A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision [65.33043028101471]
We introduce a diffusion model for Gaussian Splats, SplatDiffusion, to enable generation of three-dimensional structures from single images. Existing methods rely on deterministic, feed-forward predictions, which limit their ability to handle the inherent ambiguity of 3D inference from 2D data.
arXiv Detail & Related papers (2024-12-01T00:29:57Z) - RIGI: Rectifying Image-to-3D Generation Inconsistency via Uncertainty-aware Learning [27.4552892119823]
Inconsistencies in multi-view snapshots frequently introduce noise and artifacts along object boundaries, undermining the 3D reconstruction process. We leverage 3D Gaussian Splatting (3DGS) for 3D reconstruction, and explicitly integrate uncertainty-aware learning into the reconstruction process. We apply adaptive pixel-wise loss weighting to regularize the models, reducing reconstruction intensity in high-uncertainty regions.
arXiv Detail & Related papers (2024-11-28T02:19:28Z) - DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes [71.61083731844282]
We present DeSiRe-GS, a self-supervised gaussian splatting representation.
It enables effective static-dynamic decomposition and high-fidelity surface reconstruction in complex driving scenarios.
arXiv Detail & Related papers (2024-11-18T05:49:16Z) - PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting [16.333566122541022]
We propose a new method called PLGS that enables 3DGS to generate consistent panoptic segmentation masks from noisy 2D segmentation masks. Our method outperforms previous state-of-the-art methods in terms of both segmentation quality and speed.
arXiv Detail & Related papers (2024-10-23T02:05:05Z) - LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes [39.687526103092445]
We introduce a novel method to uplift 2D image features into Gaussian Splatting representations of 3D scenes. We obtain competitive segmentation results using generic DINOv2 features, despite DINOv2 not being trained on millions of annotated segmentation masks like SAM. When applied to CLIP features, our method demonstrates strong performance in open-vocabulary object localization tasks.
arXiv Detail & Related papers (2024-10-18T13:44:29Z) - StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances.
First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss.
Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation [87.85851771425325]
We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos.
We tackle this problem through online adaptation, gradually correcting the model bias during testing.
We propose the Dynamic Bilevel Online Adaptation algorithm (DynaBOA).
arXiv Detail & Related papers (2021-11-07T07:23:24Z) - Recurrent Multi-view Alignment Network for Unsupervised Surface Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
arXiv Detail & Related papers (2020-11-24T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.