VEIGAR: View-consistent Explicit Inpainting and Geometry Alignment for 3D object Removal
- URL: http://arxiv.org/abs/2506.15821v1
- Date: Fri, 13 Jun 2025 11:31:44 GMT
- Title: VEIGAR: View-consistent Explicit Inpainting and Geometry Alignment for 3D object Removal
- Authors: Pham Khai Nguyen Do, Bao Nguyen Tran, Nam Nguyen, Duc Dung Nguyen
- Abstract summary: Novel View Synthesis (NVS) and 3D generation have significantly improved editing tasks. To maintain cross-view consistency throughout the generative process, contemporary methods typically adopt a dual-strategy framework. We present VEIGAR, a computationally efficient framework that outperforms existing methods without relying on an initial reconstruction phase.
- Score: 2.8954284913103367
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Novel View Synthesis (NVS) and 3D generation have significantly improved editing tasks, with a primary emphasis on maintaining cross-view consistency throughout the generative process. Contemporary methods typically address this challenge using a dual-strategy framework: performing consistent 2D inpainting across all views guided by embedded priors either explicitly in pixel space or implicitly in latent space; and conducting 3D reconstruction with additional consistency guidance. Previous strategies, in particular, often require an initial 3D reconstruction phase to establish geometric structure, introducing considerable computational overhead. Even with the added cost, the resulting reconstruction quality often remains suboptimal. In this paper, we present VEIGAR, a computationally efficient framework that outperforms existing methods without relying on an initial reconstruction phase. VEIGAR leverages a lightweight foundation model to reliably align priors explicitly in the pixel space. In addition, we introduce a novel supervision strategy based on scale-invariant depth loss, which removes the need for traditional scale-and-shift operations in monocular depth regularization. Through extensive experimentation, VEIGAR establishes a new state-of-the-art benchmark in reconstruction quality and cross-view consistency, while achieving a threefold reduction in training time compared to the fastest existing method, highlighting its superior balance of efficiency and effectiveness.
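The abstract names a scale-invariant depth loss but does not give its exact form. Below is a minimal PyTorch sketch, assuming the loss follows the classic scale-invariant log-depth formulation of Eigen et al. (2014), contrasted with a conventional least-squares scale-and-shift alignment of the kind VEIGAR reports removing. The function names, the `lam` weight, and the alignment baseline are illustrative assumptions, not details taken from the paper.

```python
import torch

def scale_shift_aligned_l2(pred, target):
    # Conventional baseline (illustrative, not from the paper): solve a
    # least-squares scale s and shift t so that s * pred + t best matches
    # target, then penalize the residual. VEIGAR reports removing this step.
    p, g = pred.flatten(), target.flatten()
    A = torch.stack([p, torch.ones_like(p)], dim=1)       # n x 2 design matrix
    sol = torch.linalg.lstsq(A, g.unsqueeze(1)).solution  # [[s], [t]]
    s, t = sol[0], sol[1]
    return ((s * p + t - g) ** 2).mean()

def scale_invariant_depth_loss(pred, target, lam=0.5, eps=1e-6):
    # Scale-invariant log-depth loss in the style of Eigen et al. (2014).
    # With lam = 1 it is fully invariant to a global multiplicative scale
    # of the prediction; the exact formulation in VEIGAR may differ.
    d = torch.log(pred + eps) - torch.log(target + eps)
    return (d ** 2).mean() - lam * d.mean() ** 2
```

With lam = 1 the loss equals the variance of the log-ratio between prediction and target, which is why no explicit scale (or shift, in log space) needs to be fitted before supervision; Eigen et al. used lam = 0.5 as a partial penalty.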
Related papers
- RobustGS: Unified Boosting of Feedforward 3D Gaussian Splatting under Low-Quality Conditions [67.48495052903534]
We propose a general and efficient multi-view feature enhancement module, RobustGS. It substantially improves the robustness of feedforward 3DGS methods under various adverse imaging conditions. The RobustGS module can be seamlessly integrated into existing pretrained pipelines in a plug-and-play manner.
arXiv Detail & Related papers (2025-08-05T04:50:29Z)
- Sparse-View 3D Reconstruction: Recent Advances and Open Challenges [0.8583178253811411]
Sparse-view 3D reconstruction is essential for applications in which dense image acquisition is impractical. This survey reviews the latest advances in neural implicit models and explicit point-cloud-based approaches. We analyze how geometric regularization, explicit shape modeling, and generative inference are used to mitigate artifacts.
arXiv Detail & Related papers (2025-07-22T09:57:28Z)
- Zero-P-to-3: Zero-Shot Partial-View Images to 3D Object [55.93553895520324]
We propose a novel training-free approach that integrates local dense observations and multi-source priors for reconstruction. Our method introduces a fusion-based strategy to effectively align these priors in DDIM sampling, thereby generating multi-view consistent images to supervise invisible views. (A minimal sketch of a DDIM step appears after this list.)
arXiv Detail & Related papers (2025-05-29T03:51:37Z)
- Mono3R: Exploiting Monocular Cues for Geometric 3D Reconstruction [11.220655907305515]
We introduce a monocular-guided refinement module that integrates monocular geometric priors into multi-view reconstruction frameworks. Our method achieves substantial improvements in both multi-view camera pose estimation and point cloud accuracy.
arXiv Detail & Related papers (2025-04-18T02:33:12Z)
- FreeSplat++: Generalizable 3D Gaussian Splatting for Efficient Indoor Scene Reconstruction [50.534213038479926]
FreeSplat++ is an alternative approach to large-scale indoor whole-scene reconstruction. Our method with depth-regularized per-scene fine-tuning demonstrates substantial improvements in reconstruction accuracy and a notable reduction in training time.
arXiv Detail & Related papers (2025-03-29T06:22:08Z)
- Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization [27.509109317973817]
3D Gaussian Splatting (3DGS) has garnered significant attention for its high-quality rendering and fast inference speed. Previous methods primarily focus on geometry regularization, with common approaches including primitive-based and dual-model frameworks. We propose CarGS, a unified model leveraging contribution-adaptive regularization to achieve simultaneous, high-quality rendering and surface reconstruction.
arXiv Detail & Related papers (2025-03-02T12:51:38Z)
- T-3DGS: Removing Transient Objects for 3D Scene Reconstruction [83.05271859398779]
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions. We propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting.
arXiv Detail & Related papers (2024-11-29T07:45:24Z)
- MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement [23.707586182294932]
Existing works in single-image human reconstruction suffer from weak generalizability due to insufficient training data, or from 3D inconsistencies owing to a lack of comprehensive multi-view knowledge. We introduce MagicMan, a human-specific multi-view diffusion model designed to generate high-quality novel view images from a single reference image.
arXiv Detail & Related papers (2024-08-26T12:10:52Z)
- SMORE: Simultaneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR. We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z)
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometrically Enhanced Occupancy network tailored for vision-only surround-view perception. Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest input image resolution and the most lightweight image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- Improving Neural Indoor Surface Reconstruction with Mask-Guided Adaptive Consistency Constraints [0.6749750044497732]
We propose a two-stage training process that decouples view-dependent and view-independent colors and leverages two novel consistency constraints to enhance detail reconstruction performance without requiring extra priors. Experiments on synthetic and real-world datasets show that the method reduces interference from prior estimation errors.
arXiv Detail & Related papers (2023-09-18T13:05:23Z)
- Black-Box Test-Time Shape REFINEment for Single View 3D Reconstruction [57.805334118057665]
We propose REFINE, a postprocessing mesh refinement step that can be easily integrated into the pipeline of any black-box method in the literature. At test time, REFINE optimizes a network per mesh instance to encourage consistency between the mesh and the given object view.
arXiv Detail & Related papers (2021-08-23T03:28:47Z)
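For readers unfamiliar with the DDIM mechanism referenced in the Zero-P-to-3 entry above, the sketch below shows one deterministic DDIM denoising step (Song et al., 2021, with eta = 0) plus a hypothetical weighted fusion of noise predictions from multiple priors. The fusion rule, function names, and weights are illustrative assumptions, not details from that paper.

```python
import math

def ddim_step(x_t, eps_pred, abar_t, abar_prev):
    # One deterministic DDIM update (eta = 0). abar_t and abar_prev are the
    # cumulative alpha products (floats in (0, 1]) at the current and
    # previous timesteps; x_t and eps_pred are tensors of the same shape.
    x0_pred = (x_t - math.sqrt(1.0 - abar_t) * eps_pred) / math.sqrt(abar_t)
    return math.sqrt(abar_prev) * x0_pred + math.sqrt(1.0 - abar_prev) * eps_pred

def fuse_noise_predictions(eps_list, weights):
    # Hypothetical multi-source fusion: a convex combination of noise
    # estimates, e.g. from priors conditioned on different reference views.
    total = sum(weights)
    return sum((w / total) * e for w, e in zip(weights, eps_list))
```

Inside a sampling loop one would call, for example, eps = fuse_noise_predictions([eps_a, eps_b], [0.7, 0.3]) followed by x_t = ddim_step(x_t, eps, abar_t, abar_prev).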