IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement
- URL: http://arxiv.org/abs/2503.04501v1
- Date: Thu, 06 Mar 2025 14:50:17 GMT
- Title: IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement
- Authors: Zhihao Shi, Dong Huo, Yuhongze Zhou, Kejia Yin, Yan Min, Juwei Lu, Xinxin Zuo,
- Abstract summary: We introduce a novel approach that produces inpainted 3D scenes with consistent visual quality and coherent underlying geometry.<n>Specifically, we propose a robust 3D inpainting pipeline that incorporates geometric priors and a multi-view refinement network trained via test-time adaptation.<n>We develop a novel inpainting mask detection technique to derive targeted inpainting masks from object masks, boosting the performance in handling unconstrained scenes.
- Score: 15.206470606085341
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current 3D inpainting and object removal methods are largely limited to front-facing scenes, facing substantial challenges when applied to diverse, "unconstrained" scenes where the camera orientation and trajectory are unrestricted. To bridge this gap, we introduce a novel approach that produces inpainted 3D scenes with consistent visual quality and coherent underlying geometry across both front-facing and unconstrained scenes. Specifically, we propose a robust 3D inpainting pipeline that incorporates geometric priors and a multi-view refinement network trained via test-time adaptation, building on a pre-trained image inpainting model. Additionally, we develop a novel inpainting mask detection technique to derive targeted inpainting masks from object masks, boosting the performance in handling unconstrained scenes. To validate the efficacy of our approach, we create a challenging and diverse benchmark that spans a wide range of scenes. Comprehensive experiments demonstrate that our proposed method substantially outperforms existing state-of-the-art approaches.
Related papers
- Visibility-Uncertainty-guided 3D Gaussian Inpainting via Scene Conceptional Learning [63.94919846010485]
3D Gaussian inpainting (3DGI) is challenging in effectively leveraging complementary visual and semantic cues from multiple input views.
We propose a method that measures the visibility uncertainties of 3D points across different input views and uses them to guide 3DGI.
We build a novel 3DGI framework, VISTA, by integrating VISibility-uncerTainty-guided 3DGI with scene conceptuAl learning.
arXiv Detail & Related papers (2025-04-23T06:21:11Z) - Geometry-Aware Diffusion Models for Multiview Scene Inpainting [24.963896970130065]
We focus on 3D scene inpainting, where parts of an input image set, captured from different viewpoints, are masked out.<n>Most recent work addresses this challenge by combining generative models with a 3D radiance field to fuse information across viewpoints.<n>We introduce a geometry-aware conditional generative model, capable of inpainting consistent images based on both geometric and appearance cues from reference images.
arXiv Detail & Related papers (2025-02-18T23:30:10Z) - HOMER: Homography-Based Efficient Multi-view 3D Object Removal [25.832938786291358]
3D object removal is an important sub-task in 3D scene editing, with broad applications in scene understanding, augmented reality, and robotics.
Existing methods struggle to achieve a desirable balance among consistency, usability, and computational efficiency in multi-view settings.
We propose a novel pipeline that improves the quality and efficiency of multi-view object mask generation and inpainting.
arXiv Detail & Related papers (2025-01-29T13:12:06Z) - RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting [63.567363455092234]
RefFusion is a novel 3D inpainting method based on a multi-scale personalization of an image inpainting diffusion model to the given reference view.
Our framework achieves state-of-the-art results for object removal while maintaining high controllability.
arXiv Detail & Related papers (2024-04-16T17:50:02Z) - Neural Point-based Volumetric Avatar: Surface-guided Neural Points for
Efficient and Photorealistic Volumetric Head Avatar [62.87222308616711]
We propose fullname (name), a method that adopts the neural point representation and the neural volume rendering process.
Specifically, the neural points are strategically constrained around the surface of the target expression via a high-resolution UV displacement map.
By design, our name is better equipped to handle topologically changing regions and thin structures while also ensuring accurate expression control when animating avatars.
arXiv Detail & Related papers (2023-07-11T03:40:10Z) - MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with
Informative-Preserved Reconstruction and Self-Distilled Consistency [120.9499803967496]
We propose a novel informative-preserved reconstruction, which explores local statistics to discover and preserve the representative structured points.
Our method can concentrate on modeling regional geometry and enjoy less ambiguity for masked reconstruction.
By combining informative-preserved reconstruction on masked areas and consistency self-distillation from unmasked areas, a unified framework called MM-3DScene is yielded.
arXiv Detail & Related papers (2022-12-20T01:53:40Z) - SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural
Radiance Fields [26.296017756560467]
In 3D, solutions must be consistent across multiple views and geometrically valid.
We propose a novel 3D inpainting method that addresses these challenges.
We first demonstrate the superiority of our approach on multiview segmentation, comparing to NeRFbased methods and 2D segmentation approaches.
arXiv Detail & Related papers (2022-11-22T13:14:50Z) - Non-Deterministic Face Mask Removal Based On 3D Priors [3.8502825594372703]
The proposed approach integrates a multi-task 3D face reconstruction module with a face inpainting module.
By gradually controlling the 3D shape parameters, our method generates high-quality dynamic inpainting results with different expressions and mouth movements.
arXiv Detail & Related papers (2022-02-20T16:27:44Z) - DeepMultiCap: Performance Capture of Multiple Characters Using Sparse
Multiview Cameras [63.186486240525554]
DeepMultiCap is a novel method for multi-person performance capture using sparse multi-view cameras.
Our method can capture time varying surface details without the need of using pre-scanned template models.
arXiv Detail & Related papers (2021-05-01T14:32:13Z) - 3D Photography using Context-aware Layered Depth Inpainting [50.66235795163143]
We propose a method for converting a single RGB-D input image into a 3D photo.
A learning-based inpainting model synthesizes new local color-and-depth content into the occluded region.
The resulting 3D photos can be efficiently rendered with motion parallax.
arXiv Detail & Related papers (2020-04-09T17:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.