Geometry-Aware Diffusion Models for Multiview Scene Inpainting
- URL: http://arxiv.org/abs/2502.13335v1
- Date: Tue, 18 Feb 2025 23:30:10 GMT
- Title: Geometry-Aware Diffusion Models for Multiview Scene Inpainting
- Authors: Ahmad Salimi, Tristan Aumentado-Armstrong, Marcus A. Brubaker, Konstantinos G. Derpanis
- Abstract summary: We focus on 3D scene inpainting, where parts of an input image set, captured from different viewpoints, are masked out.
Most recent work addresses this challenge by combining generative models with a 3D radiance field to fuse information across viewpoints.
We introduce a geometry-aware conditional generative model, capable of inpainting consistent images based on both geometric and appearance cues from reference images.
- Score: 24.963896970130065
- Abstract: In this paper, we focus on 3D scene inpainting, where parts of an input image set, captured from different viewpoints, are masked out. The main challenge lies in generating plausible image completions that are geometrically consistent across views. Most recent work addresses this challenge by combining generative models with a 3D radiance field to fuse information across viewpoints. However, a major drawback of these methods is that they often produce blurry images due to the fusion of inconsistent cross-view images. To avoid blurry inpaintings, we eschew the use of an explicit or implicit radiance field altogether and instead fuse cross-view information in a learned space. In particular, we introduce a geometry-aware conditional generative model, capable of inpainting multi-view consistent images based on both geometric and appearance cues from reference images. A key advantage of our approach over existing methods is its unique ability to inpaint masked scenes with a limited number of views (i.e., few-view inpainting), whereas previous methods require relatively large image sets for their 3D model fitting step. Empirically, we evaluate and compare our scene-centric inpainting method on two datasets, SPIn-NeRF and NeRFiller, which contain images captured at narrow and wide baselines, respectively, and achieve state-of-the-art 3D inpainting performance on both. Additionally, we demonstrate the efficacy of our approach in the few-view setting compared to prior methods.
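To make the abstract's notion of "geometric cues from reference images" concrete, below is a minimal PyTorch sketch of one standard way to supply such cues: inverse-warping a reference view into the target view using target-view depth and the relative camera pose. This is not the authors' code; the function name, shapes, and the choice of explicit warping are illustrative assumptions (the paper itself fuses cross-view information in a learned space rather than relying on explicit warping alone).

```python
# A minimal sketch (not the authors' code) of one plausible "geometric cue":
# warping a reference view into the target view via depth and relative pose,
# so a conditional diffusion model can be fed geometry-aligned context.
import torch
import torch.nn.functional as F

def warp_reference_to_target(ref_img, tgt_depth, K, T_tgt_to_ref):
    """Inverse-warp `ref_img` into the target view (all shapes are assumptions).

    ref_img:       (B, 3, H, W) reference-view RGB
    tgt_depth:     (B, 1, H, W) depth of the *target* view
    K:             (B, 3, 3) camera intrinsics (assumed shared by both views)
    T_tgt_to_ref:  (B, 4, 4) rigid transform from target to reference camera
    """
    B, _, H, W = ref_img.shape
    device = ref_img.device

    # Pixel grid of the target view in homogeneous coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(1, 3, -1)

    # Back-project target pixels to 3D using target depth: X = D * K^{-1} p.
    cam_pts = torch.linalg.inv(K) @ pix.expand(B, -1, -1)      # (B, 3, H*W)
    cam_pts = cam_pts * tgt_depth.reshape(B, 1, -1)

    # Move the 3D points into the reference camera frame.
    R, t = T_tgt_to_ref[:, :3, :3], T_tgt_to_ref[:, :3, 3:]
    ref_pts = R @ cam_pts + t                                   # (B, 3, H*W)

    # Project into the reference image plane.
    proj = K @ ref_pts
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)              # (B, 2, H*W)

    # Normalize to [-1, 1] for grid_sample and sample reference colors.
    u = 2.0 * uv[:, 0] / (W - 1) - 1.0
    v = 2.0 * uv[:, 1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(B, H, W, 2)
    warped = F.grid_sample(ref_img, grid, align_corners=True)

    # Validity mask: in front of the reference camera and inside the frame.
    valid = (ref_pts[:, 2] > 0) & (u.abs() <= 1) & (v.abs() <= 1)
    return warped, valid.reshape(B, 1, H, W).float()
```

In such a pipeline, `warped` and `valid` could be concatenated with the masked target image as extra conditioning channels for a diffusion inpainter, giving the generator geometry-aligned appearance context from the reference views.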
Related papers
- View-consistent Object Removal in Radiance Fields [14.195400035176815]
Radiance Fields (RFs) have emerged as a crucial technology for 3D scene representation.
Current methods rely on per-frame 2D image inpainting, which often fails to maintain consistency across views.
We introduce a novel RF editing pipeline that significantly enhances consistency by requiring the inpainting of only a single reference image.
arXiv Detail & Related papers (2024-08-04T17:57:23Z)
- RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting [63.567363455092234]
RefFusion is a novel 3D inpainting method based on a multi-scale personalization of an image inpainting diffusion model to the given reference view.
Our framework achieves state-of-the-art results for object removal while maintaining high controllability.
arXiv Detail & Related papers (2024-04-16T17:50:02Z)
- Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach [104.2588068730834]
This paper pushes the technical frontier of image outpainting in two directions that have not been resolved in the literature.
We develop a method that does not depend on a pre-trained backbone network.
We evaluate the proposed approach (called PQDiff) on public benchmarks, demonstrating its superior performance over state-of-the-art approaches.
arXiv Detail & Related papers (2024-01-28T13:00:38Z)
- NeRFiller: Completing Scenes via Generative 3D Inpainting [113.18181179986172]
We propose NeRFiller, an approach that completes missing portions of a 3D capture via generative 3D inpainting.
In contrast to related works, we focus on completing scenes rather than deleting foreground objects.
arXiv Detail & Related papers (2023-12-07T18:59:41Z)
- PERF: Panoramic Neural Radiance Field from a Single Panorama [109.31072618058043]
PERF is a novel view synthesis framework that trains a panoramic neural radiance field from a single panorama.
We propose a novel collaborative RGBD inpainting method and a progressive inpainting-and-erasing method to lift a 360-degree 2D scene to a 3D scene.
Our PERF can be widely used for real-world applications, such as panorama-to-3D, text-to-3D, and 3D scene stylization applications.
arXiv Detail & Related papers (2023-10-25T17:59:01Z)
- ChromaDistill: Colorizing Monochrome Radiance Fields with Knowledge Distillation [23.968181738235266]
We present a method for generating colorized novel views from input grayscale multi-view images.
We propose a distillation-based method that transfers color from 2D colorization networks trained on natural images to the target 3D representation.
Our method is agnostic to the underlying 3D representation and easily generalizable to NeRF and 3DGS methods.
arXiv Detail & Related papers (2023-09-14T12:30:48Z)
- Reference-guided Controllable Inpainting of Neural Radiance Fields [26.296017756560467]
We focus on inpainting regions in a view-consistent and controllable manner.
We use monocular depth estimators to back-project the inpainted view to the correct 3D positions.
For non-reference disoccluded regions, we devise a method based on image inpainters to guide both the geometry and appearance.
arXiv Detail & Related papers (2023-04-19T14:11:21Z)
- SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields [26.296017756560467]
In 3D, solutions must be consistent across multiple views and geometrically valid.
We propose a novel 3D inpainting method that addresses these challenges.
We first demonstrate the superiority of our approach on multiview segmentation, compared to NeRF-based methods and 2D segmentation approaches.
arXiv Detail & Related papers (2022-11-22T13:14:50Z)
- Explicitly Controllable 3D-Aware Portrait Generation [42.30481422714532]
We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior art in extensive experiments, producing realistic portraits with vivid expressions under natural lighting when viewed from free viewpoints.
arXiv Detail & Related papers (2022-09-12T17:40:08Z)
- ShaRF: Shape-conditioned Radiance Fields from a Single View [54.39347002226309]
We present a method for estimating neural scene representations of objects given only a single image.
The core of our method is the estimation of a geometric scaffold for the object.
We demonstrate in several experiments the effectiveness of our approach in both synthetic and real images.
arXiv Detail & Related papers (2021-02-17T16:40:28Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)