DeepDR: Deep Structure-Aware RGB-D Inpainting for Diminished Reality
- URL: http://arxiv.org/abs/2312.00532v1
- Date: Fri, 1 Dec 2023 12:12:58 GMT
- Authors: Christina Gsaxner, Shohei Mori, Dieter Schmalstieg, Jan Egger, Gerhard Paar, Werner Bailer and Denis Kalkofen
- Abstract summary: Diminished reality (DR) refers to the removal of real objects from the environment by virtually replacing them with their background.
Recent deep learning-based inpainting is promising, but the DR use case is complicated by the need to generate coherent structure and 3D geometry.
In this paper, we propose a first RGB-D inpainting framework fulfilling all requirements of DR: Plausible image and geometry inpainting with coherent structure, running at real-time frame rates, with minimal temporal artifacts.
- Score: 12.84124441493612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diminished reality (DR) refers to the removal of real objects from the
environment by virtually replacing them with their background. Modern DR
frameworks use inpainting to hallucinate unobserved regions. While recent deep
learning-based inpainting is promising, the DR use case is complicated by the
need to generate coherent structure and 3D geometry (i.e., depth), in
particular for advanced applications, such as 3D scene editing. In this paper,
we propose DeepDR, a first RGB-D inpainting framework fulfilling all
requirements of DR: Plausible image and geometry inpainting with coherent
structure, running at real-time frame rates, with minimal temporal artifacts.
Our structure-aware generative network allows us to explicitly condition color
and depth outputs on the scene semantics, overcoming the difficulty of
reconstructing sharp and consistent boundaries in regions with complex
backgrounds. Experimental results show that the proposed framework can
outperform related work qualitatively and quantitatively.
Related papers
- NeSLAM: Neural Implicit Mapping and Self-Supervised Feature Tracking With Depth Completion and Denoising [23.876281686625134]
We present NeSLAM, a framework that achieves accurate and dense depth estimation, robust camera tracking, and realistic synthesis of novel views.
Experiments on various indoor datasets demonstrate the effectiveness and accuracy of the system in reconstruction, tracking quality, and novel view synthesis.
arXiv Detail & Related papers (2024-03-29T07:59:37Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation.
Our approach exploits spatial-temporal information from multiple cameras, and monocular depth refinement.
We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z)
- O$^2$-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model [28.372289119872764]
Occlusion is a common issue in 3D reconstruction from RGB-D videos, often blocking the complete reconstruction of objects.
We propose a novel framework, empowered by a 2D diffusion-based in-painting model, to reconstruct complete surfaces for the hidden parts of objects.
arXiv Detail & Related papers (2023-08-18T14:38:31Z)
- NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors [84.66706400428303]
We propose a new method, named NeuRIS, for high quality reconstruction of indoor scenes.
NeuRIS integrates estimated normals of indoor scenes as a prior in a neural rendering framework.
Experiments show that NeuRIS significantly outperforms the state-of-the-art methods in terms of reconstruction quality.
arXiv Detail & Related papers (2022-06-27T19:22:03Z)
- PanoDR: Spherical Panorama Diminished Reality for Indoor Scenes [0.0]
Diminished Reality (DR) fulfills the requirement of such applications, to remove existing objects in the scene.
To preserve the 'reality' in indoor (re-)planning applications, preserving the scene's structure is crucial.
We propose a model that initially predicts the structure of an indoor scene and then uses it to guide the reconstruction of an empty -- background only -- representation of the same scene.
arXiv Detail & Related papers (2021-06-01T12:56:53Z)
- S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation [63.58891781246175]
Humans can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that spatial structure plays a fundamental role in understanding the depth of scenes.
We are the first to explore the learning of a depth-specific structural representation, which captures the essential feature for depth estimation and ignores irrelevant style information.
Our S2R-DepthNet can be well generalized to unseen real-world data directly even though it is only trained on synthetic data.
arXiv Detail & Related papers (2021-04-02T03:55:41Z)
- SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion [86.77318031029404]
We propose a framework that performs scene reconstruction and semantic scene completion jointly in an incremental and real-time manner.
Our framework relies on a novel neural architecture designed to process occupancy maps and leverages voxel states to accurately and efficiently fuse semantic completion with the 3D global model.
arXiv Detail & Related papers (2020-10-26T15:31:52Z)
- Dynamic Object Removal and Spatio-Temporal RGB-D Inpainting via Geometry-Aware Adversarial Learning [9.150245363036165]
Dynamic objects have a significant impact on the robot's perception of the environment.
In this work, we address this problem by synthesizing plausible color, texture and geometry in regions occluded by dynamic objects.
We optimize our architecture using adversarial training to synthesize fine realistic textures which enables it to hallucinate color and depth structure in occluded regions online.
arXiv Detail & Related papers (2020-08-12T01:23:21Z)
- 3D Photography using Context-aware Layered Depth Inpainting [50.66235795163143]
We propose a method for converting a single RGB-D input image into a 3D photo.
A learning-based inpainting model synthesizes new local color-and-depth content into the occluded region.
The resulting 3D photos can be efficiently rendered with motion parallax.
arXiv Detail & Related papers (2020-04-09T17:59:06Z)