Free-form 3D Scene Inpainting with Dual-stream GAN
- URL: http://arxiv.org/abs/2212.08464v1
- Date: Fri, 16 Dec 2022 13:20:31 GMT
- Title: Free-form 3D Scene Inpainting with Dual-stream GAN
- Authors: Ru-Fen Jheng, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu
- Abstract summary: We present a novel task named free-form 3D scene inpainting.
Unlike scenes in previous 3D completion datasets, the proposed inpainting dataset contains large and diverse missing regions.
Our dual-stream generator, fusing both geometry and color information, produces distinct semantic boundaries.
To further enhance the details, our lightweight dual-stream discriminator regularizes the geometry and color edges of the predicted scenes to be realistic and sharp.
- Score: 20.186778638697696
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The need for user editing of 3D scenes has grown rapidly with the
development of AR and VR technology. However, the existing 3D scene completion
task (and its datasets) cannot meet this need, because the missing regions in
those scenes arise from sensor limitations or object occlusion. We therefore
present a novel task named free-form 3D scene inpainting. Unlike scenes in
previous 3D completion datasets, which preserve most of the main structures and
hints of detailed shapes around missing regions, the proposed inpainting
dataset, FF-Matterport, contains large and diverse missing regions formed by
our free-form 3D mask generation algorithm, which mimics human drawing
trajectories in 3D space. Moreover, prior 3D completion methods perform poorly
on this challenging yet practical task, as they simply interpolate nearby
geometry and color context. We therefore propose a tailored dual-stream GAN
method. First, our dual-stream generator, fusing both geometry and color
information, produces distinct semantic boundaries and resolves the
interpolation issue. To further enhance detail, our lightweight dual-stream
discriminator regularizes the geometry and color edges of the predicted scenes
to be realistic and sharp. We conducted experiments on the proposed
FF-Matterport dataset. Qualitative and quantitative results validate the
superiority of our approach over existing scene completion methods and the
efficacy of all proposed components.
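The free-form 3D mask generation algorithm itself is not detailed in this summary. As a rough, hypothetical sketch (the function name and all parameters below are assumptions, not the authors' actual algorithm), such masks can be produced by random-walk "brush strokes" that carve voxels out of a grid, loosely mimicking hand-drawn trajectories in 3D space:

```python
import random

def free_form_3d_mask(grid_size=32, num_strokes=3, stroke_len=40, brush=2, seed=0):
    """Hypothetical sketch of a free-form 3D mask: random-walk brush
    strokes in a voxel grid, mimicking hand-drawn 3D trajectories."""
    rng = random.Random(seed)
    masked = set()
    for _ in range(num_strokes):
        # start each stroke at a random voxel
        pos = [rng.randrange(grid_size) for _ in range(3)]
        for _ in range(stroke_len):
            # carve a small cube around the current brush position
            for dx in range(-brush, brush + 1):
                for dy in range(-brush, brush + 1):
                    for dz in range(-brush, brush + 1):
                        v = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
                        if all(0 <= c < grid_size for c in v):
                            masked.add(v)
            # take a random unit step along one axis, staying inside the grid
            axis = rng.randrange(3)
            pos[axis] = min(grid_size - 1, max(0, pos[axis] + rng.choice((-1, 1))))
    return masked

mask = free_form_3d_mask()
```

Unlike the box- or occlusion-shaped holes of prior completion datasets, strokes like these remove large, irregular regions with no structural hints left inside them.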
Related papers
- Behind the Veil: Enhanced Indoor 3D Scene Reconstruction with Occluded Surfaces Completion [15.444301186927142]
We present a novel indoor 3D reconstruction method with occluded surface completion, given a sequence of depth readings.
Our method tackles the task of completing the occluded scene surfaces, resulting in a complete 3D scene mesh.
We evaluate the proposed method on the 3D Completed Room Scene (3D-CRS) and iTHOR datasets.
arXiv Detail & Related papers (2024-04-03T21:18:27Z)
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- Incremental 3D Semantic Scene Graph Prediction from RGB Sequences [86.77318031029404]
We propose a real-time framework that incrementally builds a consistent 3D semantic scene graph of a scene given an RGB image sequence.
Our method consists of a novel incremental entity estimation pipeline and a scene graph prediction network.
The proposed network estimates 3D semantic scene graphs with iterative message passing using multi-view and geometric features extracted from the scene entities.
arXiv Detail & Related papers (2023-05-04T11:32:16Z)
- PaintNet: Unstructured Multi-Path Learning from 3D Point Clouds for Robotic Spray Painting [13.182797149468204]
Industrial robotic problems such as spray painting and welding require planning of multiple trajectories to solve the task.
Existing solutions make strong assumptions on the form of input surfaces and the nature of output paths.
By leveraging on recent advances in 3D deep learning, we introduce a novel framework capable of dealing with arbitrary 3D surfaces.
arXiv Detail & Related papers (2022-11-13T15:41:50Z)
- CompNVS: Novel View Synthesis with Scene Completion [83.19663671794596]
We propose a generative pipeline performing on a sparse grid-based neural scene representation to complete unobserved scene parts.
We process encoded image features in 3D space with a geometry completion network and a subsequent texture inpainting network to extrapolate the missing area.
Photorealistic image sequences can be finally obtained via consistency-relevant differentiable rendering.
arXiv Detail & Related papers (2022-07-23T09:03:13Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
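As a simplified, hypothetical illustration of such a cylindrical partition (the bin counts, ranges, and function name below are assumptions, not Cylinder3D's actual settings), each LiDAR point can be mapped to a (radius, azimuth, height) cell instead of a uniform Cartesian voxel:

```python
import math

def cylinder_voxel_index(x, y, z,
                         r_max=50.0, z_min=-3.0, z_max=3.0,
                         n_r=16, n_theta=36, n_z=8):
    """Map a 3D point to a (radius, azimuth, height) cell index.
    All bin counts and ranges are illustrative assumptions."""
    r = math.hypot(x, y)                # distance from the sensor axis
    theta = math.atan2(y, x) + math.pi  # azimuth shifted into [0, 2*pi]
    # quantize each cylindrical coordinate and clamp into valid bins
    ri = min(n_r - 1, int(r / r_max * n_r))
    ti = min(n_theta - 1, int(theta / (2 * math.pi) * n_theta))
    zi = min(n_z - 1, max(0, int((z - z_min) / (z_max - z_min) * n_z)))
    return ri, ti, zi
```

Because azimuth bins fan out with distance, cells far from the sensor cover more space, which matches the radially decreasing point density of driving scenes better than a uniform Cartesian grid.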
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior [50.73148041205675]
The goal of the Semantic Scene Completion (SSC) task is to simultaneously predict a completed 3D voxel representation of volumetric occupancy and semantic labels of objects in the scene from a single-view observation.
We propose to devise a new geometry-based strategy to embed depth information with low-resolution voxel representation.
Our proposed geometric embedding works better than the depth feature learning from habitual SSC frameworks.
arXiv Detail & Related papers (2020-03-31T09:33:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.