RGB-D Image Inpainting Using Generative Adversarial Network with a Late
Fusion Approach
- URL: http://arxiv.org/abs/2110.07413v1
- Date: Thu, 14 Oct 2021 14:44:01 GMT
- Title: RGB-D Image Inpainting Using Generative Adversarial Network with a Late
Fusion Approach
- Authors: Ryo Fujii, Ryo Hachiuma, Hideo Saito
- Abstract summary: Diminished reality is a technology that aims to remove objects from video images and fill in the missing regions with plausible pixels.
We propose an RGB-D image inpainting method using a generative adversarial network.
- Score: 14.06830052027649
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diminished reality is a technology that aims to remove objects from
video images and fill in the missing regions with plausible pixels. Most
conventional methods utilize different cameras that capture the same scene
from different viewpoints to allow regions to be removed and restored. In this
paper, we propose an RGB-D image inpainting method using a generative
adversarial network, which does not require multiple cameras. Recently, RGB
image inpainting methods have achieved outstanding results by employing
generative adversarial networks. However, RGB inpainting methods aim to
restore only the texture of the missing region and therefore do not recover
geometric information (i.e., the 3D structure of the scene). We extend
conventional RGB image inpainting to RGB-D image inpainting, jointly restoring
the texture and geometry of missing regions from a pair of RGB and depth
images. Inspired by other tasks that use RGB and depth images together (e.g.,
semantic segmentation and object detection), we propose a late fusion approach
that exploits the complementary advantages of the RGB and depth information.
The experimental results verify the effectiveness of our proposed method.
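The abstract names the late fusion design but not a concrete architecture. As a minimal sketch only, assuming a PyTorch encoder-decoder (the framework, module names, and layer widths below are all illustrative assumptions, not the authors' network), a late-fusion RGB-D inpainting generator keeps separate RGB and depth encoders and concatenates their features only at the bottleneck:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Stride-2 downsampling convolution + normalization + activation.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def deconv_block(in_ch, out_ch):
    # Stride-2 upsampling transposed convolution + normalization + activation.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class LateFusionInpaintingGenerator(nn.Module):
    """Hypothetical late-fusion generator: RGB and depth are encoded by
    separate branches and fused only at the bottleneck."""

    def __init__(self):
        super().__init__()
        # RGB branch input: 3 color channels + 1 binary mask channel.
        self.rgb_encoder = nn.Sequential(
            conv_block(4, 64), conv_block(64, 128), conv_block(128, 256))
        # Depth branch input: 1 depth channel + 1 binary mask channel.
        self.depth_encoder = nn.Sequential(
            conv_block(2, 64), conv_block(64, 128), conv_block(128, 256))
        # Shared decoder consumes the concatenated features and regresses
        # 4 channels: inpainted RGB (3) + inpainted depth (1).
        self.decoder = nn.Sequential(
            deconv_block(512, 256), deconv_block(256, 128),
            deconv_block(128, 64), nn.Conv2d(64, 4, kernel_size=3, padding=1))

    def forward(self, rgb, depth, mask):
        # mask is 1 on known pixels and 0 on the region to be inpainted;
        # zero out the hole and append the mask so the network sees it.
        rgb_in = torch.cat([rgb * mask, mask], dim=1)
        depth_in = torch.cat([depth * mask, mask], dim=1)
        fused = torch.cat([self.rgb_encoder(rgb_in),
                           self.depth_encoder(depth_in)], dim=1)  # late fusion
        out = self.decoder(fused)
        return out[:, :3], out[:, 3:]  # completed RGB, completed depth
```

The point of the single fusion site is that color texture and scene geometry are encoded independently, so each branch can exploit its own modality before the shared decoder combines them; an adversarial discriminator (not shown) would then judge the completed RGB-D pair.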
Related papers
- Rethinking RGB Color Representation for Image Restoration Models [55.81013540537963]
We augment the representation to hold structural information of local neighborhoods at each pixel.
Substituting this space for raw RGB in the per-pixel losses facilitates the training of image restoration models.
Our space consistently improves overall metrics by reconstructing both color and local structures.
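As a hedged illustration of a per-pixel loss computed in a structure-aware space rather than raw RGB: the paper's embedding is its own construction, so the finite-difference features below are only a toy stand-in for "structural information of local neighborhoods".

```python
import torch
import torch.nn.functional as F

def structural_representation(img):
    # Toy augmented space: append horizontal/vertical finite differences
    # to each pixel's RGB so the loss also sees neighborhood structure.
    dx = F.pad(img[..., :, 1:] - img[..., :, :-1], (0, 1))        # pad width
    dy = F.pad(img[..., 1:, :] - img[..., :-1, :], (0, 0, 0, 1))  # pad height
    return torch.cat([img, dx, dy], dim=1)

def structural_l1(pred, target):
    # The same per-pixel L1, just evaluated in the augmented space.
    return F.l1_loss(structural_representation(pred),
                     structural_representation(target))
```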
arXiv Detail & Related papers (2024-02-05T06:38:39Z)
- DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation [76.81628995237058]
DFormer is a novel framework to learn transferable representations for RGB-D segmentation tasks.
It pretrains the backbone using image-depth pairs from ImageNet-1K.
DFormer achieves new state-of-the-art performance on two popular RGB-D tasks.
arXiv Detail & Related papers (2023-09-18T11:09:11Z)
- Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models [39.23531919945332]
In this work, we present a new solution that sequentially generates novel RGBD views along a camera trajectory.
Each rendered RGBD view is later back-projected as a partial surface and is supplemented into the intermediate mesh.
The use of intermediate mesh and camera projection helps solve the refractory problem of multi-view inconsistency.
arXiv Detail & Related papers (2022-12-12T15:50:00Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- Colored Point Cloud to Image Alignment [15.828285556159026]
We introduce a differential optimization method that aligns a colored point cloud to a given color image via iterative geometric and color matching.
We find the transformation between the camera image and the point cloud colors by iterating between matching the relative location of the point cloud and matching colors.
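A loose sketch of differentiably aligning a colored point cloud to an image: the paper alternates geometric and color matching, whereas this toy version (an assumption, not the authors' algorithm) jointly descends on a color re-projection loss with a small-angle pose parameterization.

```python
import torch
import torch.nn.functional as F

def skew(w):
    # Skew-symmetric matrix of a 3-vector (small-angle rotation generator).
    zero = torch.zeros((), dtype=w.dtype)
    return torch.stack([torch.stack([zero, -w[2], w[1]]),
                        torch.stack([w[2], zero, -w[0]]),
                        torch.stack([-w[1], w[0], zero])])

def align_cloud_to_image(points, colors, image, K, iters=200, lr=1e-3):
    # points: (N, 3); colors: (N, 3) in [0, 1]; image: (3, H, W) in [0, 1];
    # K: (3, 3) pinhole intrinsics. Returns a 6-DoF update [rotation|translation].
    xi = torch.zeros(6, requires_grad=True)
    opt = torch.optim.Adam([xi], lr=lr)
    H, W = image.shape[1:]
    for _ in range(iters):
        R = torch.eye(3) + skew(xi[:3])            # linearized rotation
        cam = points @ R.T + xi[3:]                # geometric step: move the cloud
        uv = cam @ K.T
        uv = uv[:, :2] / uv[:, 2:3].clamp(min=1e-6)
        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                            uv[:, 1] / (H - 1) * 2 - 1], dim=-1)
        sampled = F.grid_sample(image[None], grid[None, :, None, :],
                                align_corners=True)[0, :, :, 0].T  # (N, 3)
        loss = ((sampled - colors) ** 2).mean()    # color-matching step
        opt.zero_grad(); loss.backward(); opt.step()
    return xi.detach()
```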
arXiv Detail & Related papers (2021-10-07T08:12:56Z)
- Semantic-embedded Unsupervised Spectral Reconstruction from Single RGB Images in the Wild [48.44194221801609]
We propose a new lightweight and end-to-end learning-based framework to tackle this challenge.
We progressively spread the differences between the input RGB images and the RGB images re-projected from the recovered HS images via effective camera spectral response function estimation.
Our method significantly outperforms state-of-the-art unsupervised methods and even exceeds the latest supervised method under some settings.
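A minimal sketch of the re-projection idea, assuming 31 spectral bands and a camera spectral response function (CSRF) modeled as a single learnable 3 x num_bands matrix (both are illustrative simplifications of the paper's CSRF estimation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CSRFReprojectionLoss(nn.Module):
    """Toy unsupervised loss: project the recovered hyperspectral (HS) cube
    back to RGB through a learnable CSRF and penalize the difference from
    the input RGB image."""

    def __init__(self, num_bands=31):
        super().__init__()
        # Hypothetical CSRF: one response curve per RGB channel.
        self.csrf = nn.Parameter(torch.rand(3, num_bands))

    def forward(self, hs, rgb):
        # hs: (B, num_bands, H, W); rgb: (B, 3, H, W)
        response = torch.softmax(self.csrf, dim=1)  # positive, normalized curves
        reprojected = torch.einsum('cb,nbhw->nchw', response, hs)
        return F.l1_loss(reprojected, rgb)
```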
arXiv Detail & Related papers (2021-08-15T05:19:44Z)
- Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images [69.5662419067878]
Grounding referring expressions in RGBD images is an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD images, where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
It then conducts adaptive feature learning based on the heatmap and performs object-level matching with another visio-linguistic fusion to finally ground the referred object.
arXiv Detail & Related papers (2021-03-14T11:18:50Z)
- NormalGAN: Learning Detailed 3D Human from a Single RGB-D Image [34.79657678041356]
We propose a fast adversarial learning-based method to reconstruct the complete and detailed 3D human from a single RGB-D image.
Given a consumer RGB-D sensor, NormalGAN can generate complete and detailed 3D human reconstruction results at 20 fps.
arXiv Detail & Related papers (2020-07-30T09:35:46Z)
- Hierarchical Regression Network for Spectral Reconstruction from RGB Images [21.551899202524904]
We propose a 4-level Hierarchical Regression Network (HRNet) with PixelShuffle layers serving as the inter-level interaction.
We evaluate the proposed HRNet against other architectures and techniques by participating in the NTIRE 2020 Challenge on Spectral Reconstruction from RGB Images.
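As a toy two-level illustration of PixelShuffle as the inter-level interaction (the actual HRNet has four levels and different blocks; channel widths here are arbitrary assumptions):

```python
import torch
import torch.nn as nn

class TwoLevelRegression(nn.Module):
    """Two-level toy version of the hierarchical idea: the lower level works
    on a PixelUnshuffle-downsampled copy, and PixelShuffle carries its
    features back up to the full-resolution level."""

    def __init__(self, in_ch=3, out_ch=31, width=32):
        super().__init__()
        self.down = nn.PixelUnshuffle(2)   # (C, H, W) -> (4C, H/2, W/2)
        self.low = nn.Sequential(
            nn.Conv2d(in_ch * 4, width * 4, 3, padding=1), nn.ReLU(inplace=True))
        self.up = nn.PixelShuffle(2)       # (4w, H/2, W/2) -> (w, H, W)
        self.high = nn.Sequential(
            nn.Conv2d(in_ch + width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, out_ch, 3, padding=1))  # regress 31 spectral bands

    def forward(self, rgb):
        # H and W must be even for the unshuffle/shuffle round trip.
        low_feat = self.up(self.low(self.down(rgb)))  # inter-level interaction
        return self.high(torch.cat([rgb, low_feat], dim=1))
```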
arXiv Detail & Related papers (2020-05-10T16:06:11Z)
- 3D Photography using Context-aware Layered Depth Inpainting [50.66235795163143]
We propose a method for converting a single RGB-D input image into a 3D photo.
A learning-based inpainting model synthesizes new local color-and-depth content into the occluded region.
The resulting 3D photos can be efficiently rendered with motion parallax.
arXiv Detail & Related papers (2020-04-09T17:59:06Z)