Unpaired Single-Image Depth Synthesis with cycle-consistent Wasserstein
GANs
- URL: http://arxiv.org/abs/2103.16938v1
- Date: Wed, 31 Mar 2021 09:43:38 GMT
- Title: Unpaired Single-Image Depth Synthesis with cycle-consistent Wasserstein
GANs
- Authors: Christoph Angermann and Ad\'ela Moravov\'a and Markus Haltmeier and
Steinbj\"orn J\'onsson and Christian Laubichler
- Abstract summary: Real-time estimation of actual environment depth is an essential module for various autonomous system tasks.
In this study, latest advancements in the field of generative neural networks are leveraged to fully unsupervised single-image depth synthesis.
- Score: 1.0499611180329802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time estimation of actual environment depth is an essential module for
various autonomous system tasks such as localization, obstacle detection and
pose estimation. During the last decade of machine learning, extensive
deployment of deep learning methods to computer vision tasks yielded successful
approaches for realistic depth synthesis out of a simple RGB modality. While
most of these models rest on paired depth data or availability of video
sequences and stereo images, there is a lack of methods facing single-image
depth synthesis in an unsupervised manner. Therefore, in this study, latest
advancements in the field of generative neural networks are leveraged to fully
unsupervised single-image depth synthesis. To be more exact, two
cycle-consistent generators for RGB-to-depth and depth-to-RGB transfer are
implemented and simultaneously optimized using the Wasserstein-1 distance. To
ensure plausibility of the proposed method, we apply the models to a self
acquised industrial data set as well as to the renown NYU Depth v2 data set,
which allows comparison with existing approaches. The observed success in this
study suggests high potential for unpaired single-image depth estimation in
real world applications.
Related papers
- Confidence-Aware RGB-D Face Recognition via Virtual Depth Synthesis [48.59382455101753]
2D face recognition encounters challenges in unconstrained environments due to varying illumination, occlusion, and pose.
Recent studies focus on RGB-D face recognition to improve robustness by incorporating depth information.
In this work, we first construct a diverse depth dataset generated by 3D Morphable Models for depth model pre-training.
Then, we propose a domain-independent pre-training framework that utilizes readily available pre-trained RGB and depth models to separately perform face recognition without needing additional paired data for retraining.
arXiv Detail & Related papers (2024-03-11T09:12:24Z) - RBF Weighted Hyper-Involution for RGB-D Object Detection [0.0]
We propose a real-time and two stream RGBD object detection model.
The proposed model consists of two new components: a depth guided hyper-involution that adapts dynamically based on the spatial interaction pattern in the raw depth map and an up-sampling based trainable fusion layer.
We show that the proposed model outperforms other RGB-D based object detection models on NYU Depth v2 dataset and achieves comparable (second best) results on SUN RGB-D.
arXiv Detail & Related papers (2023-09-30T11:25:34Z) - Symmetric Uncertainty-Aware Feature Transmission for Depth
Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z) - DeepRM: Deep Recurrent Matching for 6D Pose Refinement [77.34726150561087]
DeepRM is a novel recurrent network architecture for 6D pose refinement.
The architecture incorporates LSTM units to propagate information through each refinement step.
DeepRM achieves state-of-the-art performance on two widely accepted challenging datasets.
arXiv Detail & Related papers (2022-05-28T16:18:08Z) - Joint Learning of Salient Object Detection, Depth Estimation and Contour
Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD)
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z) - Unsupervised Single-shot Depth Estimation using Perceptual
Reconstruction [0.0]
This study presents the most recent advances in the field of generative neural networks, leveraging them to perform fully unsupervised single-shot depth synthesis.
Two generators for RGB-to-depth and depth-to-RGB transfer are implemented and simultaneously optimized using the Wasserstein-1 distance and a novel perceptual reconstruction term.
The success observed in this study suggests the great potential for unsupervised single-shot depth estimation in real-world applications.
arXiv Detail & Related papers (2022-01-28T15:11:34Z) - Self-Guided Instance-Aware Network for Depth Completion and Enhancement [6.319531161477912]
Existing methods directly interpolate the missing depth measurements based on pixel-wise image content and the corresponding neighboring depth values.
We propose a novel self-guided instance-aware network (SG-IANet) that utilize self-guided mechanism to extract instance-level features that is needed for depth restoration.
arXiv Detail & Related papers (2021-05-25T19:41:38Z) - Sparse Auxiliary Networks for Unified Monocular Depth Prediction and
Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z) - A Single Stream Network for Robust and Real-time RGB-D Salient Object
Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.