Predicting Depth from Semantic Segmentation using Game Engine Dataset
- URL: http://arxiv.org/abs/2106.15257v1
- Date: Sat, 12 Jun 2021 10:15:40 GMT
- Title: Predicting Depth from Semantic Segmentation using Game Engine Dataset
- Authors: Mohammad Amin Kashi
- Abstract summary: This thesis investigates the relationship between object perception and depth estimation in convolutional neural networks.
We developed new network structures based on a simple depth estimation network that takes only a single image as its input.
Results show that our novel structures can improve depth estimation performance by 52% in terms of the relative error of distance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Depth perception is fundamental for robots to understand the surrounding
environment. From the viewpoint of cognitive neuroscience, visual depth
perception methods are divided into three categories, namely binocular, active,
and pictorial. The first two categories have been studied in detail for decades.
However, research on the third category is still in its infancy and has gained
momentum with the advent of deep learning methods in recent years. In cognitive
neuroscience, it is known that pictorial depth perception mechanisms depend on
the perception of seen objects. Inspired by this fact, in this thesis, we
investigated the relationship between object perception and depth estimation in
convolutional neural networks. For this purpose, we developed new network
structures based on a simple depth estimation network that takes only a single
image as its input. Our proposed structures take both an image and a semantic
label of the image as their input. We used semantic labels as the output of
object perception. A performance comparison between the developed networks and
the original network showed that our novel structures can improve the
performance of depth estimation by 52% in terms of the relative error of
distance in the examined cases. Most of the experimental studies were carried
out on synthetic datasets generated by game engines, in order to isolate the
performance comparison from the effect of inaccurate depth and semantic labels
in non-synthetic datasets. It is shown that particular synthetic datasets may be
used for training depth networks in cases where an appropriate dataset is not
available. Furthermore, we showed that in these cases, the use of semantic
labels improves the robustness of the network against domain shift from
synthetic training data to non-synthetic test data.
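The abstract describes two ideas that a small example may make more concrete: feeding a network both an image and its semantic label, and comparing networks by a relative error of distance. The following is a minimal sketch under stated assumptions, not the thesis implementation. It assumes the label is fused by appending a one-hot class vector to each RGB pixel (the abstract does not say how the label enters the network), and it assumes "relative error of distance" means the commonly used mean absolute relative error. All function names and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def one_hot(label: int, num_classes: int) -> List[float]:
    """Encode a semantic class index as a one-hot vector."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} outside [0, {num_classes})")
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def stack_image_and_labels(
    image: Sequence[Sequence[Sequence[float]]],
    labels: Sequence[Sequence[int]],
    num_classes: int,
) -> List[List[List[float]]]:
    """Append a one-hot semantic vector to every RGB pixel.

    image is H x W x 3, labels is H x W; the result is H x W x (3 + num_classes).
    Channel concatenation is an assumed fusion scheme, not one taken from the source.
    """
    if len(image) != len(labels):
        raise ValueError("image and labels must have the same height")
    stacked = []
    for img_row, lab_row in zip(image, labels):
        if len(img_row) != len(lab_row):
            raise ValueError("image and labels must have the same width")
        stacked.append(
            [list(px) + one_hot(lab, num_classes) for px, lab in zip(img_row, lab_row)]
        )
    return stacked


def abs_rel_error(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean of |pred - truth| / truth over pixels with positive ground-truth depth."""
    terms = [abs(p - t) / t for p, t in zip(pred, truth) if t > 0]
    if not terms:
        raise ValueError("no pixels with valid ground-truth depth")
    return sum(terms) / len(terms)


def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Made-up data: a 1 x 2 image with 3 semantic classes.
image = [[(0.2, 0.4, 0.6), (0.1, 0.1, 0.1)]]
labels = [[0, 2]]
net_input = stack_image_and_labels(image, labels, num_classes=3)
print(len(net_input[0][0]))  # channels per pixel: 3 RGB + 3 one-hot

# Made-up depths (metres) for an image-only and an image-plus-label network.
truth = [10.0, 20.0, 40.0]
image_only = [12.5, 25.0, 50.0]
image_plus_label = [11.2, 22.4, 44.8]
gain = relative_improvement(
    abs_rel_error(image_only, truth), abs_rel_error(image_plus_label, truth)
)
print(round(gain, 2))
```

With these invented depths the image-only predictions are off by 25% everywhere and the image-plus-label predictions by 12%, so the relative improvement comes out at roughly 0.52. The numbers were picked to mirror the size of the figure quoted in the abstract and are not results from the thesis.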
Related papers
- Designing Deep Networks for Scene Recognition [3.493180651702109]
We conduct extensive experiments to demonstrate that the widely accepted principles in network design may result in dramatic performance differences when the data is altered.
This paper presents a novel network design methodology: data-oriented network design.
We propose a Deep-Narrow Network and Dilated Pooling module, which improved the scene recognition performance using less than half of the computational resources.
(2023-03-13T18:28:06Z)
- Advancing 3D finger knuckle recognition via deep feature learning [51.871256510747465]
Contactless 3D finger knuckle patterns have emerged as an effective biometric identifier due to their discriminativeness, visibility from a distance, and convenience.
Recent research has developed a deep feature collaboration network which simultaneously incorporates intermediate features from deep neural networks with multiple scales.
This paper advances this approach by investigating the possibility of learning a discriminative feature vector with the least possible dimension for representing 3D finger knuckle images.
(2023-01-07T20:55:16Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
(2022-02-01T17:11:13Z)
- X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation [69.9604394044652]
We propose a novel method to improve the self-supervised training of monocular depth via cross-task knowledge distillation.
During training, we utilize a pretrained semantic segmentation teacher network and transfer its semantic knowledge to the depth network.
We extensively evaluate the efficacy of our proposed approach on the KITTI benchmark and compare it with the latest state of the art.
(2021-10-24T19:47:14Z)
- On the Sins of Image Synthesis Loss for Self-supervised Depth Estimation [60.780823530087446]
We show that improvements in image synthesis do not necessitate improvement in depth estimation.
We attribute this diverging phenomenon to aleatoric uncertainties, which originate from data.
This observed divergence has not been previously reported or studied in depth.
(2021-09-13T17:57:24Z)
- Self-Guided Instance-Aware Network for Depth Completion and Enhancement [6.319531161477912]
Existing methods directly interpolate the missing depth measurements based on pixel-wise image content and the corresponding neighboring depth values.
We propose a novel self-guided instance-aware network (SG-IANet) that utilizes a self-guided mechanism to extract the instance-level features that are needed for depth restoration.
(2021-05-25T19:41:38Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
(2021-01-19T02:41:03Z)
- Learning a Geometric Representation for Data-Efficient Depth Estimation via Gradient Field and Contrastive Loss [29.798579906253696]
We propose a gradient-based self-supervised learning algorithm with momentum contrastive loss to help ConvNets extract the geometric information with unlabeled images.
Our method outperforms the previous state-of-the-art self-supervised learning algorithms and shows the efficiency of labeled data in triple.
(2020-11-06T06:47:19Z)
- Implicit Saliency in Deep Neural Networks [15.510581400494207]
In this paper, we show that existing recognition and localization deep architectures are capable of predicting the human visual saliency.
We calculate this implicit saliency using the expectancy-mismatch hypothesis in an unsupervised fashion.
Our experiments show that extracting saliency in this fashion provides comparable performance when measured against state-of-the-art supervised algorithms.
(2020-08-04T23:14:24Z)
- Seeing eye-to-eye? A comparison of object recognition performance in humans and deep convolutional neural networks under image manipulation [0.0]
This study aims towards a behavioral comparison of visual core object recognition performance between humans and feedforward neural networks.
Analyses of accuracy revealed that humans not only outperform DCNNs on all conditions, but also display significantly greater robustness towards shape and most notably color alterations.
(2020-07-13T10:26:30Z)
- Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets).
Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
(2020-05-15T23:57:36Z)