Related papers: Predicting Depth from Semantic Segmentation using Game Engine Dataset

URL: http://arxiv.org/abs/2106.15257v1
Date: Sat, 12 Jun 2021 10:15:40 GMT
Title: Predicting Depth from Semantic Segmentation using Game Engine Dataset
Authors: Mohammad Amin Kashi
Abstract summary: This thesis investigates the relation between object perception and depth estimation convolutional neural networks. We developed new network structures based on a simple depth estimation network that used only a single image as its input. Results show that our novel structures can improve the performance of depth estimation by 52% in terms of the relative distance error.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Depth perception is fundamental for robots to understand the surrounding environment. From the viewpoint of cognitive neuroscience, visual depth perception methods are divided into three categories, namely binocular, active, and pictorial. The first two categories have been studied in detail for decades. However, research on the third category is still in its infancy and has gained momentum with the advent of deep learning methods in recent years. In cognitive neuroscience, it is known that pictorial depth perception mechanisms depend on the perception of seen objects. Inspired by this fact, in this thesis we investigated the relation between object perception and depth estimation convolutional neural networks. For this purpose, we developed new network structures based on a simple depth estimation network that used only a single image as its input. Our proposed structures use both an image and a semantic label of the image as their input. We used semantic labels as the output of object perception. A performance comparison between the developed networks and the original network showed that our novel structures can improve the performance of depth estimation by 52% in terms of the relative distance error in the examined cases. Most of the experimental studies were carried out on synthetic datasets generated by game engines, to isolate the performance comparison from the effect of the inaccurate depth and semantic labels of non-synthetic datasets. It is shown that particular synthetic datasets may be used to train depth networks in cases where an appropriate dataset is not available. Furthermore, we showed that in these cases the use of semantic labels improves the robustness of the network against the domain shift from synthetic training data to non-synthetic test data.
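Illustrative sketch (not from the thesis): the abstract says the proposed structures take both an image and its semantic label as input, and reports results as a relative error of distance. The snippet below shows one simple way such an input could be assembled, by one-hot encoding the label map and concatenating it with the RGB channels, together with a common mean absolute relative error metric. The function names, the channel-concatenation scheme, and the exact metric definition are all assumptions made for illustration; the abstract does not specify how the thesis fuses the two inputs or defines its error.

```python
import numpy as np

def stack_image_and_labels(image, labels, num_classes):
    """Concatenate an RGB image with a one-hot semantic label map.

    image:  (H, W, 3) float array.
    labels: (H, W) integer array of class ids in [0, num_classes).
    Returns an (H, W, 3 + num_classes) array.
    Hypothetical fusion scheme; the thesis may combine inputs differently.
    """
    # Indexing an identity matrix with the label map yields one-hot vectors.
    one_hot = np.eye(num_classes, dtype=image.dtype)[labels]
    return np.concatenate([image, one_hot], axis=-1)

def mean_relative_error(pred, target, eps=1e-6):
    """Mean of |pred - target| / target over all pixels (assumed metric)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(np.abs(pred - target) / (target + eps)))

if __name__ == "__main__":
    image = np.zeros((2, 2, 3), dtype=np.float32)
    labels = np.array([[0, 1], [2, 1]])
    x = stack_image_and_labels(image, labels, num_classes=3)
    print(x.shape)
    print(mean_relative_error([1.1, 1.8], [1.0, 2.0]))
```

A network consuming this input would simply have 3 + num_classes input channels instead of 3; everything downstream of the first layer could stay unchanged under this assumed design.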

Related papers

An unsupervised tour through the hidden pathways of deep neural networks [6.063903439185316]
This thesis focuses on characterizing semantic content of hidden representations with unsupervised learning tools. In Chapter 3, we study the evolution of the probability density across the hidden layers in some state-of-the-art deep neural networks. In Chapter 4, we study the problem of generalization in deep neural networks.
arXiv Detail & Related papers (2025-10-24T15:50:31Z)
Semantic Depth Matters: Explaining Errors of Deep Vision Networks through Perceived Class Similarities [0.0]
We introduce a novel framework that investigates the relationship between the semantic hierarchy depth perceived by a network and its real-data misclassification patterns. We propose a graph-based visualization of model semantic relationships and misperceptions. Our approach reveals that deep vision networks encode specific semantic hierarchies and that high semantic depth improves the compliance between perceived class similarities and actual errors.
arXiv Detail & Related papers (2025-04-14T07:44:34Z)
Designing Deep Networks for Scene Recognition [3.493180651702109]
We conduct extensive experiments to demonstrate that the widely accepted principles in network design may result in dramatic performance differences when the data is altered. This paper presents a novel network design methodology: data-oriented network design. We propose a Deep-Narrow Network and Dilated Pooling module, which improved the scene recognition performance using less than half of the computational resources.
arXiv Detail & Related papers (2023-03-13T18:28:06Z)
Advancing 3D finger knuckle recognition via deep feature learning [51.871256510747465]
Contactless 3D finger knuckle patterns have emerged as an effective biometric identifier due to their discriminativeness, visibility from a distance, and convenience. Recent research has developed a deep feature collaboration network which simultaneously incorporates intermediate features from deep neural networks with multiple scales. This paper advances this approach by investigating the possibility of learning a discriminative feature vector with the least possible dimension for representing 3D finger knuckle images.
arXiv Detail & Related papers (2023-01-07T20:55:16Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation [69.9604394044652]
We propose a novel method to improve the self-supervised training of monocular depth via cross-task knowledge distillation. During training, we utilize a pretrained semantic segmentation teacher network and transfer its semantic knowledge to the depth network. We extensively evaluate the efficacy of our proposed approach on the KITTI benchmark and compare it with the latest state of the art.
arXiv Detail & Related papers (2021-10-24T19:47:14Z)
On the Sins of Image Synthesis Loss for Self-supervised Depth Estimation [60.780823530087446]
We show that improvements in image synthesis do not necessitate improvement in depth estimation. We attribute this diverging phenomenon to aleatoric uncertainties, which originate from data. This observed divergence has not been previously reported or studied in depth.
arXiv Detail & Related papers (2021-09-13T17:57:24Z)
Self-Guided Instance-Aware Network for Depth Completion and Enhancement [6.319531161477912]
Existing methods directly interpolate the missing depth measurements based on pixel-wise image content and the corresponding neighboring depth values. We propose a novel self-guided instance-aware network (SG-IANet) that utilizes a self-guided mechanism to extract the instance-level features needed for depth restoration.
arXiv Detail & Related papers (2021-05-25T19:41:38Z)
SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks. We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption. To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
Learning a Geometric Representation for Data-Efficient Depth Estimation via Gradient Field and Contrastive Loss [29.798579906253696]
We propose a gradient-based self-supervised learning algorithm with momentum contrastive loss to help ConvNets extract the geometric information with unlabeled images. Our method outperforms the previous state-of-the-art self-supervised learning algorithms and shows the efficiency of labeled data in triple.
arXiv Detail & Related papers (2020-11-06T06:47:19Z)
Implicit Saliency in Deep Neural Networks [15.510581400494207]
In this paper, we show that existing recognition and localization deep architectures are capable of predicting human visual saliency. We calculate this implicit saliency using the expectancy-mismatch hypothesis in an unsupervised fashion. Our experiments show that extracting saliency in this fashion provides comparable performance when measured against state-of-the-art supervised algorithms.
arXiv Detail & Related papers (2020-08-04T23:14:24Z)
Seeing eye-to-eye? A comparison of object recognition performance in humans and deep convolutional neural networks under image manipulation [0.0]
This study aims towards a behavioral comparison of visual core object recognition performance between humans and feedforward neural networks. Analyses of accuracy revealed that humans not only outperform DCNNs on all conditions, but also display significantly greater robustness towards shape and most notably color alterations.
arXiv Detail & Related papers (2020-07-13T10:26:30Z)
Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets). Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network". Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.