Self-Supervised Monocular Depth Estimation with Internal Feature Fusion
- URL: http://arxiv.org/abs/2110.09482v2
- Date: Wed, 20 Oct 2021 13:20:26 GMT
- Title: Self-Supervised Monocular Depth Estimation with Internal Feature Fusion
- Authors: Hang Zhou, David Greenwood, Sarah Taylor
- Abstract summary: Self-supervised learning for depth estimation uses geometry in image sequences for supervision.
We propose a novel depth estimation network, DIFFNet, which can make use of semantic information in the downsampling and upsampling procedures.
- Score: 12.874712571149725
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Self-supervised learning for depth estimation uses geometry in image
sequences for supervision and shows promising results. Like many computer
vision tasks, depth network performance is determined by the capability to
learn accurate spatial and semantic representations from images. Therefore, it
is natural to exploit semantic segmentation networks for depth estimation. In
this work, based on a well-developed semantic segmentation network HRNet, we
propose a novel depth estimation network, DIFFNet, which can make use of semantic
information in the downsampling and upsampling procedures. By applying feature fusion and
an attention mechanism, our proposed method outperforms the state-of-the-art
monocular depth estimation methods on the KITTI benchmark. Our method also
demonstrates greater potential on higher resolution training data. We propose
an additional extended evaluation strategy by establishing a test set of
challenging cases, empirically derived from the standard benchmark.
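Self-supervised depth methods of this kind are typically trained with a photometric reconstruction objective: the predicted depth and a relative camera pose are used to warp a neighboring frame into the current view, and the reconstruction error supervises the depth network. The geometric core of that warp can be sketched as below. This is a minimal, generic NumPy illustration of the standard reprojection step, not DIFFNet's actual implementation; real pipelines add bilinear sampling, SSIM, and smoothness terms.

```python
import numpy as np

def reproject(depth, K, T):
    """Map pixel coordinates of the target view into a source view.

    depth : (H, W) predicted depth for the target frame
    K     : (3, 3) camera intrinsics
    T     : (4, 4) relative pose, target -> source

    Returns (H, W, 2) sampling coordinates in the source image.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel grid, shape 3 x (H*W)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    # Back-project pixels to 3D camera coordinates using predicted depth
    cam = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    # Transform into the source camera and project with the intrinsics
    src = K @ (T @ cam_h)[:3]
    uv = (src[:2] / np.clip(src[2], 1e-6, None)).T.reshape(H, W, 2)
    return uv
```

With the identity pose the pixels map to themselves; a pure x-translation shifts them by the familiar disparity f·t/Z, which is what lets the photometric error carry a depth signal.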
Related papers
- ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation [62.600382533322325]
We propose a novel monocular depth estimation method called ScaleDepth.
Our method decomposes metric depth into scene scale and relative depth, and predicts them through a semantic-aware scale prediction module.
Our method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework.
arXiv Detail & Related papers (2024-07-11T05:11:56Z)
- Self-Supervised Learning based Depth Estimation from Monocular Images [0.0]
The goal of Monocular Depth Estimation is to predict the depth map, given a 2D monocular RGB image as input.
We plan to make use of intrinsic camera parameters during training and apply weather augmentations to further generalize our model.
arXiv Detail & Related papers (2023-04-14T07:14:08Z)
- Single Image Depth Prediction Made Better: A Multivariate Gaussian Take [163.14849753700682]
We introduce an approach that performs continuous modeling of per-pixel depth.
Our method (named MG) ranks among the top entries on the KITTI depth-prediction benchmark leaderboard.
arXiv Detail & Related papers (2023-03-31T16:01:03Z)
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning [53.78813049373321]
We propose a self-supervised learning method for pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.
Our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments.
arXiv Detail & Related papers (2022-03-10T12:28:42Z)
- X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation [69.9604394044652]
We propose a novel method to improve the self-supervised training of monocular depth via cross-task knowledge distillation.
During training, we utilize a pretrained semantic segmentation teacher network and transfer its semantic knowledge to the depth network.
We extensively evaluate the efficacy of our proposed approach on the KITTI benchmark and compare it with the latest state of the art.
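Cross-task distillation of this kind is commonly implemented as a divergence between the teacher's softened class probabilities and those of a segmentation head attached to the student's features. The sketch below is a generic, hypothetical version of such a loss in NumPy; X-Distill's actual formulation may differ.

```python
import numpy as np

def softened_probs(logits, tau):
    """Temperature-softened softmax over the last (class) axis."""
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_task_distill_loss(student_logits, teacher_logits, tau=2.0):
    """KL(teacher || student), averaged over pixels.

    student_logits : (H, W, C) logits from a hypothetical segmentation
                     head attached to the depth network's features
    teacher_logits : (H, W, C) logits from the frozen semantic teacher
    """
    p_t = softened_probs(teacher_logits, tau)
    p_s = softened_probs(student_logits, tau)
    kl = np.sum(p_t * (np.log(p_t + 1e-8) - np.log(p_s + 1e-8)), axis=-1)
    return float(kl.mean())
```

The loss is zero when the student exactly matches the teacher and positive otherwise, so minimizing it transfers the teacher's per-pixel semantic structure into the depth network's features.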
arXiv Detail & Related papers (2021-10-24T19:47:14Z)
- Fine-grained Semantics-aware Representation Enhancement for Self-supervised Monocular Depth Estimation [16.092527463250708]
We propose novel ideas to improve self-supervised monocular depth estimation.
We focus on incorporating implicit semantic knowledge into geometric representation enhancement.
We evaluate our methods on the KITTI dataset and demonstrate that our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-19T17:50:51Z)
- Towards Interpretable Deep Networks for Monocular Depth Estimation [78.84690613778739]
We quantify the interpretability of a deep MDE network by the depth selectivity of its hidden units.
We propose a method to train interpretable MDE deep networks without changing their original architectures.
Experimental results demonstrate that our method is able to enhance the interpretability of deep MDE networks.
arXiv Detail & Related papers (2021-08-11T16:43:45Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular Images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
- Semantic-Guided Representation Enhancement for Self-supervised Monocular Trained Depth Estimation [39.845944724079814]
Self-supervised depth estimation has shown its great effectiveness in producing high quality depth maps given only image sequences as input.
However, its performance usually drops when estimating on border areas or objects with thin structures due to the limited depth representation ability.
We propose a semantic-guided depth representation enhancement method, which promotes both local and global depth feature representations.
arXiv Detail & Related papers (2020-12-15T02:24:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.