SemHint-MD: Learning from Noisy Semantic Labels for Self-Supervised
Monocular Depth Estimation
- URL: http://arxiv.org/abs/2303.18219v1
- Date: Fri, 31 Mar 2023 17:20:27 GMT
- Title: SemHint-MD: Learning from Noisy Semantic Labels for Self-Supervised
Monocular Depth Estimation
- Authors: Shan Lin, Yuheng Zhi, and Michael C. Yip
- Abstract summary: Self-supervised depth estimation can be trapped in a local minimum due to the gradient-locality issue of the photometric loss.
We present a framework to enhance depth by leveraging semantic segmentation to guide the network to jump out of the local minimum.
- Score: 19.229255297016635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Without ground truth supervision, self-supervised depth estimation can be
trapped in a local minimum due to the gradient-locality issue of the
photometric loss. In this paper, we present a framework to enhance depth by
leveraging semantic segmentation to guide the network to jump out of the local
minimum. Prior works have proposed to share encoders between these two tasks or
explicitly align them based on priors like the consistency between edges in the
depth and segmentation maps. Yet, these methods usually require ground truth or
high-quality pseudo labels, which may not be easily accessible in real-world
applications. In contrast, we investigate self-supervised depth estimation
along with a segmentation branch that is supervised with noisy labels provided
by models pre-trained with limited data. We extend parameter sharing from the
encoder to the decoder and study the influence of different numbers of shared
decoder parameters on model performance. Also, we propose to use cross-task
information to refine current depth and segmentation predictions to generate
pseudo-depth and semantic labels for training. The advantages of the proposed
method are demonstrated through extensive experiments on the KITTI benchmark
and a downstream task for endoscopic tissue deformation tracking.
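The gradient-locality issue mentioned in the abstract can be illustrated with a toy 1D sketch. This is not the paper's implementation: the signal shape, the names `bump`, `sample`, `photometric_loss`, and `loss_gradient`, and the finite-difference step are all assumptions chosen for illustration. The target "image" holds one narrow bump, the source holds the same bump shifted by a true disparity of 10 pixels, and the loss is the mean L1 error after warping the source by a candidate disparity.

```python
import math

WIDTH = 64  # number of pixels in the toy 1D "image" (assumed)

def bump(center):
    """Triangular intensity bump of half-width 2 centred at `center`."""
    return [max(0.0, 1.0 - abs(x - center) / 2.0) for x in range(WIDTH)]

def sample(img, pos):
    """Linearly interpolate `img` at real-valued `pos`, clamping at the borders."""
    pos = min(max(pos, 0.0), len(img) - 1.0)
    i0 = int(math.floor(pos))
    i1 = min(i0 + 1, len(img) - 1)
    w = pos - i0
    return (1.0 - w) * img[i0] + w * img[i1]

def photometric_loss(target, source, disparity):
    """Mean L1 error between the target and the source warped by `disparity`."""
    warped = [sample(source, x + disparity) for x in range(len(target))]
    return sum(abs(t - s) for t, s in zip(target, warped)) / len(target)

def loss_gradient(target, source, disparity, eps=0.25):
    """Central finite-difference estimate of d(loss)/d(disparity)."""
    hi = photometric_loss(target, source, disparity + eps)
    lo = photometric_loss(target, source, disparity - eps)
    return (hi - lo) / (2.0 * eps)

TRUE_DISPARITY = 10.0  # hypothetical ground-truth shift between the two views
target = bump(20)
source = bump(20 + TRUE_DISPARITY)

# Compare a far-off guess, a nearby guess, and the true disparity.
for d in (0.0, 9.0, 10.0):
    print(d, photometric_loss(target, source, d), loss_gradient(target, source, d))
```

In this toy setup the loss is flat at disparity 0, where the warped bump and the target bump do not overlap, so the gradient there carries no information about which way to move. At disparity 9 the bumps overlap and the gradient is negative, pushing the estimate toward the true value of 10. A depth network starting far from the correct answer in a textureless region faces the same flat loss surface, which is the kind of situation where an extra signal such as semantic segmentation could help.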
Related papers
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised
Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
- GAM-Depth: Self-Supervised Indoor Depth Estimation Leveraging a
Gradient-Aware Mask and Semantic Constraints [12.426365333096264]
We propose GAM-Depth, developed upon two novel components: gradient-aware mask and semantic constraints.
The gradient-aware mask enables adaptive and robust supervision for both key areas and textureless regions.
The incorporation of semantic constraints for indoor self-supervised depth estimation improves depth discrepancies at object boundaries.
arXiv Detail & Related papers (2024-02-22T07:53:34Z)
- Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation [34.786268652516355]
Scene segmentation via unsupervised domain adaptation (UDA) enables the transfer of knowledge acquired from source synthetic data to real-world target data.
We propose a depth-aware framework to explicitly leverage depth estimation to mix the categories and facilitate the two complementary tasks, i.e., segmentation and depth learning.
In particular, the framework contains a Depth-guided Contextual Filter (DCF) for data augmentation and a cross-task encoder for contextual learning.
arXiv Detail & Related papers (2023-11-21T15:39:21Z)
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is highly competitive even with the fully supervised counterpart trained on 100% of the labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z)
- X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task
Distillation [69.9604394044652]
We propose a novel method to improve the self-supervised training of monocular depth via cross-task knowledge distillation.
During training, we utilize a pretrained semantic segmentation teacher network and transfer its semantic knowledge to the depth network.
We extensively evaluate the efficacy of our proposed approach on the KITTI benchmark and compare it with the latest state of the art.
arXiv Detail & Related papers (2021-10-24T19:47:14Z)
- Domain Adaptive Semantic Segmentation with Self-Supervised Depth
Estimation [84.34227665232281]
Domain adaptation for semantic segmentation aims to improve the model performance in the presence of a distribution shift between source and target domain.
We leverage the guidance from self-supervised depth estimation, which is available on both domains, to bridge the domain gap.
We demonstrate the effectiveness of our proposed approach on the benchmark tasks SYNTHIA-to-Cityscapes and GTA-to-Cityscapes.
arXiv Detail & Related papers (2021-04-28T07:47:36Z)
- Three Ways to Improve Semantic Segmentation with Self-Supervised Depth
Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical
Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
- Semantics-Driven Unsupervised Learning for Monocular Depth and
Ego-Motion Estimation [33.83396613039467]
We propose a semantics-driven unsupervised learning approach for monocular depth and ego-motion estimation from videos.
Recent unsupervised learning methods employ photometric errors between synthetic view and actual image as a supervision signal for training.
arXiv Detail & Related papers (2020-06-08T05:55:07Z)
- Semi-Supervised Semantic Segmentation with Cross-Consistency Training [8.894935073145252]
We present a novel cross-consistency based semi-supervised approach for semantic segmentation.
Our method achieves state-of-the-art results on several datasets.
arXiv Detail & Related papers (2020-03-19T20:10:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.