Three Ways to Improve Semantic Segmentation with Self-Supervised Depth
Estimation
- URL: http://arxiv.org/abs/2012.10782v2
- Date: Mon, 5 Apr 2021 09:46:36 GMT
- Title: Three Ways to Improve Semantic Segmentation with Self-Supervised Depth
Estimation
- Authors: Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Köring, Suman Saha, Luc
Van Gool
- Abstract summary: We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
- Score: 90.87105131054419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep networks for semantic segmentation requires large amounts of
labeled training data, which presents a major challenge in practice, as
labeling segmentation masks is a highly labor-intensive process. To address
this issue, we present a framework for semi-supervised semantic segmentation,
which is enhanced by self-supervised monocular depth estimation from unlabeled
image sequences. In particular, we propose three key contributions: (1) We
transfer knowledge from features learned during self-supervised depth
estimation to semantic segmentation, (2) we implement a strong data
augmentation by blending images and labels using the geometry of the scene, and
(3) we utilize the depth feature diversity as well as the level of difficulty
of learning depth in a student-teacher framework to select the most useful
samples to be annotated for semantic segmentation. We validate the proposed
model on the Cityscapes dataset, where all three modules demonstrate
significant performance gains, and we achieve state-of-the-art results for
semi-supervised semantic segmentation. The implementation is available at
https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth.
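The geometry-aware augmentation of contribution (2) can be sketched in a few lines. The following is a minimal NumPy illustration with hypothetical helper names, not the paper's actual implementation (which is in the linked repository): pixels from one image replace pixels of another only where they are closer to the camera, so the composite respects occlusion order instead of cutting arbitrary rectangles.

```python
import numpy as np

def depth_aware_mix(img_a, img_b, depth_a, depth_b, labels_a, labels_b):
    """Blend two images and their label maps using depth ordering.

    A pixel from image B is pasted over image A only where B's estimated
    depth is smaller (closer to the camera), so the mixed sample stays
    geometrically plausible.
    """
    closer = depth_b < depth_a                        # boolean mask, HxW
    mixed_img = np.where(closer[..., None], img_b, img_a)
    mixed_labels = np.where(closer, labels_b, labels_a)
    mixed_depth = np.minimum(depth_a, depth_b)
    return mixed_img, mixed_labels, mixed_depth

# Toy 2x2 example: B's left column is closer than A everywhere, so it wins.
img_a = np.zeros((2, 2, 3)); img_b = np.ones((2, 2, 3))
depth_a = np.full((2, 2), 5.0)
depth_b = np.array([[1.0, 9.0], [1.0, 9.0]])
lab_a = np.zeros((2, 2), int); lab_b = np.ones((2, 2), int)
mix_img, mix_lab, _ = depth_aware_mix(img_a, img_b, depth_a, depth_b, lab_a, lab_b)
```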
Related papers
- Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling [14.88236554564287]
In this work, we build upon advances in unsupervised learning by incorporating information about the structure of a scene into the training process.
We achieve this by (1) learning depth-feature correlation, spatially correlating the feature maps with the depth maps to induce knowledge about the structure of the scene.
We then implement farthest-point sampling to more effectively select relevant features by utilizing 3D sampling techniques on depth information of the scene.
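The farthest-point sampling step is a standard routine over 3D points (e.g. back-projected from depth); a minimal NumPy sketch, with function name and details illustrative rather than taken from the paper:

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedily pick k points so that each new pick is the point farthest
    from everything selected so far -- spreading samples evenly over the
    scene geometry.

    points: (N, 3) array of 3D coordinates.
    Returns the indices of the k selected points.
    """
    n = len(points)
    rng = np.random.default_rng(seed)
    selected = [rng.integers(n)]                 # arbitrary first pick
    dist = np.full(n, np.inf)
    for _ in range(k - 1):
        # distance from every point to the most recent pick
        d = np.linalg.norm(points - points[selected[-1]], axis=1)
        dist = np.minimum(dist, d)               # distance to nearest pick so far
        selected.append(int(dist.argmax()))      # farthest from all picks
    return np.array(selected)

pts = np.array([[0.0, 0, 0], [0.1, 0, 0], [10.0, 0, 0], [5.0, 0, 0]])
idx = farthest_point_sampling(pts, k=2, seed=0)  # two well-separated points
```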
arXiv Detail & Related papers (2023-09-21T11:47:01Z)
- AIMS: All-Inclusive Multi-Level Segmentation [93.5041381700744]
We propose a new task, All-Inclusive Multi-Level (AIMS), which segments visual regions into three levels: part, entity, and relation.
We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation.
arXiv Detail & Related papers (2023-05-28T16:28:49Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotation effort.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth [83.94528876742096]
We tackle the multi-task learning (MTL) problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth called AffineMix, and a simple depth augmentation using predicted semantics called ColorAug.
Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantics.
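The CCAM architecture itself is not detailed in this summary; as a generic stand-in for channel attention, a squeeze-and-excitation-style gate (weights and shapes here are illustrative assumptions, not the paper's module) looks like:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation-style channel attention: global-average
    pool each channel, pass the result through a small two-layer gate,
    and rescale the channels. (A generic illustration, NOT the paper's
    CCAM.)  feat: (C, H, W); w1: (C, C//r); w2: (C//r, C).
    """
    squeeze = feat.mean(axis=(1, 2))                # (C,) channel statistics
    hidden = np.maximum(squeeze @ w1, 0.0)          # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # sigmoid weights in (0, 1)
    return feat * gate[:, None, None]               # rescale each channel

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))
w1 = rng.normal(size=(8, 2)); w2 = rng.normal(size=(2, 8))
out = channel_attention(feat, w1, w2)               # same shape, gated channels
```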
arXiv Detail & Related papers (2022-06-21T17:40:55Z)
- X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation [69.9604394044652]
We propose a novel method to improve the self-supervised training of monocular depth via cross-task knowledge distillation.
During training, we utilize a pretrained semantic segmentation teacher network and transfer its semantic knowledge to the depth network.
We extensively evaluate the efficacy of our proposed approach on the KITTI benchmark and compare it with the latest state of the art.
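The cross-task distillation idea, a frozen segmentation teacher guiding the depth student's intermediate features, can be illustrated with a simple feature-alignment loss (a NumPy stand-in under assumed shapes; the paper's actual loss and networks may differ):

```python
import numpy as np

def distillation_loss(student_feat, teacher_feat):
    """Encourage the depth student's intermediate features to match a
    frozen semantic-segmentation teacher's features at the same layer.
    Both inputs: (C, H, W). Mean squared error is taken after per-channel
    L2 normalization, so only the feature *direction* is transferred.
    """
    def normalize(f):
        flat = f.reshape(f.shape[0], -1)
        norm = np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8
        return flat / norm

    s, t = normalize(student_feat), normalize(teacher_feat)
    return float(np.mean((s - t) ** 2))

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))
# Identical features give zero loss; unrelated features do not.
zero = distillation_loss(feat, feat)
nonzero = distillation_loss(feat, rng.normal(size=(8, 4, 4)))
```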
arXiv Detail & Related papers (2021-10-24T19:47:14Z)
- Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z)
- Point-supervised Segmentation of Microscopy Images and Volumes via Objectness Regularization [2.243486411968779]
This work enables the training of semantic segmentation networks on images with only a single training point per instance.
We achieve competitive results against the state-of-the-art in point-supervised semantic segmentation on challenging datasets in digital pathology.
arXiv Detail & Related papers (2021-03-09T18:40:00Z)
- A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation [0.9786690381850356]
We propose a holistic solution framed as a three-stage self-training framework for semantic segmentation.
The key idea of our technique is the extraction of statistical information from the pseudo-masks.
We then decrease the uncertainty of the pseudo-masks using a multi-task model that enforces consistency.
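A typical ingredient of such self-training pipelines is generating pseudo-masks only where the model is confident; a minimal sketch (the threshold and ignore index here are illustrative choices, not the paper's):

```python
import numpy as np

IGNORE = 255  # label value excluded from the loss, as in Cityscapes

def pseudo_mask(probs, threshold=0.9):
    """Turn per-pixel class probabilities (C, H, W) into a pseudo-label
    map, keeping only pixels whose top-class confidence exceeds the
    threshold; the rest are marked IGNORE and skipped during training.
    """
    conf = probs.max(axis=0)                 # top-class confidence per pixel
    labels = probs.argmax(axis=0)            # top class per pixel
    labels[conf < threshold] = IGNORE        # drop uncertain pixels
    return labels

# Two classes over a 2x2 image; only the left column is confident enough.
probs = np.array([
    [[0.95, 0.40], [0.10, 0.50]],   # class 0
    [[0.05, 0.60], [0.90, 0.50]],   # class 1
])
mask = pseudo_mask(probs, threshold=0.9)
```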
arXiv Detail & Related papers (2020-12-01T21:00:27Z)
- Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images [24.216869988183092]
We propose a shape-aware semi-supervised segmentation strategy to leverage abundant unlabeled data and to enforce a geometric shape constraint on the segmentation output.
We develop a multi-task deep network that jointly predicts the semantic segmentation and the signed distance map (SDM) of object surfaces.
Experiments show that our method outperforms current state-of-the-art approaches with improved shape estimation.
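A signed distance map can be derived from a binary segmentation mask as a regression target; below is a brute-force NumPy sketch for small 2D masks (a conventional SDM construction, not necessarily the paper's exact definition, which operates on 3D volumes):

```python
import numpy as np

def signed_distance_map(mask):
    """Signed Euclidean distance map for a small binary mask: negative
    inside the object, positive outside. Assumes the mask contains both
    object and background pixels. (A real implementation would use a
    fast distance transform instead of all-pairs distances.)
    mask: (H, W) array, 1 = object, 0 = background.
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    inside = mask.ravel().astype(bool)
    # pairwise distances between all pixel coordinates
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    sdm = np.empty(h * w)
    sdm[~inside] = d[~inside][:, inside].min(axis=1)    # +dist to object
    sdm[inside] = -d[inside][:, ~inside].min(axis=1)    # -dist to background
    return sdm.reshape(h, w)

mask = np.zeros((5, 5), int)
mask[1:4, 1:4] = 1                 # 3x3 square object in a 5x5 image
sdm = signed_distance_map(mask)    # negative at the center, positive at corners
```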
arXiv Detail & Related papers (2020-07-21T11:44:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.