Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of
Semantics and Depth
- URL: http://arxiv.org/abs/2206.10562v1
- Date: Tue, 21 Jun 2022 17:40:55 GMT
- Title: Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of
Semantics and Depth
- Authors: Nitin Bansal, Pan Ji, Junsong Yuan, Yi Xu
- Abstract summary: We tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called the Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth called AffineMix, and a simple depth augmentation using predicted semantics called ColorAug.
- Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantic segmentation.
- Score: 83.94528876742096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The multi-task learning (MTL) paradigm focuses on jointly learning two or more
tasks, aiming for significant improvements in a model's generalizability,
performance, and training/inference memory footprint. These benefits become
all the more indispensable in the case of joint training for vision-related
dense prediction tasks. In this work, we tackle the MTL problem of two dense
tasks, i.e., semantic segmentation and depth estimation, and present a novel
attention module called the Cross-Channel Attention Module (CCAM), which
facilitates effective feature sharing along each channel between the two
tasks, leading to mutual performance gains with a negligible increase in
trainable parameters. In a true symbiotic spirit, we then formulate a novel
data augmentation for the semantic segmentation task using predicted depth,
called AffineMix, and a simple depth augmentation using predicted semantics,
called ColorAug. Finally, we validate the performance gains of the proposed
method on the Cityscapes dataset, which helps us achieve state-of-the-art
results for a semi-supervised joint model based on depth and semantic
segmentation.
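The abstract does not spell out CCAM's internals; as a minimal NumPy sketch of the general idea it describes (per-channel feature sharing between the segmentation and depth branches, with only a few extra parameters), assuming simple sigmoid gates derived from global average pooling in place of whatever learned projection the paper actually uses:

```python
import numpy as np

def ccam(feat_seg, feat_depth):
    """Hypothetical sketch of a cross-channel attention module.

    feat_seg, feat_depth: (C, H, W) feature maps from the segmentation
    and depth branches. Each branch receives a per-channel gated mix of
    its own features and the other task's features.
    """
    # Global average pooling gives one descriptor per channel.
    d_seg = feat_seg.mean(axis=(1, 2))      # shape (C,)
    d_depth = feat_depth.mean(axis=(1, 2))  # shape (C,)
    # Per-channel gates in (0, 1); a learned projection would sit here.
    g_seg = 1.0 / (1.0 + np.exp(-d_depth))  # depth decides what seg borrows
    g_depth = 1.0 / (1.0 + np.exp(-d_seg))  # and vice versa
    out_seg = feat_seg + g_seg[:, None, None] * feat_depth
    out_depth = feat_depth + g_depth[:, None, None] * feat_seg
    return out_seg, out_depth

# Example: two random 8-channel feature maps.
s = np.random.rand(8, 4, 4)
d = np.random.rand(8, 4, 4)
out_s, out_d = ccam(s, d)
print(out_s.shape, out_d.shape)  # (8, 4, 4) (8, 4, 4)
```

This is a sketch of the cross-task gating pattern only, not the paper's implementation; the actual CCAM presumably learns its channel weights jointly with both task heads.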
Related papers
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised
Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation on NYU Depth V2 and KITTI, and in semantic segmentation on Cityscapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- IDEAL: Improved DEnse locAL Contrastive Learning for Semi-Supervised Medical Image Segmentation [3.6748639131154315]
We extend the concept of metric learning to the segmentation task.
We propose a simple convolutional projection head for obtaining dense pixel-level features.
A bidirectional regularization mechanism involving two-stream regularization training is devised for the downstream task.
arXiv Detail & Related papers (2022-10-26T23:11:02Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular Images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
- Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z)
- An unsupervised deep learning framework via integrated optimization of representation learning and GMM-based modeling [31.334196673143257]
This paper introduces a new principle of joint learning on both deep representations and GMM-based deep modeling.
In comparison with the existing work in similar areas, our objective function has two learning targets, which are created to be jointly optimized.
The compactness of clusters is significantly enhanced by reducing the intra-cluster distances, and the separability is improved by increasing the inter-cluster distances.
arXiv Detail & Related papers (2020-09-11T04:57:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.