Multi-task GANs for Semantic Segmentation and Depth Completion with
Cycle Consistency
- URL: http://arxiv.org/abs/2011.14272v1
- Date: Sun, 29 Nov 2020 04:12:16 GMT
- Title: Multi-task GANs for Semantic Segmentation and Depth Completion with
Cycle Consistency
- Authors: Chongzhen Zhang, Yang Tang, Chaoqiang Zhao, Qiyu Sun, Zhencheng Ye and
Jürgen Kurths
- Abstract summary: We propose multi-task generative adversarial networks (Multi-task GANs), which are competent in semantic segmentation and depth completion.
In this paper, we improve the details of generated semantic images based on CycleGAN by introducing multi-scale spatial pooling blocks and the structural similarity reconstruction loss.
Experiments on the Cityscapes dataset and the KITTI depth completion benchmark show that the Multi-task GANs are capable of achieving competitive performance.
- Score: 7.273142068778457
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic segmentation and depth completion are two challenging tasks in scene
understanding, and they are widely used in robotics and autonomous driving.
Although several works have been proposed to jointly train these two tasks using
small modifications, like changing the last layer, the result of one task is
not utilized to improve the performance of the other, even though the two tasks
share some similarities. In this paper, we propose multi-task
generative adversarial networks (Multi-task GANs), which are not only competent
in semantic segmentation and depth completion, but also improve the accuracy of
depth completion through generated semantic images. In addition, we improve the
details of generated semantic images based on CycleGAN by introducing
multi-scale spatial pooling blocks and the structural similarity reconstruction
loss. Furthermore, considering the inner consistency between semantic and
geometric structures, we develop a semantic-guided smoothness loss to improve
depth completion results. Extensive experiments on the Cityscapes dataset and the KITTI
depth completion benchmark show that the Multi-task GANs are capable of
achieving competitive performance for both semantic segmentation and depth
completion tasks.
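The abstract names a semantic-guided smoothness loss but does not give its formula. The following is only a minimal sketch of one plausible reading, assuming the loss penalizes depth differences between neighbouring pixels that share a semantic class and ignores differences across class boundaries. The function name, the list-of-lists inputs, and the 4-connected neighbourhood are all illustrative assumptions, not the authors' implementation.

```python
def semantic_guided_smoothness(depth, labels):
    """Hypothetical sketch, not the paper's definition.

    Returns the mean absolute depth difference over 4-connected neighbour
    pairs that share a semantic label. Pairs that straddle a label boundary
    are skipped, so depth is allowed to jump between different classes.

    depth  -- 2D list of floats (H x W), an assumed dense depth map
    labels -- 2D list of ints (H x W), assumed per-pixel class ids
    """
    height, width = len(depth), len(depth[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            # Right neighbour, counted only inside the same semantic region.
            if x + 1 < width and labels[y][x] == labels[y][x + 1]:
                total += abs(depth[y][x] - depth[y][x + 1])
                count += 1
            # Bottom neighbour, counted only inside the same semantic region.
            if y + 1 < height and labels[y][x] == labels[y + 1][x]:
                total += abs(depth[y][x] - depth[y + 1][x])
                count += 1
    return total / count if count else 0.0


# Toy example: the depth jump coincides with the class boundary.
toy_depth = [[1.0, 1.0, 5.0],
             [1.0, 1.0, 5.0]]
toy_labels = [[0, 0, 1],
              [0, 0, 1]]
boundary_aware = semantic_guided_smoothness(toy_depth, toy_labels)

# Same depth map with a single class: the jump is now penalized.
single_class = [[0, 0, 0],
                [0, 0, 0]]
unguided = semantic_guided_smoothness(toy_depth, single_class)
```

In the toy example the depth discontinuity lines up with the label boundary, so the guided term is zero, while treating the whole image as one class yields a positive penalty. A real training setup would compute this on tensors with a differentiable framework rather than on Python lists.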
Related papers
- Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting [49.87694319431288]
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources.
We propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs.
Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting.
arXiv Detail & Related papers (2024-06-28T10:05:58Z)
- Multi-task Learning with 3D-Aware Regularization [55.97507478913053]
We propose a structured 3D-aware regularizer which interfaces multiple tasks through the projection of features extracted from an image encoder to a shared 3D feature space.
We show that the proposed method is architecture agnostic and can be plugged into various prior multi-task backbones to improve their performance.
arXiv Detail & Related papers (2023-10-02T08:49:56Z)
- A Dynamic Feature Interaction Framework for Multi-task Visual Perception [100.98434079696268]
We devise an efficient unified framework to solve multiple common perception tasks.
These tasks include instance segmentation, semantic segmentation, monocular 3D detection, and depth estimation.
Our proposed framework, termed D2BNet, demonstrates a unique approach to parameter-efficient predictions for multi-task perception.
arXiv Detail & Related papers (2023-06-08T09:24:46Z)
- SemSegDepth: A Combined Model for Semantic Segmentation and Depth Completion [18.19171031755595]
We propose a new end-to-end model for performing semantic segmentation and depth completion jointly.
Our approach relies on RGB and sparse depth as inputs to our model and produces a dense depth map and the corresponding semantic segmentation image.
Experiments on the Virtual KITTI 2 dataset provide further evidence that combining both tasks, semantic segmentation and depth completion, in a multi-task network can effectively improve the performance of each task.
arXiv Detail & Related papers (2022-09-01T11:52:11Z)
- Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth [83.94528876742096]
We tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth called AffineMix, and a simple depth augmentation using predicted semantics called ColorAug.
Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantic
arXiv Detail & Related papers (2022-06-21T17:40:55Z)
- Empirical Study of Multi-Task Hourglass Model for Semantic Segmentation Task [0.7614628596146599]
We propose to use a multi-task approach by complementing the semantic segmentation task with edge detection, semantic contour, and distance transform tasks.
We demonstrate the effectiveness of learning in a multi-task setting for hourglass models in the Cityscapes, CamVid, and Freiburg Forest datasets.
arXiv Detail & Related papers (2021-05-28T01:08:10Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- A Multi-Task Deep Learning Framework for Building Footprint Segmentation [0.0]
We propose a joint optimization scheme for the task of building footprint delineation.
We also introduce two auxiliary tasks: image reconstruction and building footprint boundary segmentation.
In particular, we propose a deep multi-task learning (MTL) based unified fully convolutional framework.
arXiv Detail & Related papers (2021-04-19T15:07:27Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.