Towards Deeply Unified Depth-aware Panoptic Segmentation with
Bi-directional Guidance Learning
- URL: http://arxiv.org/abs/2307.14786v2
- Date: Mon, 14 Aug 2023 07:06:03 GMT
- Title: Towards Deeply Unified Depth-aware Panoptic Segmentation with
Bi-directional Guidance Learning
- Authors: Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng
Lan, Bin Luo, Yifeng Geng, Xuansong Xie
- Abstract summary: We propose a deeply unified framework for depth-aware panoptic segmentation.
We propose a bi-directional guidance learning approach to facilitate cross-task feature learning.
Our method sets the new state of the art for depth-aware panoptic segmentation on both Cityscapes-DVPS and SemKITTI-DVPS datasets.
- Score: 63.63516124646916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth-aware panoptic segmentation is an emerging topic in computer vision
which combines semantic and geometric understanding for more robust scene
interpretation. Recent works pursue unified frameworks to tackle this challenge
but mostly still treat it as two individual learning tasks, which limits their
potential for exploring cross-domain information. We propose a deeply unified
framework for depth-aware panoptic segmentation, which performs joint
segmentation and depth estimation both in a per-segment manner with identical
object queries. To narrow the gap between the two tasks, we further design a
geometric query enhancement method, which is able to integrate scene geometry
into object queries using latent representations. In addition, we propose a
bi-directional guidance learning approach to facilitate cross-task feature
learning by taking advantage of their mutual relations. Our method sets the new
state of the art for depth-aware panoptic segmentation on both Cityscapes-DVPS
and SemKITTI-DVPS datasets. Moreover, our guidance learning approach is shown
to deliver performance improvements even when supervision labels are incomplete.
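To make the per-segment formulation concrete, here is a minimal PyTorch-style sketch of joint mask and depth prediction from shared object queries (all module names, tensor shapes, and the depth-fusion rule are illustrative assumptions, not the authors' code):

```python
import torch
import torch.nn as nn

class JointSegDepthHead(nn.Module):
    """Sketch of per-segment prediction with shared object queries:
    each query yields a class, a mask, and a per-segment depth map."""
    def __init__(self, hidden_dim=256, num_classes=19):
        super().__init__()
        self.class_head = nn.Linear(hidden_dim, num_classes + 1)  # +1 for "no object"
        self.mask_embed = nn.Linear(hidden_dim, hidden_dim)       # mask kernel per query
        self.depth_embed = nn.Linear(hidden_dim, hidden_dim)      # depth kernel per query

    def forward(self, queries, mask_features, depth_features):
        # queries: (B, Q, C); *_features: (B, C, H, W)
        logits = self.class_head(queries)                                 # (B, Q, K+1)
        masks = torch.einsum("bqc,bchw->bqhw",
                             self.mask_embed(queries), mask_features)     # (B, Q, H, W)
        depths = torch.einsum("bqc,bchw->bqhw",
                              self.depth_embed(queries), depth_features)  # (B, Q, H, W)
        # Fuse per-segment depths into one map, weighted by the soft masks.
        weights = masks.sigmoid()
        depth_map = (weights * depths).sum(1) / weights.sum(1).clamp(min=1e-6)
        return logits, masks, depth_map
```

The key point the sketch illustrates is that segmentation and depth share the same query embeddings, so both tasks reason about the same set of segments.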
Related papers
- SwinMTL: A Shared Architecture for Simultaneous Depth Estimation and Semantic Segmentation from Monocular Camera Images [4.269350826756809]
This research paper presents an innovative multi-task learning framework that enables concurrent depth estimation and semantic segmentation from a single camera.
The proposed approach is based on a shared encoder-decoder architecture, which integrates various techniques to improve the accuracy of the depth estimation and semantic segmentation tasks without compromising computational efficiency.
The framework is thoroughly evaluated on two datasets - the outdoor Cityscapes dataset and the indoor NYU Depth V2 dataset - and it outperforms existing state-of-the-art methods in both segmentation and depth estimation tasks.
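A minimal sketch of the shared encoder-decoder idea, assuming a generic backbone (layer names and sizes are illustrative, not the SwinMTL implementation):

```python
import torch.nn as nn

class SharedMultiTaskNet(nn.Module):
    """Illustrative shared encoder-decoder with two lightweight task heads."""
    def __init__(self, backbone, num_classes=19, channels=256):
        super().__init__()
        self.backbone = backbone  # assumed: returns (B, channels, H/4, W/4) features
        self.decoder = nn.Sequential(  # shared decoder trunk for both tasks
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.seg_head = nn.Conv2d(channels, num_classes, 1)  # per-pixel class logits
        self.depth_head = nn.Conv2d(channels, 1, 1)          # per-pixel depth

    def forward(self, x):
        feats = self.decoder(self.backbone(x))
        return self.seg_head(feats), self.depth_head(feats).sigmoid()  # bounded depth
```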
arXiv Detail & Related papers (2024-03-15T20:04:27Z)
- X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation [69.9604394044652]
We propose a novel method to improve the self-supervised training of monocular depth via cross-task knowledge distillation.
During training, we utilize a pretrained semantic segmentation teacher network and transfer its semantic knowledge to the depth network.
We extensively evaluate the efficacy of our proposed approach on the KITTI benchmark and compare it with the latest state of the art.
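A hedged sketch of what such cross-task distillation can look like, assuming a frozen segmentation teacher and a small depth-to-semantics translator network (both hypothetical here, not the X-Distill code):

```python
import torch
import torch.nn.functional as F

def cross_task_distillation_loss(depth_pred, image, teacher, translator):
    """Sketch: a translator maps predicted depth to semantic logits and is
    trained to match a frozen segmentation teacher; gradients flow back
    into the depth network through depth_pred."""
    with torch.no_grad():
        teacher_labels = teacher(image).argmax(dim=1)  # (B, H, W) hard labels
    student_logits = translator(depth_pred)            # depth -> pseudo-semantics
    return F.cross_entropy(student_logits, teacher_labels)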
arXiv Detail & Related papers (2021-10-24T19:47:14Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
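The pseudo-label step can be sketched as follows (the confidence threshold and ignore-index convention are assumptions, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, target_images, threshold=0.9):
    """Sketch of one self-learning step: confident predictions on the
    unlabeled target domain are reused as pseudo ground truth."""
    with torch.no_grad():
        probs = F.softmax(model(target_images), dim=1)  # (B, K, H, W)
        conf, pseudo = probs.max(dim=1)                 # per-pixel confidence, labels
        pseudo[conf < threshold] = 255                  # mask out uncertain pixels
    logits = model(target_images)                       # second pass, with gradients
    return F.cross_entropy(logits, pseudo, ignore_index=255)
```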
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular Images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship between these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
- Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three proposed modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z)
- Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z)
- The Edge of Depth: Explicit Constraints between Segmentation and Depth [25.232436455640716]
We study the mutual benefits of two common computer vision tasks, self-supervised depth estimation and semantic segmentation from images.
We propose to explicitly measure the border consistency between segmentation and depth and minimize it.
Through extensive experiments, our proposed approach advances the state of the art in unsupervised monocular depth estimation on the KITTI benchmark.
arXiv Detail & Related papers (2020-04-01T00:03:20Z)
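One plausible form of such a border-consistency term, written as an edge-aware penalty (a sketch under assumed tensor shapes, not the paper's exact loss):

```python
import torch

def edge_consistency_loss(depth, seg_probs):
    """Illustrative border-consistency penalty: depth discontinuities are
    discouraged wherever the segmentation sees no class boundary.
    depth: (B, 1, H, W); seg_probs: (B, K, H, W) softmax probabilities."""
    d_dx = (depth[:, :, :, 1:] - depth[:, :, :, :-1]).abs()
    d_dy = (depth[:, :, 1:, :] - depth[:, :, :-1, :]).abs()
    s_dx = (seg_probs[:, :, :, 1:] - seg_probs[:, :, :, :-1]).abs().sum(1, keepdim=True)
    s_dy = (seg_probs[:, :, 1:, :] - seg_probs[:, :, :-1, :]).abs().sum(1, keepdim=True)
    # Weight depth edges by how "boundary-free" the segmentation is there.
    return (d_dx * torch.exp(-s_dx)).mean() + (d_dy * torch.exp(-s_dy)).mean()
```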