CI-Net: Contextual Information for Joint Semantic Segmentation and Depth
Estimation
- URL: http://arxiv.org/abs/2107.13800v1
- Date: Thu, 29 Jul 2021 07:58:25 GMT
- Title: CI-Net: Contextual Information for Joint Semantic Segmentation and Depth
Estimation
- Authors: Tianxiao Gao, Wu Wei, Zhongbin Cai, Zhun Fan, Shane Xie, Xinmei Wang,
Qiuda Yu
- Abstract summary: We propose a network injected with contextual information (CI-Net) to solve the problem.
With supervision from semantic labels, the network is embedded with contextual information so that it can better understand the scene.
We evaluate the proposed CI-Net on the NYU-Depth-v2 and SUN-RGBD datasets.
- Score: 2.8785764686013837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Monocular depth estimation and semantic segmentation are two fundamental
goals of scene understanding. Because the two tasks can benefit from each other,
many works study joint task learning algorithms. However, most existing methods
fail to fully leverage the semantic labels, ignoring the contextual structure
they provide and using them only to supervise the prediction of the segmentation
branch. In this paper, we propose a network injected with contextual information
(CI-Net) to solve this problem. Specifically, we introduce a self-attention block
in the encoder to generate attention maps. Supervised by ground truth created
from the semantic labels, the network is embedded with contextual information
so that it understands the scene better and utilizes dependent features to make
accurate predictions. In addition, a feature sharing module is constructed to
deeply fuse the task-specific features, and a consistency loss is devised to
make the features mutually guide each other. We evaluate the proposed CI-Net on
the NYU-Depth-v2 and SUN-RGBD datasets. The experimental results show that the
proposed CI-Net is competitive with state-of-the-art methods.
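The abstract's core idea, supervising an encoder's attention map with a target built from semantic labels so that pixels attend to others of the same class, can be illustrated with a minimal sketch. This is not the authors' implementation; all function names are hypothetical, features are treated as a flat list of per-pixel vectors, and the loss is a plain mean squared error stand-in for whatever supervision the paper actually uses.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_map(features):
    """Pairwise scaled dot-product attention over N per-pixel feature vectors."""
    n = len(features)
    d = len(features[0])
    rows = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(d)
                  for j in range(n)]
        rows.append(softmax(scores))
    return rows

def ideal_attention(labels):
    """Ground-truth attention built from semantic labels: pixel i attends
    uniformly to all pixels sharing its class, and not at all to others."""
    n = len(labels)
    rows = []
    for i in range(n):
        mask = [1.0 if labels[j] == labels[i] else 0.0 for j in range(n)]
        s = sum(mask)
        rows.append([m / s for m in mask])
    return rows

def attention_loss(pred, target):
    """Mean squared error between predicted and ideal attention maps."""
    n = len(pred)
    return sum((p - t) ** 2
               for pr, tr in zip(pred, target)
               for p, t in zip(pr, tr)) / (n * n)
```

Minimizing `attention_loss` pushes the encoder's attention toward the class structure of the scene, which is the sense in which the network is "embedded with contextual information".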
Related papers
- Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation [9.832150567595718]
We present the first interactive framework for point cloud semantic segmentation, named InterPCSeg.
We develop an interaction simulation scheme tailored for the interactive point cloud semantic segmentation task.
We evaluate our framework on the S3DIS and ScanNet datasets with off-the-shelf segmentation networks.
arXiv Detail & Related papers (2024-03-11T03:24:58Z)
- COMNet: Co-Occurrent Matching for Weakly Supervised Semantic Segmentation [13.244183864948848]
We propose a novel Co-Occurrent Matching Network (COMNet), which improves the quality of the CAMs and encourages the network to attend to entire objects.
Specifically, we perform inter-matching on paired images that contain common classes to enhance the corresponding areas, and construct intra-matching on a single image to propagate semantic features across object regions.
Experiments on the Pascal VOC 2012 and MS-COCO datasets show that our network effectively boosts the performance of the baseline model and achieves new state-of-the-art performance.
arXiv Detail & Related papers (2023-09-29T03:55:24Z) - Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z) - Improving Lidar-Based Semantic Segmentation of Top-View Grid Maps by
Learning Features in Complementary Representations [3.0413873719021995]
We introduce a novel way to predict semantic information from sparse, single-shot LiDAR measurements in the context of autonomous driving.
The approach is aimed specifically at improving the semantic segmentation of top-view grid maps.
For each representation a tailored deep learning architecture is developed to effectively extract semantic information.
arXiv Detail & Related papers (2022-03-02T14:49:51Z) - DIAL: Deep Interactive and Active Learning for Semantic Segmentation in
Remote Sensing [34.209686918341475]
We propose to build up a collaboration between a deep neural network and a human in the loop.
In a nutshell, the agent iteratively interacts with the network to correct its initially flawed predictions.
We show that active learning based on uncertainty estimation quickly guides the user towards the network's mistakes.
arXiv Detail & Related papers (2022-01-04T09:11:58Z) - Triggering Failures: Out-Of-Distribution detection by learning from
local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet, associated with a dedicated training scheme based on Local Adversarial Attacks (LAA).
We show that it obtains top performance in both speed and accuracy compared to ten recent methods from the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z) - Cross-modal Consensus Network for Weakly Supervised Temporal Action
Localization [74.34699679568818]
Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to localize action instances in the given video with video-level categorical supervision.
We propose a cross-modal consensus network (CO2-Net) to tackle this problem.
arXiv Detail & Related papers (2021-07-27T04:21:01Z) - CTNet: Context-based Tandem Network for Semantic Segmentation [77.4337867789772]
This work proposes a novel Context-based Tandem Network (CTNet) by interactively exploring the spatial contextual information and the channel contextual information.
To further improve the performance of the learned representations for semantic segmentation, the results of the two context modules are adaptively integrated.
arXiv Detail & Related papers (2021-04-20T07:33:11Z) - SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from
Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z) - Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and intermodular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.