TriangleNet: Edge Prior Augmented Network for Semantic Segmentation
through Cross-Task Consistency
- URL: http://arxiv.org/abs/2210.05152v5
- Date: Wed, 30 Aug 2023 14:24:46 GMT
- Title: TriangleNet: Edge Prior Augmented Network for Semantic Segmentation
through Cross-Task Consistency
- Authors: Dan Zhang, Rui Zheng, Luosang Gadeng, Pei Yang
- Abstract summary: This paper addresses the task of semantic segmentation in computer vision, aiming to achieve precise pixel-wise classification.
We propose a novel "decoupled cross-task consistency loss" that explicitly enhances cross-task consistency.
Our semantic segmentation network, TriangleNet, achieves a substantial 2.88% improvement over the Baseline in mean Intersection over Union (mIoU) on the Cityscapes test set.
- Score: 10.92477003580794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the task of semantic segmentation in computer vision,
aiming to achieve precise pixel-wise classification. We investigate the joint
training of models for semantic edge detection and semantic segmentation, which
has shown promise. However, implicit cross-task consistency learning in
multi-task networks is limited. To address this, we propose a novel "decoupled
cross-task consistency loss" that explicitly enhances cross-task consistency.
Our semantic segmentation network, TriangleNet, achieves a substantial 2.88%
improvement over the Baseline in mean Intersection over Union (mIoU) on the
Cityscapes test set. Notably, TriangleNet operates at 77.4% mIoU/46.2 FPS on
Cityscapes, showcasing real-time inference capabilities at full resolution.
With multi-scale inference, performance is further enhanced to 77.8%.
Furthermore, TriangleNet consistently outperforms the Baseline on the FloodNet
dataset, demonstrating its robust generalization capabilities. The proposed
method underscores the significance of multi-task learning and explicit
cross-task consistency enhancement for advancing semantic segmentation and
highlights the potential of multitasking in real-time semantic segmentation.
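For intuition, a minimal sketch of one way an explicit consistency term between a segmentation head and a semantic edge head could be implemented is given below. The boundary extraction, the L1 penalty, and all names are illustrative assumptions, not the paper's decoupled cross-task consistency loss.

```python
# Hypothetical sketch: penalize disagreement between boundaries implied by the
# segmentation branch and the edge branch's prediction (names/formulation assumed).
import torch
import torch.nn.functional as F


def boundaries_from_segmentation(seg_logits: torch.Tensor) -> torch.Tensor:
    """Derive a soft boundary map (N, 1, H, W) from segmentation logits (N, C, H, W)."""
    probs = F.softmax(seg_logits, dim=1)
    # Spatial gradients of class probabilities; large gradients indicate class changes.
    dy = (probs[:, :, 1:, :] - probs[:, :, :-1, :]).abs().sum(dim=1, keepdim=True)
    dx = (probs[:, :, :, 1:] - probs[:, :, :, :-1]).abs().sum(dim=1, keepdim=True)
    dy = F.pad(dy, (0, 0, 0, 1))  # pad height back to H
    dx = F.pad(dx, (0, 1, 0, 0))  # pad width back to W
    return torch.clamp(dx + dy, 0.0, 1.0)


def cross_task_consistency_loss(seg_logits: torch.Tensor,
                                edge_logits: torch.Tensor) -> torch.Tensor:
    """Encourage edges implied by the segmentation to agree with the edge branch."""
    seg_boundary = boundaries_from_segmentation(seg_logits)
    edge_prob = torch.sigmoid(edge_logits)
    return F.l1_loss(seg_boundary, edge_prob)


if __name__ == "__main__":
    seg = torch.randn(2, 19, 64, 128)   # e.g. 19 Cityscapes classes
    edge = torch.randn(2, 1, 64, 128)
    print(cross_task_consistency_loss(seg, edge).item())
```

In a joint training setup, such a term would typically be added to the standard segmentation and edge supervision losses with a weighting factor.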
Related papers
- A Point-Based Approach to Efficient LiDAR Multi-Task Perception [49.91741677556553]
PAttFormer is an efficient multi-task architecture for joint semantic segmentation and object detection in point clouds.
Unlike other LiDAR-based multi-task architectures, our proposed PAttFormer does not require separate feature encoders for task-specific point cloud representations.
Our evaluations show substantial gains from multi-task learning, improving LiDAR semantic segmentation by +1.7% in mIoU and 3D object detection by +1.7% in mAP.
arXiv Detail & Related papers (2024-04-19T11:24:34Z)
- The revenge of BiSeNet: Efficient Multi-Task Image Segmentation [6.172605433695617]
BiSeNetFormer is a novel architecture for efficient multi-task image segmentation.
By seamlessly supporting multiple tasks, BiSeNetFormer offers a versatile solution for multi-task segmentation.
Our results indicate that BiSeNetFormer represents a significant advancement towards fast, efficient, and multi-task segmentation networks.
arXiv Detail & Related papers (2024-04-15T08:32:18Z)
- HS3: Learning with Proper Task Complexity in Hierarchically Supervised Semantic Segmentation [81.87943324048756]
We propose Hierarchically Supervised Semantic Segmentation (HS3), a training scheme that supervises intermediate layers in a segmentation network to learn meaningful representations by varying task complexity.
Our proposed HS3-Fuse framework further improves segmentation predictions and achieves state-of-the-art results on two large segmentation benchmarks: NYUD-v2 and Cityscapes.
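As a hypothetical illustration of intermediate supervision at reduced task complexity (not HS3 itself), the sketch below trains an intermediate head on a coarser label grouping while the final head uses the full label set; the class mapping and loss weight are assumptions.

```python
# Hypothetical deep-supervision sketch with a lower-complexity intermediate task.
import torch
import torch.nn.functional as F

# Assumed mapping from 6 fine classes to 3 coarse groups for the intermediate head.
FINE_TO_COARSE = torch.tensor([0, 0, 1, 1, 2, 2])


def hierarchical_supervision_loss(inter_logits, final_logits, fine_labels):
    """inter_logits: (N, 3, H, W), final_logits: (N, 6, H, W), fine_labels: (N, H, W)."""
    coarse_labels = FINE_TO_COARSE[fine_labels]   # reduce task complexity for the early head
    loss_inter = F.cross_entropy(inter_logits, coarse_labels)
    loss_final = F.cross_entropy(final_logits, fine_labels)
    return loss_final + 0.4 * loss_inter          # weighting is an arbitrary choice


if __name__ == "__main__":
    labels = torch.randint(0, 6, (2, 32, 32))
    print(hierarchical_supervision_loss(torch.randn(2, 3, 32, 32),
                                        torch.randn(2, 6, 32, 32),
                                        labels).item())
```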
arXiv Detail & Related papers (2021-11-03T16:33:29Z)
- Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation [88.49669148290306]
We propose a novel weakly supervised multi-task framework called AuxSegNet to leverage saliency detection and multi-label image classification as auxiliary tasks.
Inspired by their similar structured semantics, we also propose to learn a cross-task global pixel-level affinity map from the saliency and segmentation representations.
The learned cross-task affinity can be used to refine saliency predictions and propagate CAM maps to provide improved pseudo labels for both tasks.
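For intuition only, the following hypothetical snippet shows affinity-based propagation of class activation maps (CAMs), in the spirit of the cross-task affinity described above; the matrix-multiplication scheme and names are assumptions, not AuxSegNet's code.

```python
# Rough illustration of refining CAMs with a row-normalized pixel-affinity matrix.
import torch


def refine_cam_with_affinity(cam: torch.Tensor,
                             affinity: torch.Tensor,
                             steps: int = 2) -> torch.Tensor:
    """cam: (C, H, W) class activation maps; affinity: (H*W, H*W) row-normalized."""
    c, h, w = cam.shape
    flat = cam.reshape(c, h * w)      # (C, HW)
    for _ in range(steps):            # random-walk style propagation
        flat = flat @ affinity.t()    # each pixel aggregates values of affine neighbours
    return flat.reshape(c, h, w)


if __name__ == "__main__":
    cam = torch.rand(21, 16, 16)
    aff = torch.rand(256, 256)
    aff = aff / aff.sum(dim=1, keepdim=True)  # row-normalize so propagation averages
    print(refine_cam_with_affinity(cam, aff).shape)  # torch.Size([21, 16, 16])
```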
arXiv Detail & Related papers (2021-07-25T11:39:58Z)
- Empirical Study of Multi-Task Hourglass Model for Semantic Segmentation Task [0.7614628596146599]
We propose to use a multi-task approach by complementing the semantic segmentation task with edge detection, semantic contour, and distance transform tasks.
We demonstrate the effectiveness of learning in a multi-task setting for hourglass models in the Cityscapes, CamVid, and Freiburg Forest datasets.
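As a hedged illustration of how such auxiliary targets can be derived from an existing segmentation mask (not code from the paper), the snippet below builds an edge map and a distance-transform map with NumPy/SciPy.

```python
# Hypothetical helper: derive edge and distance-transform targets from a label map.
import numpy as np
from scipy import ndimage


def auxiliary_targets(mask: np.ndarray):
    """mask: (H, W) integer label map -> (edge map, distance transform), both (H, W)."""
    # Edge target: pixels whose label differs from an adjacent pixel.
    up = np.zeros_like(mask, dtype=bool)
    left = np.zeros_like(mask, dtype=bool)
    up[1:, :] = mask[1:, :] != mask[:-1, :]
    left[:, 1:] = mask[:, 1:] != mask[:, :-1]
    edge = (up | left).astype(np.float32)
    # Distance-transform target: distance of each pixel to the nearest boundary pixel.
    dist = ndimage.distance_transform_edt(edge == 0).astype(np.float32)
    return edge, dist


if __name__ == "__main__":
    toy = np.zeros((8, 8), dtype=np.int64)
    toy[:, 4:] = 1
    e, d = auxiliary_targets(toy)
    print(e.sum(), d.max())
```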
arXiv Detail & Related papers (2021-05-28T01:08:10Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
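A minimal sketch of confidence-thresholded pseudo-labels for target-domain self-training, in the spirit of the ISL scheme mentioned above; the threshold, ignore index, and function names are assumptions rather than the paper's implementation.

```python
# Hypothetical self-training sketch: keep only confident target-domain predictions.
import torch
import torch.nn.functional as F

IGNORE_INDEX = 255  # pixels below the confidence threshold are excluded from the loss


def pseudo_labels(logits: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """logits: (N, C, H, W) target-domain predictions -> (N, H, W) pseudo-label map."""
    probs = F.softmax(logits, dim=1)
    conf, labels = probs.max(dim=1)
    labels[conf < threshold] = IGNORE_INDEX   # keep only confident pixels
    return labels


def self_training_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    return F.cross_entropy(logits, labels, ignore_index=IGNORE_INDEX)


if __name__ == "__main__":
    target_logits = torch.randn(2, 19, 32, 64)
    # Low threshold here only so the toy example keeps some pixels.
    pl = pseudo_labels(target_logits.detach(), threshold=0.2)
    print(self_training_loss(target_logits, pl).item())
```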
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well on well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds [37.703770427574476]
In this paper, we tackle the 3D semantic edge detection task for the first time.
We present a new two-stream fully-convolutional network that jointly performs the two tasks.
In particular, we design a joint refinement module that explicitly wires region information and edge information to improve the performances of both tasks.
arXiv Detail & Related papers (2020-07-14T08:00:35Z)