Masked Supervised Learning for Semantic Segmentation
- URL: http://arxiv.org/abs/2210.00923v1
- Date: Mon, 3 Oct 2022 13:30:19 GMT
- Title: Masked Supervised Learning for Semantic Segmentation
- Authors: H. Zunair and A. Ben Hamza
- Abstract summary: Masked Supervised Learning (MaskSup) is an effective single-stage learning paradigm that models both short- and long-range context.
We show that the proposed method is computationally efficient, improving the mean intersection-over-union (mIoU) by 10% while requiring 3× fewer learnable parameters.
- Score: 5.177947445379688
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Self-attention is of vital importance in semantic segmentation as it enables
modeling of long-range context, which translates into improved performance. We
argue that it is equally important to model short-range context, especially to
tackle cases where not only are the regions of interest small and ambiguous,
but there is also an imbalance between the semantic classes. To this
end, we propose Masked Supervised Learning (MaskSup), an effective single-stage
learning paradigm that models both short- and long-range context, capturing the
contextual relationships between pixels via random masking. Experimental
results demonstrate the competitive performance of MaskSup against strong
baselines in both binary and multi-class segmentation tasks on three standard
benchmark datasets, particularly at handling ambiguous regions and retaining
better segmentation of minority classes with no added inference cost. In
addition to segmenting target regions even when large portions of the input are
masked, MaskSup is also generic and can be easily integrated into a variety of
semantic segmentation methods. We also show that the proposed method is
computationally efficient, improving the mean intersection-over-union (mIoU)
by 10\% while requiring $3\times$ fewer learnable parameters.
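The abstract names two ingredients that a short example can make concrete: random masking of the input and the mIoU metric. The following is a minimal sketch, not the authors' implementation. The function names `random_patch_mask` and `mean_iou`, the patch size, the masking ratio, and the zero fill value are all illustrative assumptions; the abstract does not specify any of them. The mIoU here is computed on a single pair of label maps, whereas benchmark numbers are usually accumulated over a whole dataset.

```python
import numpy as np


def random_patch_mask(image, patch=4, ratio=0.5, seed=0):
    """Zero out a random subset of non-overlapping patch x patch blocks.

    Assumption: patch size, ratio, and the zero fill value are illustrative
    choices, not settings taken from the paper.
    Returns the masked image and the boolean pixel mask of dropped pixels.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    gh, gw = h // patch, w // patch
    # One keep/drop decision per patch, expanded to pixel resolution.
    drop_grid = rng.random((gh, gw)) < ratio
    drop_pixels = np.zeros((h, w), dtype=bool)
    drop_pixels[: gh * patch, : gw * patch] = (
        drop_grid.repeat(patch, axis=0).repeat(patch, axis=1)
    )
    masked = image.copy()
    masked[drop_pixels] = 0  # applies to every channel if image is (h, w, c)
    return masked, drop_pixels


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two integer label maps.

    Classes absent from both maps are skipped so they do not distort the mean.
    """
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


# Usage: mask an 8x8 RGB image and score a toy 2x2 prediction.
image = np.ones((8, 8, 3))
masked, dropped = random_patch_mask(image, patch=4, ratio=0.5, seed=0)
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, target, num_classes=2)  # (1/2 + 2/3) / 2 = 7/12
```

In a training loop built on this idea, the masked image would be fed to the segmentation network alongside the original, so the model has to infer labels for hidden regions from the surrounding context; how the two branches are combined is left out here because the abstract does not describe it.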
Related papers
- Effective SAM Combination for Open-Vocabulary Semantic Segmentation [24.126307031048203]
Open-vocabulary semantic segmentation aims to assign pixel-level labels to images across an unlimited range of classes.
ESC-Net is a novel one-stage open-vocabulary segmentation model that leverages the SAM decoder blocks for class-agnostic segmentation.
ESC-Net achieves superior performance on standard benchmarks, including ADE20K, PASCAL-VOC, and PASCAL-Context.
arXiv Detail & Related papers (2024-11-22T04:36:12Z)
- Synthetic Instance Segmentation from Semantic Image Segmentation Masks [15.477053085267404]
We propose a novel paradigm called Synthetic Instance Segmentation (SISeg).
SISeg obtains instance segmentation results by leveraging image masks generated by existing semantic segmentation models.
In other words, the proposed model does not need extra manpower or higher computational expenses.
arXiv Detail & Related papers (2023-08-02T05:13:02Z)
- Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and +2% inference time, decent performance gain has been achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z)
- Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation [75.00151934315967]
MaskDistill is a novel framework for unsupervised semantic segmentation.
Our framework does not latch onto low-level image cues and is not limited to object-centric datasets.
arXiv Detail & Related papers (2022-06-13T17:59:43Z)
- Per-Pixel Classification is Not All You Need for Semantic Segmentation [184.2905747595058]
Mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks.
We propose MaskFormer, a simple mask classification model which predicts a set of binary masks.
Our method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
arXiv Detail & Related papers (2021-07-13T17:59:50Z)
- PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation [96.76882806139251]
We propose a point-wise affinity propagation module based on the Feature Pyramid Network (FPN) framework, named PointFlow.
Rather than dense affinity learning, a sparse affinity map is generated upon selected points between the adjacent features.
Experimental results on three different aerial segmentation datasets suggest that the proposed method is more effective and efficient than state-of-the-art general semantic segmentation methods.
arXiv Detail & Related papers (2021-03-11T09:42:32Z)
- Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z)
- A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation [0.9786690381850356]
We propose a holistic solution framed as a three-stage self-training framework for semantic segmentation.
The key idea of our technique is the extraction of statistical information from the pseudo-masks.
We then decrease the uncertainty of the pseudo-masks using a multi-task model that enforces consistency.
arXiv Detail & Related papers (2020-12-01T21:00:27Z)
- Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation [71.59275788106622]
We propose to learn the underlying class-agnostic commonalities that can be generalized from mask-annotated categories to novel categories.
Our model significantly outperforms the state-of-the-art methods in both the partially supervised and few-shot settings for instance segmentation on the COCO dataset.
arXiv Detail & Related papers (2020-07-24T07:23:44Z)
- Lookahead Adversarial Learning for Near Real-Time Semantic Segmentation [2.538209532048867]
We build a conditional adversarial network with a state-of-the-art segmentation model (DeepLabv3+) at its core.
We focus on semantic segmentation models that run fast at inference for near real-time field applications.
arXiv Detail & Related papers (2020-06-19T17:04:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.