Few-shot semantic segmentation via mask aggregation
- URL: http://arxiv.org/abs/2202.07231v1
- Date: Tue, 15 Feb 2022 07:13:09 GMT
- Title: Few-shot semantic segmentation via mask aggregation
- Authors: Wei Ao, Shunyi Zheng, Yan Meng
- Abstract summary: Few-shot semantic segmentation aims to recognize novel classes with only very few labelled data.
Previous works have typically regarded it as a pixel-wise classification problem.
We introduce a mask-based classification method for addressing this problem.
- Score: 5.886986014593717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot semantic segmentation aims to recognize novel classes with only very
few labelled data. This challenging task requires mining of the relevant
relationships between the query image and the support images. Previous works
have typically regarded it as a pixel-wise classification problem. Therefore,
various models have been designed to explore the correlation of pixels between
the query image and the support images. However, they focus only on pixel-wise
correspondence and ignore the overall correlation of objects. In this paper, we
introduce a mask-based classification method for addressing this problem. The
mask aggregation network (MANet), which is a simple mask classification model,
is proposed to simultaneously generate a fixed number of masks and their
probabilities of being targets. Then, the final segmentation result is obtained
by aggregating all the masks according to their locations. Experiments on both
the PASCAL-5^i and COCO-20^i datasets show that our method performs comparably
to the state-of-the-art pixel-based methods. This competitive performance
demonstrates the potential of mask classification as an alternative baseline
method in few-shot semantic segmentation. Our source code will be made
available at https://github.com/TinyAway/MANet.
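To make the aggregation idea concrete, here is a minimal sketch, assuming mask logits of shape (N, H, W) and a per-mask target probability vector; the function name, tensor shapes, and threshold are illustrative assumptions, not the authors' MANet implementation.

```python
import torch

def aggregate_masks(mask_logits: torch.Tensor,
                    target_probs: torch.Tensor,
                    threshold: float = 0.5) -> torch.Tensor:
    """Combine a fixed set of predicted masks into one binary segmentation.

    Illustrative sketch only; shapes and threshold are assumptions,
    not the MANet implementation.

    mask_logits:  (N, H, W) logits for N candidate masks
    target_probs: (N,) probability that each mask covers the target class
    """
    masks = mask_logits.sigmoid()                   # per-pixel mask confidence
    # Weight each mask by its probability of being the target, then sum over masks
    weighted = target_probs.view(-1, 1, 1) * masks  # (N, H, W)
    foreground = weighted.sum(dim=0)                # (H, W) aggregated foreground score
    return (foreground > threshold).long()          # final binary prediction


# Toy usage: 4 candidate masks on a 32x32 query image
pred = aggregate_masks(torch.randn(4, 32, 32),
                       torch.softmax(torch.randn(4), dim=0))
print(pred.shape)  # torch.Size([32, 32])
```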
Related papers
- Variance-insensitive and Target-preserving Mask Refinement for
Interactive Image Segmentation [68.16510297109872]
Point-based interactive image segmentation can ease the burden of mask annotation in applications such as semantic segmentation and image editing.
We introduce a novel method, Variance-Insensitive and Target-Preserving Mask Refinement to enhance segmentation quality with fewer user inputs.
Experiments on GrabCut, Berkeley, SBD, and DAVIS datasets demonstrate our method's state-of-the-art performance in interactive image segmentation.
arXiv Detail & Related papers (2023-12-22T02:31:31Z) - MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner
for Open-World Semantic Segmentation [110.09800389100599]
We propose MixReorg, a novel and straightforward pre-training paradigm for semantic segmentation.
Our approach involves generating fine-grained patch-text pairs data by mixing image patches while preserving the correspondence between patches and text.
With MixReorg as a mask learner, conventional text-supervised semantic segmentation models can achieve highly generalizable pixel-semantic alignment ability.
arXiv Detail & Related papers (2023-08-09T09:35:16Z) - High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation [17.804090651425955]
Image-level weakly-supervised segmentation (WSSS) reduces the usually vast data annotation cost by using surrogate segmentation masks during training.
Our work is based on two techniques for improving CAMs: importance sampling, which is a substitute for GAP, and the feature similarity loss.
We reformulate both techniques based on binomial posteriors of multiple independent binary problems.
This has two benefits: their performance is improved and they become more general, resulting in an add-on method that can boost virtually any WSSS method.
arXiv Detail & Related papers (2023-04-05T17:43:57Z) - MaskRange: A Mask-classification Model for Range-view based LiDAR
Segmentation [34.04740351544143]
We propose a unified mask-classification model, MaskRange, for range-view-based LiDAR semantic and panoptic segmentation.
Our MaskRange achieves state-of-the-art performance with 66.10 mIoU on semantic segmentation and promising results with 53.10 PQ on panoptic segmentation, with high efficiency.
arXiv Detail & Related papers (2022-06-24T04:39:49Z) - Discovering Object Masks with Transformers for Unsupervised Semantic
Segmentation [75.00151934315967]
MaskDistill is a novel framework for unsupervised semantic segmentation.
Our framework does not latch onto low-level image cues and is not limited to object-centric datasets.
arXiv Detail & Related papers (2022-06-13T17:59:43Z) - What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z) - Semantic Segmentation In-the-Wild Without Seeing Any Segmentation
Examples [34.97652735163338]
We propose a novel approach for creating semantic segmentation masks for every object.
Our method takes as input the image-level labels of the class categories present in the image.
The output of this stage provides pixel-level pseudo-labels, instead of the manual pixel-level labels required by supervised methods.
arXiv Detail & Related papers (2021-12-06T17:32:38Z) - Masked-attention Mask Transformer for Universal Image Segmentation [180.73009259614494]
We present Masked-attention Mask Transformer (Mask2Former), a new architecture capable of addressing any image segmentation task (panoptic, instance or semantic).
Its key components include masked attention, which extracts localized features by constraining cross-attention within predicted mask regions (a minimal sketch of this mechanism appears after this list).
In addition to reducing the research effort by at least three times, it outperforms the best specialized architectures by a significant margin on four popular datasets.
arXiv Detail & Related papers (2021-12-02T18:59:58Z) - Per-Pixel Classification is Not All You Need for Semantic Segmentation [184.2905747595058]
Mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks.
We propose MaskFormer, a simple mask classification model which predicts a set of binary masks.
Our method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
arXiv Detail & Related papers (2021-07-13T17:59:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.