Mask Matching Transformer for Few-Shot Segmentation
- URL: http://arxiv.org/abs/2301.01208v1
- Date: Mon, 5 Dec 2022 11:00:32 GMT
- Title: Mask Matching Transformer for Few-Shot Segmentation
- Authors: Siyu Jiao, Gengwei Zhang, Shant Navasardyan, Ling Chen, Yao Zhao,
Yunchao Wei, Humphrey Shi
- Abstract summary: Mask Matching Transformer (MM-Former) is a new paradigm for the few-shot segmentation task.
First, the MM-Former follows the paradigm of decompose first and then blend, allowing our method to benefit from an advanced potential-object segmenter.
We conduct extensive experiments on the popular COCO-$20^i$ and Pascal-$5^i$ benchmarks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we aim to tackle the challenging few-shot segmentation
task from a new perspective. Typical methods follow a paradigm that first learns
prototypical features from support images and then matches query features at the
pixel level to obtain segmentation results. However, to obtain satisfactory
segments, such a paradigm must couple the learning of the matching operations
with heavy segmentation modules, limiting the flexibility of the design and
increasing the learning complexity. To alleviate this issue, we propose the Mask
Matching Transformer (MM-Former), a new paradigm for the few-shot segmentation
task. Specifically, the MM-Former first uses a class-agnostic segmenter to
decompose the query image into multiple segment proposals. Then, a simple
matching mechanism is applied to merge the related segment proposals into the
final mask, guided by the support images. The advantages of our MM-Former are
two-fold. First, the MM-Former follows the paradigm of decompose first and then
blend, allowing our method to benefit from an advanced potential-object
segmenter that produces high-quality mask proposals for query images. Second,
the role of prototypical features is relaxed to learning coefficients that fuse
the correct proposals within a proposal pool, allowing the MM-Former to
generalize well to complex scenarios. We conduct extensive experiments on the
popular COCO-$20^i$ and Pascal-$5^i$ benchmarks. Competitive results
demonstrate the effectiveness and generalization ability of our MM-Former.
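The decompose-first-then-blend pipeline described in the abstract can be sketched as follows. This is a minimal illustration with hypothetical tensor shapes and a plain cosine-similarity matcher, not the authors' actual implementation; the names `masked_average` and `match_and_blend` and the top-k union rule are invented for this sketch:

```python
import numpy as np

def masked_average(feat, mask):
    """Average-pool features (H, W, D) over a binary mask (H, W) into a prototype (D,)."""
    w = mask[..., None].astype(feat.dtype)
    return (feat * w).sum(axis=(0, 1)) / np.maximum(w.sum(), 1e-6)

def match_and_blend(proposal_masks, query_feat, support_feat, support_mask, top_k=3):
    """Sketch of decompose-first-then-blend:
    1) a class-agnostic segmenter has already produced `proposal_masks` (N, H, W);
    2) a support prototype selects and fuses the related proposals into one mask."""
    prototype = masked_average(support_feat, support_mask)                     # (D,)
    # Embed each proposal by pooling query features under its mask.
    props = np.stack([masked_average(query_feat, m) for m in proposal_masks])  # (N, D)
    # Cosine similarity between each proposal embedding and the support prototype.
    sims = props @ prototype / (
        np.linalg.norm(props, axis=1) * np.linalg.norm(prototype) + 1e-6)
    # Blend: the union of the top-k most similar proposals forms the final mask.
    keep = np.argsort(sims)[-top_k:]
    return proposal_masks[keep].any(axis=0)
```

Note that the matcher only learns (here: computes) selection coefficients over a proposal pool; the heavy mask-prediction work stays inside the upstream segmenter, which is the flexibility the abstract argues for.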
Related papers
- MaskUno: Switch-Split Block For Enhancing Instance Segmentation
We propose replacing mask prediction with a Switch-Split block that processes refined ROIs, classifies them, and assigns them to specialized mask predictors.
An increase in the mean Average Precision (mAP) of 2.03% was observed for the high-performing DetectoRS when trained on 80 classes.
arXiv Detail & Related papers (2024-07-31T10:12:14Z)
- Synthetic Instance Segmentation from Semantic Image Segmentation Masks
We propose a novel paradigm called Synthetic Instance Segmentation (SISeg).
SISeg obtains instance segmentation results by leveraging image masks generated by existing semantic segmentation models.
In other words, the proposed model needs no extra manpower or higher computational expense.
arXiv Detail & Related papers (2023-08-02T05:13:02Z)
- Boosting Few-shot Semantic Segmentation with Transformers
We propose a TRansformer-based Few-shot Semantic segmentation method (TRFS).
Our model consists of two modules: a Global Enhancement Module (GEM) and a Local Enhancement Module (LEM).
arXiv Detail & Related papers (2021-08-04T20:09:21Z)
- Segmenter: Transformer for Semantic Segmentation
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on-par on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive Background Prototypes
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples.
Most advanced solutions exploit a metric-learning framework that performs segmentation by matching each pixel to a learned foreground prototype.
This framework suffers from biased classification due to the incomplete construction of sample pairs, which use the foreground prototype only.
arXiv Detail & Related papers (2021-04-19T11:21:47Z)
- Deep Gaussian Processes for Few-Shot Segmentation
Few-shot segmentation is a challenging task, requiring the extraction of a generalizable representation from only a few annotated samples.
We propose a few-shot learner formulation based on Gaussian process (GP) regression.
Our approach sets a new state-of-the-art for 5-shot segmentation, with mIoU scores of 68.1 and 49.8 on PASCAL-5i and COCO-20i, respectively.
arXiv Detail & Related papers (2021-03-30T17:56:32Z)
- Proposal-Free Volumetric Instance Segmentation from Latent Single-Instance Masks
This work introduces a new proposal-free instance segmentation method.
It builds on single-instance segmentation masks predicted across the entire image in a sliding window style.
In contrast to related approaches, our method concurrently predicts all masks, one for each pixel, and thus resolves any conflict jointly across the entire image.
arXiv Detail & Related papers (2020-09-10T17:09:23Z)
- Prototype Mixture Models for Few-shot Semantic Segmentation
Few-shot segmentation is challenging because objects within the support and query images could significantly differ in appearance and pose.
We propose prototype mixture models (PMMs), which correlate diverse image regions with multiple prototypes to enforce the prototype-based semantic representation.
PMMs improve 5-shot segmentation performance on MS-COCO by up to 5.82% with only a moderate cost for model size and inference speed.
arXiv Detail & Related papers (2020-08-10T04:33:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.