Prototype as Query for Few Shot Semantic Segmentation
- URL: http://arxiv.org/abs/2211.14764v1
- Date: Sun, 27 Nov 2022 08:41:50 GMT
- Title: Prototype as Query for Few Shot Semantic Segmentation
- Authors: Leilei Cao, Yibo Guo, Ye Yuan and Qiangguo Jin
- Abstract summary: Few-shot Semantic Segmentation (FSS) was proposed to segment unseen classes in a query image, referring to only a few annotated examples named support images.
We propose ProtoFormer, a Transformer-based framework that fully captures spatial details in query features.
- Score: 7.380266341356485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot Semantic Segmentation (FSS) was proposed to segment unseen classes
in a query image, referring to only a few annotated examples named support
images. One of the characteristics of FSS is spatial inconsistency between
query and support targets, e.g., texture or appearance. This greatly challenges
the generalization ability of FSS methods, which must effectively exploit
the dependency between the query image and the support examples. Most
existing methods abstracted support features into prototype vectors and
implemented the interaction with query features using cosine similarity or
feature concatenation. However, this simple interaction may not capture spatial
details in query features. To alleviate this limitation, a few methods utilized
all pixel-wise support information via computing the pixel-wise correlations
between paired query and support features implemented with the attention
mechanism of Transformer. These approaches suffer from heavy computation on the
dot-product attention between all pixels of support and query features. In this
paper, we propose a simple yet effective Transformer-based framework, termed
ProtoFormer, to fully capture spatial details in query features. It
views the abstracted prototype of the target class in support features as Query
and the query features as Key and Value embeddings, which are input to the
Transformer decoder. In this way, spatial details are better captured and the
model can focus on the semantic features of the target class in the query image.
The output of the Transformer-based module can be viewed as semantic-aware
dynamic kernels to filter out the segmentation mask from the enriched query
features. Extensive experiments on PASCAL-$5^{i}$ and COCO-$20^{i}$ show that
our ProtoFormer significantly advances the state of the art.
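The prototype-as-Query idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single attention head, flattened feature maps, masked average pooling as the way the prototype is abstracted, and it omits the learned projections, residual connections, normalization, and feed-forward layers of a real Transformer decoder. All function names are hypothetical.

```python
import numpy as np


def masked_average_pooling(support_feats, support_mask):
    """Collapse support features (N, C) into one prototype (C,).

    Assumption: the prototype is the mean of the support features that
    fall inside the binary support mask (N,).
    """
    weights = support_mask.astype(np.float64)
    total = max(weights.sum(), 1e-8)  # avoid division by zero for empty masks
    return (support_feats * weights[:, None]).sum(axis=0) / total


def prototype_cross_attention(prototype, query_feats):
    """Single-head attention with the prototype (C,) as the Query and the
    query-image features (N, C) as both Key and Value.

    Returns the attended vector (C,), used here as a dynamic kernel,
    and the attention weights (N,).
    """
    channels = prototype.shape[0]
    scores = query_feats @ prototype / np.sqrt(channels)  # (N,)
    scores = scores - scores.max()  # numerically stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum()
    kernel = attn @ query_feats  # weighted sum of Values, shape (C,)
    return kernel, attn


def predict_mask(kernel, query_feats, height, width):
    """Apply the dynamic kernel to every query pixel, like a 1x1 convolution.

    A logit above 0 corresponds to a sigmoid probability above 0.5.
    """
    logits = query_feats @ kernel  # (N,)
    return (logits > 0).reshape(height, width)


# Toy example: 2-channel features; foreground pixels point along +x,
# background pixels along -x.
support_feats = np.array([[3.0, 0.0], [3.0, 0.0], [-3.0, 0.0], [-3.0, 0.0]])
support_mask = np.array([1, 1, 0, 0])
query_feats = np.array([[3.0, 0.0], [-3.0, 0.0], [-3.0, 0.0], [3.0, 0.0]])

prototype = masked_average_pooling(support_feats, support_mask)
kernel, attn = prototype_cross_attention(prototype, query_feats)
mask = predict_mask(kernel, query_feats, height=2, width=2)
print(mask)
```

Because only one prototype vector attends to the N query pixels, the attention map in this sketch has N entries, rather than one entry per pair of support and query pixels as in the pixel-wise correlation approaches the abstract contrasts against.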
Related papers
- Holistic Prototype Attention Network for Few-Shot VOS [74.25124421163542]
Few-shot video object segmentation (FSVOS) aims to segment dynamic objects of unseen classes by resorting to a small set of support images.
We propose a holistic prototype attention network (HPAN) for advancing FSVOS.
arXiv Detail & Related papers (2023-07-16T03:48:57Z)
- Few-shot Medical Image Segmentation via Cross-Reference Transformer [3.2634122554914]
Few-shot segmentation (FSS) has the potential to address these challenges by learning new categories from a small number of labeled samples.
We propose a novel self-supervised few-shot medical image segmentation network with Cross-Reference Transformer.
Experimental results show that the proposed model achieves good results on both CT dataset and MRI dataset.
arXiv Detail & Related papers (2023-04-19T13:05:18Z)
- Breaking Immutable: Information-Coupled Prototype Elaboration for Few-Shot Object Detection [15.079980293820137]
We propose an Information-Coupled Prototype Elaboration (ICPE) method to generate specific and representative prototypes for each query image.
Our method achieves state-of-the-art performance in almost all settings.
arXiv Detail & Related papers (2022-11-27T10:33:11Z)
- Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation [119.51445225693382]
Few-shot semantic segmentation aims to segment the target objects in query under the condition of a few annotated support images.
We introduce an intermediate prototype for mining both deterministic category information from the support and adaptive category knowledge from the query.
In each IPMT layer, we propagate the object information in both support and query features to the prototype and then use it to activate the query feature map.
arXiv Detail & Related papers (2022-10-13T06:45:07Z)
- Self-Support Few-Shot Semantic Segmentation [72.43667576285445]
We propose a novel self-support matching strategy, which uses query prototypes to match query features.
We also propose an adaptive self-support background prototype generation module and self-support loss to further facilitate the self-support matching procedure.
Our self-support network substantially improves the prototype quality, gains more from stronger backbones and more support images, and achieves SOTA on multiple datasets.
arXiv Detail & Related papers (2022-07-23T16:28:07Z)
- APANet: Adaptive Prototypes Alignment Network for Few-Shot Semantic Segmentation [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a given query image with only a few labeled support images.
Most advanced solutions exploit a metric learning framework that performs segmentation through matching each query feature to a learned class-specific prototype.
We present an adaptive prototype representation by introducing class-specific and class-agnostic prototypes.
arXiv Detail & Related papers (2021-11-24T04:38:37Z)
- Few-shot Segmentation with Optimal Transport Matching and Message Flow [50.9853556696858]
It is essential for few-shot semantic segmentation to fully utilize the support information.
We propose a Correspondence Matching Network (CMNet) with an Optimal Transport Matching module.
Experiments on PASCAL VOC 2012, MS COCO, and FSS-1000 datasets show that our network achieves new state-of-the-art few-shot segmentation performance.
arXiv Detail & Related papers (2021-08-19T06:26:11Z)
- Few-Shot Segmentation via Cycle-Consistent Transformer [74.49307213431952]
We focus on utilizing pixel-wise relationships between support and target images to facilitate the few-shot semantic segmentation task.
We propose using a novel cycle-consistent attention mechanism to filter out possible harmful support features.
Our proposed CyCTR leads to remarkable improvement compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-06-04T07:57:48Z)
- SimPropNet: Improved Similarity Propagation for Few-shot Image Segmentation [14.419517737536706]
Recent deep neural network based FSS methods leverage high-dimensional feature similarity between the foreground features of the support images and the query image features.
We propose to jointly predict the support and query masks to force the support features to share characteristics with the query features.
Our method achieves state-of-the-art results for one-shot and five-shot segmentation on the PASCAL-5i dataset.
arXiv Detail & Related papers (2020-04-30T17:56:48Z)