Few-Shot Segmentation via Cycle-Consistent Transformer
- URL: http://arxiv.org/abs/2106.02320v1
- Date: Fri, 4 Jun 2021 07:57:48 GMT
- Title: Few-Shot Segmentation via Cycle-Consistent Transformer
- Authors: Gengwei Zhang, Guoliang Kang, Yunchao Wei, Yi Yang
- Abstract summary: We focus on utilizing pixel-wise relationships between support and target images to facilitate the few-shot semantic segmentation task.
We propose a novel cycle-consistent attention mechanism to filter out potentially harmful support features.
Our proposed CyCTR leads to remarkable improvement compared to previous state-of-the-art methods.
- Score: 74.49307213431952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot segmentation aims to train a segmentation model that can quickly
adapt to novel classes from only a few exemplars. The conventional training
paradigm is to learn to make predictions on query images conditioned on the
features from support images. Previous methods utilized only the semantic-level
prototypes of support images as the conditional information, and therefore could
not exploit all pixel-wise support information for query prediction, which is
critical for the segmentation task. In this paper, we focus on utilizing
pixel-wise relationships between support and target images to facilitate the
few-shot semantic segmentation task. We design a novel Cycle-Consistent
Transformer (CyCTR) module to aggregate pixel-wise support features into query
ones. CyCTR performs cross-attention between features from different images,
i.e., support and query images. We observe that some pixel-level support
features may be irrelevant to the query; directly performing cross-attention
would aggregate such features from support to query and bias the query features.
We therefore propose a novel cycle-consistent attention mechanism to filter out
potentially harmful support features and encourage query features to attend to
the most informative pixels of the support images. Experiments on standard few-shot
segmentation benchmarks demonstrate that our proposed CyCTR leads to remarkable
improvement compared to previous state-of-the-art methods. Specifically, on
Pascal-$5^i$ and COCO-$20^i$ datasets, we achieve 66.6% and 45.6% mIoU for
5-shot segmentation, outperforming previous state-of-the-art by 4.6% and 7.1%
respectively.
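The cycle-consistency filter described in the abstract can be sketched in plain NumPy: for each support pixel, follow the affinity to its most-attended query pixel and back to the support set, and keep the support pixel only if the round trip lands on a pixel with the same foreground/background label. This is a minimal reading of the idea under stated assumptions; the function names, the argmax-based cycle, and the additive masking are illustrative, not the authors' implementation.

```python
import numpy as np

def cycle_consistent_mask(q_feat, s_feat, s_label):
    """Sketch of a cycle-consistency filter over support tokens.

    q_feat:  (Nq, d) query pixel features
    s_feat:  (Ns, d) support pixel features
    s_label: (Ns,)   binary fg/bg label of each support pixel
    Returns a boolean keep-mask over support tokens.
    """
    A = q_feat @ s_feat.T              # (Nq, Ns) affinity
    i_star = A.argmax(axis=0)          # for each support token, its most-attended query token
    j_star = A[i_star].argmax(axis=1)  # for that query token, its most-attended support token
    # Keep support token j only if the cycle lands on a token with the same label.
    return s_label[j_star] == s_label

def cyc_attention(q_feat, s_feat, s_value, s_label):
    """Cross-attention from query to support with inconsistent tokens suppressed."""
    keep = cycle_consistent_mask(q_feat, s_feat, s_label)
    logits = q_feat @ s_feat.T / np.sqrt(q_feat.shape[1])
    logits[:, ~keep] = -1e9            # mask out cycle-inconsistent support tokens
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ s_value              # aggregated support features per query pixel
```

In the paper the same filtering is realized inside a Transformer block; the sketch above isolates only the masking logic.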
Related papers
- Boosting Few-Shot Segmentation via Instance-Aware Data Augmentation and Local Consensus Guided Cross Attention [7.939095881813804]
Few-shot segmentation aims to train a segmentation model that can fast adapt to a novel task for which only a few annotated images are provided.
We introduce an instance-aware data augmentation (IDA) strategy that augments the support images based on the relative sizes of the target objects.
The proposed IDA effectively increases the support set's diversity and promotes the distribution consistency between support and query images.
arXiv Detail & Related papers (2024-01-18T10:29:10Z)
- Few-shot Medical Image Segmentation via Cross-Reference Transformer [3.2634122554914]
Few-shot segmentation (FSS) has the potential to address such challenges by learning new categories from a small number of labeled samples.
We propose a novel self-supervised few-shot medical image segmentation network with a Cross-Reference Transformer.
Experimental results show that the proposed model achieves good results on both CT and MRI datasets.
arXiv Detail & Related papers (2023-04-19T13:05:18Z)
- Prototype as Query for Few Shot Semantic Segmentation [7.380266341356485]
Few-shot Semantic Segmentation (FSS) was proposed to segment unseen classes in a query image, referring to only a few examples, named support images.
We propose a Transformer-based framework, termed ProtoFormer, to fully capture spatial details in query features.
arXiv Detail & Related papers (2022-11-27T08:41:50Z)
- Dense Gaussian Processes for Few-Shot Segmentation [66.08463078545306]
We propose a few-shot segmentation method based on dense Gaussian process (GP) regression.
We exploit the end-to-end learning capabilities of our approach to learn a high-dimensional output space for the GP.
Our approach sets a new state-of-the-art for both 1-shot and 5-shot FSS on the PASCAL-$5^i$ and COCO-$20^i$ benchmarks.
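The regression step that such a GP-based approach relies on can be sketched with plain NumPy: treat support pixel features as GP inputs, their mask values as targets, and read off the posterior mean at query pixel features. The kernel choice, `gamma`, and the noise term below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Squared-exponential kernel between row vectors of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def gp_mask_regression(s_feat, s_mask, q_feat, noise=1e-2):
    """GP posterior mean: regress mask values at query pixels
    from (support feature, support mask) pairs."""
    K_ss = rbf_kernel(s_feat, s_feat)
    K_qs = rbf_kernel(q_feat, s_feat)
    alpha = np.linalg.solve(K_ss + noise * np.eye(len(s_feat)), s_mask)
    return K_qs @ alpha  # (Nq,) predicted mask score per query pixel
```

A query pixel whose feature matches a foreground support pixel receives a score near 1, and one matching a background pixel a score near 0.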
arXiv Detail & Related papers (2021-10-07T17:57:54Z)
- Few-shot Segmentation with Optimal Transport Matching and Message Flow [50.9853556696858]
It is essential for few-shot semantic segmentation to fully utilize the support information.
We propose a Correspondence Matching Network (CMNet) with an Optimal Transport Matching module.
Experiments on PASCAL VOC 2012, MS COCO, and FSS-1000 datasets show that our network achieves new state-of-the-art few-shot segmentation performance.
arXiv Detail & Related papers (2021-08-19T06:26:11Z)
- Few-Shot Segmentation with Global and Local Contrastive Learning [51.677179037590356]
We propose a prior extractor that learns query information from unlabeled images via global-local contrastive learning.
We generate prior region maps that locate the objects in query images and use them as guidance for cross-interaction with support features.
Without bells and whistles, the proposed approach achieves new state-of-the-art performance for the few-shot segmentation task.
arXiv Detail & Related papers (2021-08-11T15:52:22Z)
- Learning Meta-class Memory for Few-Shot Semantic Segmentation [90.28474742651422]
We introduce the concept of meta-class, which is the meta information shareable among all classes.
We propose a novel Meta-class Memory based few-shot segmentation method (MM-Net), where we introduce a set of learnable memory embeddings.
Our proposed MM-Net achieves 37.5% mIoU on the COCO dataset in 1-shot setting, which is 5.1% higher than the previous state-of-the-art.
arXiv Detail & Related papers (2021-08-06T06:29:59Z)
- SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive Background Prototypes [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples.
Most advanced solutions exploit a metric-learning framework that performs segmentation by matching each pixel to a learned foreground prototype.
This framework suffers from biased classification because sample pairs are constructed with the foreground prototype only.
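The prototype-matching baseline that this line of work critiques is straightforward to sketch: masked average pooling of support features under the foreground mask, then per-pixel cosine similarity on the query. The function names and the thresholding step are illustrative, not any particular paper's code.

```python
import numpy as np

def foreground_prototype(s_feat, s_mask):
    """Masked average pooling: mean support feature under the fg mask.

    s_feat: (Ns, d) support pixel features; s_mask: (Ns,) binary fg mask.
    """
    w = s_mask.astype(float)
    return (s_feat * w[:, None]).sum(axis=0) / (w.sum() + 1e-8)

def prototype_scores(q_feat, proto):
    """Cosine similarity of each query pixel feature to the prototype.

    Thresholding these (Nq,) scores yields a foreground segmentation.
    """
    qn = q_feat / (np.linalg.norm(q_feat, axis=1, keepdims=True) + 1e-8)
    pn = proto / (np.linalg.norm(proto) + 1e-8)
    return qn @ pn
```

Because only a single foreground vector is used, background query pixels are never matched against a background reference, which is exactly the bias SCNet's self-contrastive background prototypes address.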
arXiv Detail & Related papers (2021-04-19T11:21:47Z)
- Self-Guided and Cross-Guided Learning for Few-Shot Segmentation [12.899804391102435]
We propose a self-guided learning approach for few-shot segmentation.
By making an initial prediction for the annotated support image, the covered and uncovered foreground regions are encoded into primary and auxiliary support vectors.
Aggregating both primary and auxiliary support vectors yields better segmentation performance on query images.
arXiv Detail & Related papers (2021-03-30T07:36:41Z)
- SimPropNet: Improved Similarity Propagation for Few-shot Image Segmentation [14.419517737536706]
Recent deep-neural-network-based FSS methods leverage the high-dimensional feature similarity between the foreground features of the support images and the query image features.
We propose to jointly predict the support and query masks to force the support features to share characteristics with the query features.
Our method achieves state-of-the-art results for one-shot and five-shot segmentation on the PASCAL-$5^i$ dataset.
arXiv Detail & Related papers (2020-04-30T17:56:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.