Boosting Few-shot Semantic Segmentation with Transformers
- URL: http://arxiv.org/abs/2108.02266v1
- Date: Wed, 4 Aug 2021 20:09:21 GMT
- Title: Boosting Few-shot Semantic Segmentation with Transformers
- Authors: Guolei Sun, Yun Liu, Jingyun Liang, Luc Van Gool
- Abstract summary: TRansformer-based Few-shot Semantic segmentation method (TRFS)
Our model consists of two modules: Global Enhancement Module (GEM) and Local Enhancement Module (LEM)
- Score: 81.43459055197435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Because fully supervised semantic segmentation methods require
sufficient fully-labeled data and cannot generalize to unseen classes,
few-shot segmentation has attracted considerable research attention.
Previous works extract features from support and query images and process
them jointly before making predictions on the query images. The whole
pipeline is built on convolutional neural networks (CNNs), so only local
information is exploited. In this paper, we propose a TRansformer-based
Few-shot Semantic segmentation method (TRFS). Specifically, our model consists
of two modules: Global Enhancement Module (GEM) and Local Enhancement Module
(LEM). GEM adopts transformer blocks to exploit global information, while LEM
utilizes conventional convolutions to exploit local information, across query
and support features. GEM and LEM are complementary, helping to learn
better feature representations for segmenting query images. Extensive
experiments on PASCAL-5i and COCO datasets show that our approach achieves new
state-of-the-art performance, demonstrating its effectiveness.
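Below is a minimal PyTorch-style sketch of how such a pair of modules could be combined, assuming the support features are condensed into a prototype via masked average pooling and the outputs of GEM and LEM are fused by simple addition; the layer sizes and the fusion scheme are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: a transformer branch (global context) and a
# convolutional branch (local context) enhance the same fused feature map.
import torch
import torch.nn as nn


class GlobalEnhancementModule(nn.Module):
    """Transformer blocks over flattened features (global information)."""

    def __init__(self, dim=256, num_layers=2, num_heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C)
        tokens = self.encoder(tokens)
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class LocalEnhancementModule(nn.Module):
    """Plain convolutions over the same features (local information)."""

    def __init__(self, dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.conv(x)


class FewShotHead(nn.Module):
    """Fuses query features with a support prototype, then applies GEM + LEM."""

    def __init__(self, dim=256, num_classes=2):
        super().__init__()
        self.gem = GlobalEnhancementModule(dim)
        self.lem = LocalEnhancementModule(dim)
        self.classifier = nn.Conv2d(dim, num_classes, 1)

    def forward(self, query_feat, support_feat, support_mask):
        # Masked average pooling of the support features -> class prototype.
        mask = nn.functional.interpolate(support_mask,
                                         size=support_feat.shape[-2:],
                                         mode='nearest')
        proto = (support_feat * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)
        fused = query_feat + proto[:, :, None, None]    # broadcast the prototype
        enhanced = self.gem(fused) + self.lem(fused)    # complementary branches
        return self.classifier(enhanced)


# Usage with random tensors (1-shot setting, features from a shared backbone):
q, s = torch.randn(2, 256, 32, 32), torch.randn(2, 256, 32, 32)
m = torch.randint(0, 2, (2, 1, 32, 32)).float()
logits = FewShotHead()(q, s, m)                         # (2, 2, 32, 32)
```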
Related papers
- MIANet: Aggregating Unbiased Instance and General Information for
Few-Shot Semantic Segmentation [6.053853367809978]
Existing few-shot segmentation methods are based on the meta-learning strategy and extract instance knowledge from a support set.
We propose a multi-information aggregation network (MIANet) that effectively leverages the general knowledge, i.e., semantic word embeddings, and instance information for accurate segmentation.
Experiments on PASCAL-5i and COCO-20i show that MIANet yields superior performance and sets a new state of the art.
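The aggregation idea can be pictured as projecting a class word embedding (general knowledge) into the visual feature space and fusing it with a support-derived instance prototype; the projection layers, dimensions, and cosine scoring in this sketch are illustrative assumptions, not MIANet's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WordInstanceAggregator(nn.Module):
    """Fuses a semantic word embedding with an instance prototype (hypothetical)."""

    def __init__(self, word_dim=300, feat_dim=256):
        super().__init__()
        self.word_proj = nn.Linear(word_dim, feat_dim)  # map the word vector into feature space
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, word_embedding, instance_prototype):
        # word_embedding: (B, word_dim), instance_prototype: (B, feat_dim)
        general = self.word_proj(word_embedding)
        return self.fuse(torch.cat([general, instance_prototype], dim=1))


# The fused prototype can then be correlated with query features, e.g. by cosine similarity.
agg = WordInstanceAggregator()
proto = agg(torch.randn(2, 300), torch.randn(2, 256))   # (2, 256) fused prototype
query = torch.randn(2, 256, 32, 32)
score = F.cosine_similarity(query, proto[:, :, None, None].expand_as(query), dim=1)  # (2, 32, 32)
```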
arXiv Detail & Related papers (2023-05-23T09:36:27Z)
- LoG-CAN: local-global Class-aware Network for semantic segmentation of remote sensing images [4.124381172041927]
We present LoG-CAN, a multi-scale semantic segmentation network with a global class-aware (GCA) module and local class-aware (LCA) modules for remote sensing images.
Specifically, the GCA module captures the global representations of class-wise context modeling to circumvent background interference; the LCA modules generate local class representations as intermediate aware elements, indirectly associating pixels with global class representations to reduce variance within a class.
arXiv Detail & Related papers (2023-03-14T09:44:29Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- CRCNet: Few-shot Segmentation with Cross-Reference and Region-Global Conditional Networks [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
We propose the Cross-Reference and Local-Global Network (CRCNet) for few-shot segmentation.
Our network can better find the objects that co-occur in the two images via a cross-reference mechanism.
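One common way to realize such a cross-reference mechanism is to derive channel gates from both images and keep only the channels that are active in both; the sketch below follows that generic pattern under assumed shapes and is not CRCNet's exact formulation.

```python
import torch
import torch.nn as nn


class CrossReference(nn.Module):
    """Emphasizes feature channels that fire in BOTH images (hypothetical sketch)."""

    def __init__(self, dim=256):
        super().__init__()
        self.gate_a = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.gate_b = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, feat_a, feat_b):              # (B, C, H, W) each
        ga = self.gate_a(feat_a.mean(dim=(2, 3)))   # channel importance from image A
        gb = self.gate_b(feat_b.mean(dim=(2, 3)))   # channel importance from image B
        common = (ga * gb)[:, :, None, None]        # large only where both agree
        return feat_a * common, feat_b * common


cr = CrossReference()
ref_a, ref_b = cr(torch.randn(2, 256, 32, 32), torch.randn(2, 256, 32, 32))
```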
arXiv Detail & Related papers (2022-08-23T06:46:18Z)
- Instance Segmentation of Unlabeled Modalities via Cyclic Segmentation GAN [27.936725483892076]
We propose a novel Cyclic Segmentation Generative Adversarial Network (CySGAN) that conducts image translation and instance segmentation jointly.
We benchmark our approach on the task of 3D neuronal nuclei segmentation with annotated electron microscopy (EM) images and unlabeled expansion microscopy (ExM) data.
arXiv Detail & Related papers (2022-04-06T20:46:39Z)
- Few-shot Segmentation with Optimal Transport Matching and Message Flow [50.9853556696858]
It is essential for few-shot semantic segmentation to fully utilize the support information.
We propose a Correspondence Matching Network (CMNet) with an Optimal Transport Matching module.
Experiments on PASCAL VOC 2012, MS COCO, and FSS-1000 datasets show that our network achieves new state-of-the-art few-shot segmentation performance.
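The optimal transport matching idea can be sketched with a standard Sinkhorn iteration between support and query feature vectors; the cosine cost, regularization strength, and feature shapes below are assumptions for illustration, not CMNet's published settings.

```python
import torch
import torch.nn.functional as F


def sinkhorn(cost, eps=0.05, num_iters=50):
    """Entropy-regularized optimal transport with uniform marginals; returns the plan."""
    n, m = cost.shape
    a, b = torch.full((n,), 1.0 / n), torch.full((m,), 1.0 / m)
    K = torch.exp(-cost / eps)                      # Gibbs kernel
    v = torch.ones(m)
    for _ in range(num_iters):
        u = a / (K @ v + 1e-8)
        v = b / (K.t() @ u + 1e-8)
    return u[:, None] * K * v[None, :]              # (n, m) transport plan


# Soft correspondences between masked support features and query locations.
support = F.normalize(torch.randn(120, 256), dim=1)     # assumed support foreground pixels
query = F.normalize(torch.randn(32 * 32, 256), dim=1)   # flattened query feature map
plan = sinkhorn(1.0 - support @ query.t())              # cost = cosine distance
attn = plan.sum(dim=0).reshape(32, 32)                  # mass received per query pixel
```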
arXiv Detail & Related papers (2021-08-19T06:26:11Z)
- Remote Sensing Images Semantic Segmentation with General Remote Sensing Vision Model via a Self-Supervised Contrastive Learning Method [13.479068312825781]
We propose Global style and Local matching Contrastive Learning Network (GLCNet) for remote sensing semantic segmentation.
Specifically, the global style contrastive learning module is used to better learn an image-level representation.
The local features matching contrastive module is designed to learn representations of local regions, which is beneficial for semantic segmentation.
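Both modules build on a contrastive objective; as a rough reference, the standard InfoNCE loss that such global and local contrastive modules typically instantiate looks as follows (GLCNet's actual style and matching losses are more specific than this sketch).

```python
import torch
import torch.nn.functional as F


def info_nce(z1, z2, temperature=0.1):
    """Standard InfoNCE loss between two batches of paired embeddings.

    z1, z2: (B, D) projections of two augmented views; positives lie on the
    diagonal of the similarity matrix, every other pair acts as a negative.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # (B, B) cosine similarities
    targets = torch.arange(z1.size(0))      # positive index for each row
    return F.cross_entropy(logits, targets)


loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```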
arXiv Detail & Related papers (2021-06-20T03:03:40Z)
- Referring Image Segmentation via Cross-Modal Progressive Comprehension [94.70482302324704]
Referring image segmentation aims at segmenting the foreground masks of the entities that can well match the description given in the natural language expression.
Previous approaches tackle this problem using implicit feature interaction and fusion between visual and linguistic modalities.
We propose a Cross-Modal Progressive Comprehension (CMPC) module and a Text-Guided Feature Exchange (TGFE) module to effectively address this challenging task.
arXiv Detail & Related papers (2020-10-01T16:02:30Z)
- Prototype Mixture Models for Few-shot Semantic Segmentation [50.866870384596446]
Few-shot segmentation is challenging because objects within the support and query images could significantly differ in appearance and pose.
We propose prototype mixture models (PMMs), which correlate diverse image regions with multiple prototypes to enforce the prototype-based semantic representation.
PMMs improve 5-shot segmentation performance on MS-COCO by up to 5.82% with only a moderate cost for model size and inference speed.
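The idea of representing one class with several prototypes can be sketched with a few k-means iterations over masked support features followed by per-prototype similarity maps; this is a simplified stand-in for the EM-based mixture estimation in PMMs, and the shapes and iteration counts are assumptions.

```python
import torch
import torch.nn.functional as F


def multiple_prototypes(feats, num_prototypes=3, num_iters=10):
    """Clusters support foreground vectors into several prototypes via k-means.

    feats: (N, C) foreground feature vectors from the support image.
    Returns: (num_prototypes, C) prototype vectors.
    """
    protos = feats[torch.randperm(feats.size(0))[:num_prototypes]].clone()  # random init
    for _ in range(num_iters):
        assign = torch.cdist(feats, protos).argmin(dim=1)   # hard E-step
        for k in range(num_prototypes):
            members = feats[assign == k]
            if members.shape[0] > 0:
                protos[k] = members.mean(dim=0)              # M-step
    return protos


# Score query pixels against every prototype and keep the best match per pixel.
support_fg = torch.randn(200, 256)                           # assumed masked support features
query = torch.randn(256, 32, 32)                             # query feature map (C, H, W)
protos = F.normalize(multiple_prototypes(support_fg), dim=1) # (K, C)
qn = F.normalize(query.flatten(1).t(), dim=1)                # (H*W, C) unit-norm query vectors
activation = (qn @ protos.t()).max(dim=1).values.reshape(32, 32)  # best prototype per pixel
```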
arXiv Detail & Related papers (2020-08-10T04:33:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.