Feature-Proxy Transformer for Few-Shot Segmentation
- URL: http://arxiv.org/abs/2210.06908v1
- Date: Thu, 13 Oct 2022 11:22:27 GMT
- Title: Feature-Proxy Transformer for Few-Shot Segmentation
- Authors: Jian-Wei Zhang, Yifan Sun, Yi Yang, Wei Chen
- Abstract summary: Few-shot segmentation (FSS) aims at performing semantic segmentation on novel classes given a few annotated support samples.
This paper proposes a novel Feature-Proxy Transformer (FPTrans) method, in which the "proxy" is the vector representing a semantic class in the linear classification head.
Although the framework is straightforward, we show that FPTrans achieves competitive FSS accuracy on par with state-of-the-art decoder-based methods.
- Score: 35.85575258482071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Few-shot segmentation (FSS) aims at performing semantic segmentation on novel
classes given a few annotated support samples. With a rethink of recent
advances, we find that the current FSS framework has deviated far from the
supervised segmentation framework: Given the deep features, FSS methods
typically use an intricate decoder to perform sophisticated pixel-wise
matching, while the supervised segmentation methods use a simple linear
classification head. Due to the intricacy of the decoder and its matching
pipeline, it is not easy to follow such an FSS framework. This paper revives
the straightforward framework of "feature extractor + linear classification
head" and proposes a novel Feature-Proxy Transformer (FPTrans) method, in which
the "proxy" is the vector representing a semantic class in the linear
classification head. FPTrans has two key points for learning discriminative
features and representative proxies: 1) To better utilize the limited support
samples, the feature extractor makes the query interact with the support
features from the bottom to top layers using a novel prompting strategy. 2)
FPTrans uses multiple local background proxies (instead of a single one)
because the background is not homogeneous and may contain some novel foreground
regions. These two key points are easily integrated into the vision transformer
backbone with the prompting mechanism in the transformer. Given the learned
features and proxies, FPTrans directly compares their cosine similarity for
segmentation. Although the framework is straightforward, we show that FPTrans
achieves competitive FSS accuracy on par with state-of-the-art decoder-based
methods.
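
The final comparison step described in the abstract (cosine similarity between features and proxies, with several background proxies instead of one) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each proxy is simply the mean of the support features in its region, it takes the background partition as given, and it omits the prompting-based feature extractor entirely. The function names and the label convention (0 for foreground, 1..K for local background regions) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_proxies(support_feats, support_labels):
    """Average the support features of each region to get one proxy per region.

    Assumed convention: label 0 is foreground, labels 1..K are local
    background regions (the partition itself is taken as given here).
    """
    groups = {}
    for feat, label in zip(support_feats, support_labels):
        groups.setdefault(label, []).append(feat)
    return {label: mean_vector(feats) for label, feats in groups.items()}


def segment(query_feats, proxies):
    """Label a query pixel 1 if the foreground proxy is its most similar proxy, else 0."""
    mask = []
    for feat in query_feats:
        best = max(proxies, key=lambda label: cosine(feat, proxies[label]))
        mask.append(1 if best == 0 else 0)
    return mask


# Toy 2-D features: two foreground pixels and two distinct background regions.
support_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
support_labels = [0, 0, 1, 2]
proxies = build_proxies(support_feats, support_labels)

query_feats = [[0.8, 0.2], [0.1, 0.9], [-0.7, 0.1]]
mask = segment(query_feats, proxies)
print(mask)
```

In this toy setup the two background regions point in very different directions, so a single averaged background proxy would represent neither well; keeping one proxy per region is the motivation the abstract gives for using multiple local background proxies.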

Related papers
- CMFDFormer: Transformer-based Copy-Move Forgery Detection with Continual Learning [52.72888626663642]
 Copy-move forgery detection aims at detecting duplicated regions in a suspected forged image.
Deep learning based copy-move forgery detection methods are in the ascendant.
We propose a Transformer-style copy-move forgery network named CMFDFormer.
We also provide a novel PCSD continual learning framework to help CMFDFormer handle new tasks.
 arXiv (2023-11-22T09:27:46Z)
- Retro-FPN: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation [65.78483246139888]
 We propose Retro-FPN to model the per-point feature prediction as an explicit and retrospective refining process.
Its key novelty is a retro-transformer for summarizing semantic contexts from the previous layer.
We show that Retro-FPN can significantly improve performance over state-of-the-art backbones.
 arXiv (2023-08-18T05:28:25Z)
- SegT: A Novel Separated Edge-guidance Transformer Network for Polyp Segmentation [10.144870911523622]
 We propose a novel separated edge-guidance transformer (SegT) network that aims to build an effective polyp segmentation model.
A transformer encoder that learns a more robust representation than existing CNN-based approaches was specifically applied.
To evaluate the effectiveness of SegT, we conducted experiments with five challenging public datasets.
 arXiv (2023-06-19T08:32:05Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
 We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
 arXiv (2022-05-26T17:00:23Z)
- Beyond the Prototype: Divide-and-conquer Proxies for Few-shot Segmentation [63.910211095033596]
 Few-shot segmentation aims to segment unseen-class objects given only a handful of densely labeled samples.
We propose a simple yet versatile framework in the spirit of divide-and-conquer.
Our proposed approach, named divide-and-conquer proxies (DCP), allows for the development of appropriate and reliable information.
 arXiv (2022-04-21T06:21:14Z)
- TraSeTR: Track-to-Segment Transformer with Contrastive Query for Instance-level Instrument Segmentation in Robotic Surgery [60.439434751619736]
 We propose TraSeTR, a Track-to-Segment Transformer that exploits tracking cues to assist surgical instrument segmentation.
TraSeTR jointly reasons about the instrument type, location, and identity with instance-level predictions.
The effectiveness of our method is demonstrated with state-of-the-art instrument type segmentation results on three public datasets.
 arXiv (2022-02-17T05:52:18Z)
- Query2Label: A Simple Transformer Way to Multi-Label Classification [37.206922180245265]
 This paper presents a simple and effective approach to solving the multi-label classification problem.
The proposed approach leverages Transformer decoders to query the existence of a class label.
Compared with prior works, the new framework is simple, using standard Transformers and vision backbones, and effective.
 arXiv (2021-07-22T17:49:25Z)
- TrTr: Visual Tracking with Transformer [29.415900191169587]
We propose a novel tracker network based on the Transformer encoder-decoder architecture, a powerful attention mechanism.
We design the classification and regression heads using the output of the Transformer to localize the target based on a shape-agnostic anchor.
Our method performs favorably against state-of-the-art algorithms.
 arXiv (2021-05-09T02:32:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.