Self-Calibrated Cross Attention Network for Few-Shot Segmentation
- URL: http://arxiv.org/abs/2308.09294v1
- Date: Fri, 18 Aug 2023 04:41:50 GMT
- Title: Self-Calibrated Cross Attention Network for Few-Shot Segmentation
- Authors: Qianxiong Xu, Wenting Zhao, Guosheng Lin, Cheng Long
- Abstract summary: We design a self-calibrated cross attention (SCCA) block for efficient patch-based attention.
SCCA groups the patches from the same query image and the aligned patches from the support image as K&V.
In this way, the query BG features are fused with matched BG features from the query patches, and thus the mismatch and entanglement issues are mitigated.
- Score: 65.20559109791756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The key to the success of few-shot segmentation (FSS) lies in how to
effectively utilize support samples. Most solutions compress support foreground
(FG) features into prototypes, but lose some spatial details. Instead, others
use cross attention to fuse query features with uncompressed support FG. Query
FG can be fused with support FG; however, query background (BG) cannot find
matched BG features in support FG and thus inevitably integrates dissimilar
features. Besides, as both query FG and BG are combined with support FG, they
get entangled, thereby leading to ineffective segmentation. To cope with these
issues, we design a self-calibrated cross attention (SCCA) block. For efficient
patch-based attention, query and support features are first split into
patches. Then, we design a patch alignment module to align each query patch
with its most similar support patch for better cross attention. Specifically,
SCCA takes a query patch as Q, and groups the patches from the same query image
and the aligned patches from the support image as K&V. In this way, the query
BG features are fused with matched BG features (from query patches), and thus
the aforementioned issues will be mitigated. Moreover, when calculating SCCA,
we design a scaled-cosine mechanism to better utilize the support features for
similarity calculation. Extensive experiments conducted on PASCAL-5^i and
COCO-20^i demonstrate the superiority of our model; e.g., the mIoU score under
the 5-shot setting on COCO-20^i is more than 5.6% higher than previous state-of-the-art methods.
The code is available at https://github.com/Sam1224/SCCAN.
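The abstract outlines the SCCA pipeline: features are split into patches, each query patch is aligned with its most similar support patch, and cross attention then takes the query patch as Q and the concatenated query and aligned support patches as K&V, using scaled-cosine similarity. Below is a minimal PyTorch sketch of that idea; the tensor layout, the average-pooled patch descriptors used for alignment, and the scale `tau` are assumptions for illustration, not the authors' implementation (see the linked repository for the real code).
```python
# Minimal sketch of the SCCA idea described in the abstract (not the authors' code).
# Shapes, helper names, and the exact scaled-cosine form are assumptions.
import torch
import torch.nn.functional as F


def split_into_patches(feat, patch):
    """Split a (B, C, H, W) feature map into non-overlapping patch x patch patches.

    Returns tokens of shape (B, N, L, C): N patches, each with L = patch*patch positions.
    """
    B, C, H, W = feat.shape
    feat = feat.unfold(2, patch, patch).unfold(3, patch, patch)      # (B, C, H/p, W/p, p, p)
    return feat.permute(0, 2, 3, 4, 5, 1).reshape(B, -1, patch * patch, C)


def align_support_patches(q_patches, s_patches):
    """Patch alignment: pair every query patch with its most similar support patch.

    Similarity is the cosine between average-pooled patch descriptors (an assumption).
    """
    q_desc = F.normalize(q_patches.mean(dim=2), dim=-1)               # (B, Nq, C)
    s_desc = F.normalize(s_patches.mean(dim=2), dim=-1)               # (B, Ns, C)
    idx = torch.einsum('bqc,bsc->bqs', q_desc, s_desc).argmax(dim=-1)  # (B, Nq)
    idx = idx[..., None, None].expand(-1, -1, *s_patches.shape[2:])
    return torch.gather(s_patches, 1, idx)                            # (B, Nq, L, C)


def scca_block(q_patches, s_aligned, tau=20.0):
    """Self-calibrated cross attention over each patch.

    Q: tokens of a query patch; K/V: the same query tokens concatenated with the
    aligned support tokens, so query BG can attend to query BG instead of support FG.
    Attention logits use scaled cosine similarity (tau is a hypothetical scale).
    """
    kv = torch.cat([q_patches, s_aligned], dim=2)                     # (B, N, 2L, C)
    logits = tau * torch.einsum('bnlc,bnmc->bnlm',
                                F.normalize(q_patches, dim=-1),
                                F.normalize(kv, dim=-1))
    return torch.einsum('bnlm,bnmc->bnlc', logits.softmax(dim=-1), kv)


# Toy usage with random features standing in for backbone outputs.
q_feat, s_feat = torch.randn(2, 256, 32, 32), torch.randn(2, 256, 32, 32)
qp, sp = split_into_patches(q_feat, 8), split_into_patches(s_feat, 8)
out = scca_block(qp, align_support_patches(qp, sp))
print(out.shape)  # torch.Size([2, 16, 64, 256])
```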
Related papers
- Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation [33.204232825380394]
Category-agnostic pose estimation aims to locate keypoints on query images according to a few annotated support images for arbitrary novel classes.
We propose a novel yet concise framework, which recurrently mines FGSA features from both support and query images.
arXiv Detail & Related papers (2025-03-27T04:09:13Z)
- Hybrid Mamba for Few-Shot Segmentation [54.562050590453225]
Few-shot segmentation (FSS) methods use cross attention to fuse support foreground (FG) into query features, despite its quadratic complexity.
We aim to devise a cross (attention-like) Mamba to capture inter-sequence dependencies for FSS.
A simple idea is to scan on support features to selectively compress them into the hidden state, which is then used as the initial hidden state to sequentially scan query features.
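The scan idea described above can be illustrated with a toy recurrence. The Python sketch below uses a simple gated linear recurrence as a stand-in for the selective state-space (Mamba) scan; all tensor shapes, names, and the gating form are assumptions for illustration, not the paper's model.
```python
# Toy illustration: compress support tokens into a hidden state, then reuse that
# state to initialize a scan over query tokens. A gated linear recurrence stands
# in for the selective state-space (Mamba) scan; everything here is assumed.
import torch


def gated_scan(tokens, h):
    """Sequentially fold tokens (B, T, C) into hidden state h (B, C)."""
    outs = []
    for t in range(tokens.size(1)):
        x = tokens[:, t]                                  # current token (B, C)
        g = torch.sigmoid(x.mean(dim=-1, keepdim=True))   # content-dependent gate (toy)
        h = g * h + (1.0 - g) * x                         # selective update of the state
        outs.append(h)
    return torch.stack(outs, dim=1), h


B, C = 2, 256
support_tokens = torch.randn(B, 100, C)    # flattened support FG features (assumed)
query_tokens = torch.randn(B, 1024, C)     # flattened query features (assumed)

_, h_support = gated_scan(support_tokens, torch.zeros(B, C))  # support compressed into h
query_out, _ = gated_scan(query_tokens, h_support)            # query scan starts from that state
print(query_out.shape)  # torch.Size([2, 1024, 256])
```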
arXiv Detail & Related papers (2024-09-29T08:51:14Z)
- Eliminating Feature Ambiguity for Few-Shot Segmentation [95.9916573435427]
Recent advancements in few-shot segmentation (FSS) have exploited pixel-by-pixel matching between query and support features.
This paper presents a novel plug-in termed ambiguity elimination network (AENet), which can be plugged into any existing cross attention-based FSS methods.
arXiv Detail & Related papers (2024-07-13T10:33:03Z)
- Dense Affinity Matching for Few-Shot Segmentation [83.65203917246745]
Few-Shot Segmentation (FSS) aims to segment novel-class images with only a few samples.
We propose a dense affinity matching framework to exploit the support-query interaction.
We show that our framework performs very competitively under different settings with only 0.68M parameters.
arXiv Detail & Related papers (2023-07-17T12:27:15Z)
- Enhancing Few-shot Image Classification with Cosine Transformer [4.511561231517167]
The Few-shot Cosine Transformer (FS-CT) computes a relational map between support and query samples.
Our method achieves competitive results on mini-ImageNet, CUB-200, and CIFAR-FS for both 1-shot and 5-shot learning tasks.
FS-CT with cosine attention is a lightweight, simple few-shot algorithm that can be applied to a wide range of applications.
arXiv Detail & Related papers (2022-11-13T06:03:28Z)
- Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation [25.605580031284052]
Few-shot Semantic Segmentation (FSS) has attracted great attention.
The goal of FSS is to segment target objects in a query image given only a few annotated support images of the target class.
We propose Dense pixel-wise Cross-query-and-support Attention weighted Mask Aggregation (DCAMA), where both foreground and background support information are fully exploited.
arXiv Detail & Related papers (2022-07-18T12:12:42Z)
- Dense Gaussian Processes for Few-Shot Segmentation [66.08463078545306]
We propose a few-shot segmentation method based on dense Gaussian process (GP) regression.
We exploit the end-to-end learning capabilities of our approach to learn a high-dimensional output space for the GP.
Our approach sets a new state-of-the-art for both 1-shot and 5-shot FSS on the PASCAL-5$^i$ and COCO-20$^i$ benchmarks.
arXiv Detail & Related papers (2021-10-07T17:57:54Z)
- Few-Shot Segmentation via Cycle-Consistent Transformer [74.49307213431952]
We focus on utilizing pixel-wise relationships between support and target images to facilitate the few-shot semantic segmentation task.
We propose using a novel cycle-consistent attention mechanism to filter out possible harmful support features.
Our proposed CyCTR leads to remarkable improvement compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-06-04T07:57:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.