Dynamic Prototype Convolution Network for Few-Shot Semantic Segmentation
- URL: http://arxiv.org/abs/2204.10638v1
- Date: Fri, 22 Apr 2022 11:12:37 GMT
- Title: Dynamic Prototype Convolution Network for Few-Shot Semantic Segmentation
- Authors: Jie Liu, Yanqi Bao, Guo-Sen Xie, Huan Xiong, Jan-Jakob Sonke,
  Efstratios Gavves
- Abstract summary: The key challenge for few-shot semantic segmentation (FSS) is how to tailor a desirable interaction among support and query features.
We propose a dynamic prototype convolution network (DPCN) to fully capture the intrinsic details for accurate FSS.
Our DPCN is also flexible and efficient under the k-shot FSS setting.
- Score: 33.93192093090601
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   The key challenge for few-shot semantic segmentation (FSS) is how to tailor a
desirable interaction among support and query features and/or their prototypes,
under the episodic training scenario. Most existing FSS methods implement such
support-query interactions by solely leveraging plain operations - e.g., cosine
similarity and feature concatenation - for segmenting the query objects.
However, these interaction approaches usually fail to capture the intrinsic
object details in the query images that are widely encountered in FSS, e.g., if
the query object to be segmented has holes and slots, inaccurate segmentation
almost always happens. To this end, we propose a dynamic prototype convolution
network (DPCN) to fully capture the aforementioned intrinsic details for
accurate FSS. Specifically, in DPCN, a dynamic convolution module (DCM) is
firstly proposed to generate dynamic kernels from support foreground, then
information interaction is achieved by convolution operations over query
features using these kernels. Moreover, we equip DPCN with a support activation
module (SAM) and a feature filtering module (FFM) to generate pseudo mask and
filter out background information for the query images, respectively. SAM and
FFM together can mine enriched context information from the query features. Our
DPCN is also flexible and efficient under the k-shot FSS setting. Extensive
experiments on PASCAL-5i and COCO-20i show that DPCN yields superior
performances under both 1-shot and 5-shot settings.
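
The contrast drawn in the abstract, between a plain support-query interaction (cosine similarity) and kernels generated from the support foreground, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the grid pooling used to form the kernels, the depthwise convolution, and the fixed odd kernel size k are all assumptions made for illustration, and the actual DCM design may differ.

```python
import numpy as np

def masked_average_prototype(support_feat, support_mask):
    """Average the support features (C, H, W) over the foreground mask (H, W)."""
    mask = support_mask.astype(support_feat.dtype)
    fg_sum = (support_feat * mask[None]).sum(axis=(1, 2))
    return fg_sum / (mask.sum() + 1e-8)

def cosine_prior(query_feat, prototype):
    """Plain interaction: cosine similarity between each query location and the prototype."""
    q = query_feat / (np.linalg.norm(query_feat, axis=0, keepdims=True) + 1e-8)
    p = prototype / (np.linalg.norm(prototype) + 1e-8)
    return np.einsum("chw,c->hw", q, p)

def dynamic_kernels(support_feat, support_mask, k=3):
    """Hypothetical kernel generator: pool the masked support foreground
    onto a k x k grid, giving one k x k kernel per channel (C, k, k).
    Assumes H >= k and W >= k so every grid cell is non-empty."""
    C, H, W = support_feat.shape
    fg = support_feat * support_mask[None]
    kernels = np.zeros((C, k, k))
    rows = np.array_split(np.arange(H), k)
    cols = np.array_split(np.arange(W), k)
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            kernels[:, i, j] = fg[:, r][:, :, c].mean(axis=(1, 2))
    return kernels

def depthwise_conv(query_feat, kernels):
    """Zero-padded, same-size depthwise cross-correlation (assumes odd k)."""
    C, H, W = query_feat.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(query_feat, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(query_feat, dtype=float)
    for i in range(k):
        for j in range(k):
            out += kernels[:, i, j][:, None, None] * padded[:, i:i + H, j:j + W]
    return out

# Toy episode: 4-channel features on a 6 x 6 grid, one support image.
rng = np.random.default_rng(0)
support = rng.normal(size=(4, 6, 6))
query = rng.normal(size=(4, 6, 6))
mask = np.zeros((6, 6))
mask[1:4, 2:5] = 1  # support foreground region

prior = cosine_prior(query, masked_average_prototype(support, mask))
interacted = depthwise_conv(query, dynamic_kernels(support, mask, k=3))
```

In this sketch the cosine prior collapses the support foreground into a single vector, whereas the generated kernels keep a coarse spatial layout of the foreground, which is one way to read the abstract's motivation. Under a k-shot setting, one simple assumption would be to average the kernels generated from each support image before convolving.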
 
      
Related papers
- CMaP-SAM: Contraction Mapping Prior for SAM-driven Few-shot Segmentation [21.466035540502226]
 Few-shot segmentation (FSS) aims to segment new classes using few annotated images.
Recent FSS methods have shown considerable improvements by leveraging the Segment Anything Model (SAM).
We propose CMaP-SAM, a novel framework that introduces contraction mapping theory to optimize position priors for SAM-driven FSS.
 arXiv  Detail & Related papers  (2025-04-07T13:19:16Z)
- Dense Affinity Matching for Few-Shot Segmentation [83.65203917246745]
 Few-shot segmentation (FSS) aims to segment novel-class images with a few samples.
We propose a dense affinity matching framework to exploit the support-query interaction.
We show that our framework performs very competitively under different settings with only 0.68M parameters.
 arXiv  Detail & Related papers  (2023-07-17T12:27:15Z)
- Holistic Prototype Attention Network for Few-Shot VOS [74.25124421163542]
 Few-shot video object segmentation (FSVOS) aims to segment dynamic objects of unseen classes by resorting to a small set of support images.
We propose a holistic prototype attention network (HPAN) for advancing FSVOS.
 arXiv  Detail & Related papers  (2023-07-16T03:48:57Z)
- Few-shot Semantic Segmentation with Support-induced Graph Convolutional
  Network [28.46908214462594]
 Few-shot semantic segmentation (FSS) aims to segment novel objects with only a few annotated samples.
We propose a Support-induced Graph Convolutional Network (SiGCN) to explicitly excavate latent context structure in query images.
 arXiv  Detail & Related papers  (2023-01-09T08:00:01Z)
- Prototype as Query for Few Shot Semantic Segmentation [7.380266341356485]
 Few-shot semantic segmentation (FSS) was proposed to segment unseen classes in a query image, referring to only a few examples named support images.
We propose a Transformer-based framework, termed ProtoFormer, to fully capture spatial details in query features.
 arXiv  Detail & Related papers  (2022-11-27T08:41:50Z)
- Progressively Dual Prior Guided Few-shot Semantic Segmentation [57.37506990980975]
 The few-shot semantic segmentation task aims at performing segmentation in query images with a few annotated support samples.
We propose a progressively dual prior guided few-shot semantic segmentation network.
 arXiv  Detail & Related papers  (2022-11-20T16:19:47Z)
- Dynamic Focus-aware Positional Queries for Semantic Segmentation [94.6834904076914]
 We propose a simple yet effective query design for semantic segmentation termed Dynamic Focus-aware Positional Queries.
Our framework achieves SOTA performance and outperforms Mask2former by clear margins of 1.1%, 1.9%, and 1.1% single-scale mIoU with ResNet-50, Swin-T, and Swin-B backbones.
 arXiv  Detail & Related papers  (2022-04-04T05:16:41Z)
- Boosting Few-shot Semantic Segmentation with Transformers [81.43459055197435]
 This work presents a TRansformer-based Few-shot Semantic segmentation method (TRFS).
Our model consists of two modules: a Global Enhancement Module (GEM) and a Local Enhancement Module (LEM).
 arXiv  Detail & Related papers  (2021-08-04T20:09:21Z)
- Deep feature selection-and-fusion for RGB-D semantic segmentation [8.831857715361624]
 This work proposes a unified and efficient feature selection-and-fusion network (FSFNet).
FSFNet contains a symmetric cross-modality residual fusion module used for explicit fusion of multi-modality information.
Compared with the state-of-the-art methods, experimental evaluations demonstrate that the proposed model achieves competitive performance on two public datasets.
 arXiv  Detail & Related papers  (2021-05-10T04:02:32Z)
- Searching Central Difference Convolutional Networks for Face
  Anti-Spoofing [68.77468465774267]
 Face anti-spoofing (FAS) plays a vital role in face recognition systems.
Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks.
Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
 arXiv  Detail & Related papers  (2020-03-09T12:48:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     