Hierarchical Dense Correlation Distillation for Few-Shot
Segmentation-Extended Abstract
- URL: http://arxiv.org/abs/2306.15278v1
- Date: Tue, 27 Jun 2023 08:10:20 GMT
- Title: Hierarchical Dense Correlation Distillation for Few-Shot
Segmentation-Extended Abstract
- Authors: Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu,
Jingyong Su, Jiaya Jia
- Abstract summary: Few-shot semantic segmentation (FSS) aims to build class-agnostic models that segment unseen classes from only a handful of annotations.
We design the Hierarchically Decoupled Matching Network (HDMNet), which mines pixel-level support correlation using a transformer architecture.
We propose a matching module to reduce train-set overfitting and introduce correlation distillation, which leverages semantic correspondence from coarse resolution to boost fine-grained segmentation.
- Score: 47.85056124410376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot semantic segmentation (FSS) aims to build class-agnostic
models that segment unseen classes from only a handful of annotations. Previous
methods, limited to semantic features and prototype representations, suffer
from coarse segmentation granularity and train-set overfitting. In this work,
we design the Hierarchically Decoupled Matching Network (HDMNet), which mines
pixel-level support correlation using a transformer architecture. Self-attention
modules help establish hierarchical dense features, which enable cascade
matching between query and support features. Moreover, we propose a matching
module to reduce train-set overfitting and introduce correlation distillation,
which leverages semantic correspondence from coarse resolution to boost
fine-grained segmentation. In experiments, our method achieves 50.0% mIoU on
the COCO dataset in the one-shot setting and 56.0% in the five-shot setting.
The code will be available on the project website. We hope our work can benefit
broader industrial applications in which novel classes must be identified
reliably from limited annotations.
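The two ideas named in the abstract, pixel-level (dense) correlation between query and support features and distillation of a coarse-resolution correlation into a finer one, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Cosine similarity as the correlation, averaging over support foreground pixels, nearest-neighbor upsampling, and a KL-divergence loss are all assumptions made here, and every function name is hypothetical.

```python
import numpy as np

def dense_correlation(query, support):
    """Cosine similarity between every query pixel and every support pixel.

    query: (Nq, C) features, support: (Ns, C) features. Returns (Nq, Ns).
    """
    q = query / (np.linalg.norm(query, axis=1, keepdims=True) + 1e-8)
    s = support / (np.linalg.norm(support, axis=1, keepdims=True) + 1e-8)
    return q @ s.T

def foreground_score(corr, support_mask):
    """Average each query pixel's correlation over support foreground pixels."""
    mask = support_mask.astype(corr.dtype)
    return (corr * mask[None, :]).sum(axis=1) / max(mask.sum(), 1.0)

def spatial_softmax(score, temperature=1.0):
    """Softmax over query positions (numerically stabilized)."""
    z = score / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def upsample_nearest(score, side, factor):
    """Nearest-neighbor upsampling of a flattened (side x side) score map."""
    grid = score.reshape(side, side)
    up = np.repeat(np.repeat(grid, factor, axis=0), factor, axis=1)
    return up.reshape(-1)

def distillation_loss(student_score, teacher_score, temperature=1.0):
    """KL(teacher || student) between spatial distributions of the two maps."""
    p = spatial_softmax(teacher_score, temperature)
    q = spatial_softmax(student_score, temperature)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

# Toy example with random features (assumed sizes): a 4x4 fine stage
# and a 2x2 coarse stage, each with its own support mask.
rng = np.random.default_rng(0)
channels = 8
fine_q, fine_s = rng.normal(size=(16, channels)), rng.normal(size=(16, channels))
coarse_q, coarse_s = rng.normal(size=(4, channels)), rng.normal(size=(4, channels))
fine_mask, coarse_mask = rng.random(16) > 0.5, rng.random(4) > 0.5

student = foreground_score(dense_correlation(fine_q, fine_s), fine_mask)
teacher = foreground_score(dense_correlation(coarse_q, coarse_s), coarse_mask)
teacher_up = upsample_nearest(teacher, side=2, factor=2)  # 2x2 -> 4x4

print("distillation loss:", distillation_loss(student, teacher_up))
```

In this sketch the coarse map acts as the teacher and the fine map as the student, so minimizing the loss pulls the fine-resolution correlation toward the semantic layout found at coarse resolution. In a real training setup the teacher would typically be treated as a constant when computing gradients.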
Related papers
- Boosting Few-Shot Segmentation via Instance-Aware Data Augmentation and
Local Consensus Guided Cross Attention [7.939095881813804]
Few-shot segmentation aims to train a segmentation model that can quickly adapt to a novel task for which only a few annotated images are provided.
We introduce an instance-aware data augmentation (IDA) strategy that augments the support images based on the relative sizes of the target objects.
The proposed IDA effectively increases the support set's diversity and promotes the distribution consistency between support and query images.
arXiv Detail & Related papers (2024-01-18T10:29:10Z)
- Hierarchical Dense Correlation Distillation for Few-Shot Segmentation [46.696051965252934]
Few-shot semantic segmentation (FSS) aims to build class-agnostic models that segment unseen classes from only a handful of annotations.
We design the Hierarchically Decoupled Matching Network (HDMNet), which mines pixel-level support correlation using a transformer architecture.
We propose a matching module to reduce train-set overfitting and introduce correlation distillation, which leverages semantic correspondence from coarse resolution to boost fine-grained segmentation.
arXiv Detail & Related papers (2023-03-26T08:13:12Z)
- Contrastive Enhancement Using Latent Prototype for Few-Shot Segmentation [8.986743262828009]
Few-shot segmentation enables the model to recognize unseen classes with few annotated examples.
This paper proposes a contrastive enhancement approach using latent prototypes to leverage latent classes.
Our approach remarkably improves the performance of state-of-the-art methods for 1-shot and 5-shot segmentation.
arXiv Detail & Related papers (2022-03-08T14:02:32Z)
- Cost Aggregation Is All You Need for Few-Shot Segmentation [28.23753949369226]
We introduce Volumetric Aggregation with Transformers (VAT) to tackle the few-shot segmentation task.
VAT uses both convolutions and transformers to efficiently handle high dimensional correlation maps between query and support.
We find that the proposed method attains state-of-the-art performance even on the standard benchmarks for the semantic correspondence task.
arXiv Detail & Related papers (2021-12-22T06:18:51Z)
- SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive
Background Prototypes [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples.
Most advanced solutions exploit a metric learning framework that performs segmentation by matching each pixel to a learned foreground prototype.
This framework suffers from biased classification due to incomplete construction of sample pairs with the foreground prototype only.
arXiv Detail & Related papers (2021-04-19T11:21:47Z)
- Deep Gaussian Processes for Few-Shot Segmentation [66.08463078545306]
Few-shot segmentation is a challenging task, requiring the extraction of a generalizable representation from only a few annotated samples.
We propose a few-shot learner formulation based on Gaussian process (GP) regression.
Our approach sets a new state-of-the-art for 5-shot segmentation, with mIoU scores of 68.1 and 49.8 on PASCAL-5i and COCO-20i, respectively.
arXiv Detail & Related papers (2021-03-30T17:56:32Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- Self-Supervised Tuning for Few-Shot Segmentation [82.32143982269892]
Few-shot segmentation aims to assign a category label to each image pixel using only a few annotated samples.
Existing meta-learning methods tend to fail to generate category-specific discriminative descriptors when the visual features extracted from support images are marginalized in the embedding space.
This paper presents an adaptive tuning framework in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme.
arXiv Detail & Related papers (2020-04-12T03:53:53Z)