MIANet: Aggregating Unbiased Instance and General Information for
Few-Shot Semantic Segmentation
- URL: http://arxiv.org/abs/2305.13864v1
- Date: Tue, 23 May 2023 09:36:27 GMT
- Title: MIANet: Aggregating Unbiased Instance and General Information for
Few-Shot Semantic Segmentation
- Authors: Yong Yang and Qiong Chen and Yuan Feng and Tianlin Huang
- Abstract summary: Existing few-shot segmentation methods are based on the meta-learning strategy and extract instance knowledge from a support set.
We propose a multi-information aggregation network (MIANet) that effectively leverages the general knowledge, i.e., semantic word embeddings, and instance information for accurate segmentation.
Experiments on PASCAL-5i and COCO-20i show that MIANet yields superior performance and sets a new state-of-the-art.
- Score: 6.053853367809978
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Existing few-shot segmentation methods are based on the meta-learning
strategy and extract instance knowledge from a support set and then apply the
knowledge to segment target objects in a query set. However, the extracted
knowledge is insufficient to cope with the variable intra-class differences
since the knowledge is obtained from a few samples in the support set. To
address the problem, we propose a multi-information aggregation network
(MIANet) that effectively leverages the general knowledge, i.e., semantic word
embeddings, and instance information for accurate segmentation. Specifically,
in MIANet, a general information module (GIM) is proposed to extract a general
class prototype from word embeddings as a supplement to instance information.
To this end, we design a triplet loss that treats the general class prototype
as an anchor and samples positive-negative pairs from local features in the
support set. The calculated triplet loss can transfer semantic similarities
among language identities from a word embedding space to a visual
representation space. To alleviate the model's bias towards the seen training
classes and to obtain multi-scale information, we then introduce a
non-parametric hierarchical prior module (HPM) to generate unbiased
instance-level information via calculating the pixel-level similarity between
the support and query image features. Finally, an information fusion module
(IFM) combines the general and instance information to make predictions for the
query image. Extensive experiments on PASCAL-5i and COCO-20i show that MIANet
yields superior performance and sets a new state-of-the-art. Code is available
at https://github.com/Aldrich2y/MIANet.
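The abstract describes two mechanisms that can be sketched in code: a non-parametric prior obtained from pixel-level cosine similarity between support and query features (the idea behind the HPM), and a triplet loss that uses the general class prototype, projected from a word embedding, as the anchor. The following is a minimal, hedged sketch of both; the function and variable names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; names and shapes are assumptions, not the
# authors' code (see https://github.com/Aldrich2y/MIANet for the original).
import torch
import torch.nn.functional as F

def similarity_prior(query_feat, supp_feat, supp_mask):
    """Pixel-level cosine-similarity prior between query and masked support.

    query_feat: (B, C, Hq, Wq), supp_feat: (B, C, Hs, Ws),
    supp_mask:  (B, 1, Hs, Ws) binary foreground mask.
    Returns a (B, 1, Hq, Wq) prior map normalized to [0, 1].
    """
    B, C, Hq, Wq = query_feat.shape
    q = F.normalize(query_feat.flatten(2), dim=1)               # (B, C, Hq*Wq)
    s = F.normalize((supp_feat * supp_mask).flatten(2), dim=1)  # (B, C, Hs*Ws)
    sim = torch.bmm(q.transpose(1, 2), s)                       # (B, Hq*Wq, Hs*Ws)
    prior = sim.max(dim=2).values            # best support match per query pixel
    lo = prior.min(1, keepdim=True).values
    hi = prior.max(1, keepdim=True).values
    prior = (prior - lo) / (hi - lo + 1e-7)  # min-max normalize per image
    return prior.view(B, 1, Hq, Wq)

def prototype_triplet_loss(anchor, positives, negatives, margin=0.5):
    """Triplet loss with the general class prototype as the anchor.

    anchor:    (B, C) prototype projected from a semantic word embedding.
    positives: (B, C) e.g. pooled foreground local features from the support.
    negatives: (B, C) e.g. pooled background local features from the support.
    """
    d_pos = F.pairwise_distance(anchor, positives)
    d_neg = F.pairwise_distance(anchor, negatives)
    return F.relu(d_pos - d_neg + margin).mean()
```

Under this reading, the prior map supplies unbiased instance-level localization for unseen classes (no learned parameters), while the triplet loss pulls visual features for a class toward its word-embedding prototype, transferring inter-class semantic similarities into the visual space.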
Related papers
- SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding [56.079013202051094]
We present SegVG, a novel method that transfers box-level annotations into signals providing additional pixel-level supervision for Visual Grounding.
This approach allows us to iteratively exploit the annotation as signals for both box-level regression and pixel-level segmentation.
arXiv Detail & Related papers (2024-07-03T15:30:45Z)
- MatchSeg: Towards Better Segmentation via Reference Image Matching [5.55078598520531]
Few-shot learning aims to overcome the need for annotated data by using a small labeled dataset, known as a support set, to guide predicting labels for new, unlabeled images.
Inspired by this paradigm, we introduce MatchSeg, a novel framework that enhances medical image segmentation through strategic reference image matching.
arXiv Detail & Related papers (2024-03-23T18:04:58Z)
- De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z)
- Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation [119.51445225693382]
Few-shot semantic segmentation aims to segment the target objects in query under the condition of a few annotated support images.
We introduce an intermediate prototype for mining both deterministic category information from the support and adaptive category knowledge from the query.
In each IPMT layer, we propagate the object information in both support and query features to the prototype and then use it to activate the query feature map.
arXiv Detail & Related papers (2022-10-13T06:45:07Z)
- Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
- Robust 3D Scene Segmentation through Hierarchical and Learnable Part-Fusion [9.275156524109438]
3D semantic segmentation is a fundamental building block for several scene understanding applications such as autonomous driving, robotics and AR/VR.
Previous methods have utilized hierarchical, iterative methods to fuse semantic and instance information, but they lack learnability in context fusion.
This paper presents Segment-Fusion, a novel attention-based method for hierarchical fusion of semantic and instance information.
arXiv Detail & Related papers (2021-11-16T13:14:47Z)
- Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
- Boosting Few-shot Semantic Segmentation with Transformers [81.43459055197435]
We propose a TRansformer-based Few-shot Semantic segmentation method (TRFS).
Our model consists of two modules: a Global Enhancement Module (GEM) and a Local Enhancement Module (LEM).
arXiv Detail & Related papers (2021-08-04T20:09:21Z)
- Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes [17.217954254022573]
We introduce a meta-learning-based method for few-shot 3D shape segmentation where only a few labeled samples are provided for the unseen classes.
We demonstrate the superior performance of our proposed method on the ShapeNet part dataset under the few-shot scenario, compared with well-established baseline and state-of-the-art semi-supervised methods.
arXiv Detail & Related papers (2021-07-07T01:47:00Z)
- Remote Sensing Images Semantic Segmentation with General Remote Sensing Vision Model via a Self-Supervised Contrastive Learning Method [13.479068312825781]
We propose Global style and Local matching Contrastive Learning Network (GLCNet) for remote sensing semantic segmentation.
Specifically, the global style contrastive module is used to learn a better image-level representation.
The local features matching contrastive module is designed to learn representations of local regions, which is beneficial for semantic segmentation.
arXiv Detail & Related papers (2021-06-20T03:03:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.