BriNet: Towards Bridging the Intra-class and Inter-class Gaps in
One-Shot Segmentation
- URL: http://arxiv.org/abs/2008.06226v1
- Date: Fri, 14 Aug 2020 07:45:50 GMT
- Title: BriNet: Towards Bridging the Intra-class and Inter-class Gaps in
One-Shot Segmentation
- Authors: Xianghui Yang, Bairun Wang, Kaige Chen, Xinchi Zhou, Shuai Yi, Wanli
Ouyang, Luping Zhou
- Abstract summary: Few-shot segmentation focuses on the generalization of models to segment unseen object instances with limited training samples.
We propose a framework, BriNet, to bridge the gaps between the extracted features of the query and support images.
Experimental results demonstrate the effectiveness of our framework, which outperforms other competitive methods.
- Score: 84.2925550033094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot segmentation focuses on the generalization of models to segment
unseen object instances with limited training samples. Although tremendous
improvements have been achieved, existing methods are still constrained by two
factors. (1) The information interaction between query and support images is
not adequate, leaving an intra-class gap. (2) The object categories at the
training and inference stages have no overlap, leaving an inter-class gap.
Thus, we propose a framework, BriNet, to bridge these gaps. First, more
information interactions are encouraged between the extracted features of the
query and support images, i.e., using an Information Exchange Module to
emphasize the common objects. Furthermore, to precisely localize the query
objects, we design a multi-path fine-grained strategy which is able to make
better use of the support feature representations. Second, a new online
refinement strategy is proposed to help the trained model adapt to unseen
classes, achieved by switching the roles of the query and the support images at
the inference stage. Experimental results demonstrate the effectiveness of our
framework, which outperforms other competitive methods and achieves a new
state of the art on both the PASCAL VOC and MSCOCO datasets.
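The online refinement idea described above lends itself to a short illustration. The sketch below shows one plausible way to implement role switching at inference time; the model interface, loss, optimizer, and number of refinement steps are illustrative assumptions, not the authors' exact procedure.
```python
import torch
import torch.nn.functional as F

def online_refine(model, support_img, support_mask, query_img,
                  steps=5, lr=1e-4):
    """Hypothetical role-switching refinement at inference time.

    The trained few-shot segmentation `model` is assumed to take
    (support image, support mask, query image) and return query logits.
    We refine it on the fly: predict a mask for the query, then swap the
    roles so the query acts as support, and supervise the prediction on
    the original support image, whose ground-truth mask is known.
    """
    model.train()  # enable gradients for the brief adaptation
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    for _ in range(steps):
        # Forward pass in the usual direction: support -> query.
        query_logits = model(support_img, support_mask, query_img)
        pseudo_query_mask = query_logits.argmax(dim=1, keepdim=True).float()

        # Swap roles: the query (with its pseudo mask) now plays support,
        # and the support image becomes the "query" we can supervise.
        support_logits = model(query_img, pseudo_query_mask, support_img)
        loss = F.cross_entropy(support_logits, support_mask.squeeze(1).long())

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        return model(support_img, support_mask, query_img)
```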
Related papers
- Visual Prompt Selection for In-Context Learning Segmentation [77.15684360470152]
In this paper, we focus on rethinking and improving the example selection strategy.
We first demonstrate that ICL-based segmentation models are sensitive to different contexts.
Furthermore, empirical evidence indicates that the diversity of contextual prompts plays a crucial role in guiding segmentation.
arXiv Detail & Related papers (2024-07-14T15:02:54Z) - IFSENet : Harnessing Sparse Iterations for Interactive Few-shot Segmentation Excellence [2.822194296769473]
Few-shot segmentation techniques reduce the required number of images to learn to segment a new class.
Interactive segmentation techniques, by contrast, focus on incrementally improving the segmentation of one object at a time.
We combine the two concepts to drastically reduce the effort required to train segmentation models for novel classes.
arXiv Detail & Related papers (2024-03-22T10:15:53Z) - Boosting Few-Shot Segmentation via Instance-Aware Data Augmentation and
Local Consensus Guided Cross Attention [7.939095881813804]
Few-shot segmentation aims to train a segmentation model that can fast adapt to a novel task for which only a few annotated images are provided.
We introduce an instance-aware data augmentation (IDA) strategy that augments the support images based on the relative sizes of the target objects.
The proposed IDA effectively increases the support set's diversity and promotes the distribution consistency between support and query images.
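As a rough illustration of size-aware support augmentation, the sketch below rescales a support image depending on how large the target object is relative to the frame; the thresholds, scale factors, and resampling choices are placeholders, not the actual IDA policy.
```python
import torch
import torch.nn.functional as F

def instance_aware_augment(image, mask, small_thresh=0.05, large_thresh=0.4):
    """Illustrative size-aware augmentation of a support image/mask pair.

    `image`: (3, H, W) float tensor, `mask`: (1, H, W) binary tensor.
    The object's area ratio decides whether we zoom in (small objects)
    or zoom out with padding (large objects).
    """
    h, w = mask.shape[-2:]
    area_ratio = mask.float().mean().item()

    if area_ratio < small_thresh:
        scale = 1.5   # enlarge small objects
    elif area_ratio > large_thresh:
        scale = 0.75  # shrink large objects
    else:
        return image, mask

    new_h, new_w = int(h * scale), int(w * scale)
    image = F.interpolate(image[None], size=(new_h, new_w),
                          mode="bilinear", align_corners=False)[0]
    mask = F.interpolate(mask[None].float(), size=(new_h, new_w),
                         mode="nearest")[0]

    # Center-crop or zero-pad back to the original resolution.
    if scale > 1.0:
        top, left = (new_h - h) // 2, (new_w - w) // 2
        image = image[:, top:top + h, left:left + w]
        mask = mask[:, top:top + h, left:left + w]
    else:
        pad_h, pad_w = h - new_h, w - new_w
        pad = (pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2)
        image = F.pad(image, pad)
        mask = F.pad(mask, pad)
    return image, mask
```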
arXiv Detail & Related papers (2024-01-18T10:29:10Z) - Elimination of Non-Novel Segments at Multi-Scale for Few-Shot
Segmentation [0.0]
Few-shot segmentation aims to devise a generalizing model that segments query images from classes unseen during training.
We simultaneously address two vital problems for the first time and achieve state-of-the-art performance on both PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2022-11-04T07:52:54Z) - A Joint Framework Towards Class-aware and Class-agnostic Alignment for
Few-shot Segmentation [11.47479526463185]
Few-shot segmentation aims to segment objects of unseen classes given only a few annotated support images.
Most existing methods simply stitch query features with independent support prototypes and segment the query image by feeding the mixed features to a decoder.
We propose a joint framework that combines more valuable class-aware and class-agnostic alignment guidance to facilitate the segmentation.
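The baseline this entry refers to is the common prototype-stitching pipeline: masked average pooling over support features, broadcasting the prototype to the query, and decoding the concatenated features. A minimal sketch follows; `backbone` and `decoder` are placeholder modules, not any specific implementation.
```python
import torch
import torch.nn.functional as F

def prototype_baseline(backbone, decoder, support_img, support_mask, query_img):
    """Common prototype-stitching baseline in few-shot segmentation.

    A class prototype is obtained by masked average pooling over the
    support features, broadcast to every query location, concatenated
    with the query features, and decoded into a segmentation map.
    Images are (B, 3, H, W) tensors, masks are (B, 1, H, W) binary tensors.
    """
    feat_s = backbone(support_img)                      # (B, C, h, w)
    feat_q = backbone(query_img)                        # (B, C, h, w)
    mask_s = F.interpolate(support_mask.float(), size=feat_s.shape[-2:],
                           mode="nearest")              # (B, 1, h, w)

    # Masked average pooling -> one prototype vector per episode.
    prototype = (feat_s * mask_s).sum(dim=(2, 3)) / (mask_s.sum(dim=(2, 3)) + 1e-6)

    # Broadcast the prototype to every spatial position and stitch it
    # onto the query features before decoding.
    prototype = prototype[:, :, None, None].expand_as(feat_q)
    mixed = torch.cat([feat_q, prototype], dim=1)       # (B, 2C, h, w)
    return decoder(mixed)                               # query logits
```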
arXiv Detail & Related papers (2022-11-02T17:33:25Z) - Meta-DETR: Image-Level Few-Shot Detection with Inter-Class Correlation
Exploitation [100.87407396364137]
We design Meta-DETR, which (i) is the first image-level few-shot detector, and (ii) introduces a novel inter-class correlational meta-learning strategy.
Experiments over multiple few-shot object detection benchmarks show that the proposed Meta-DETR outperforms state-of-the-art methods by large margins.
arXiv Detail & Related papers (2022-07-30T13:46:07Z) - Temporal Saliency Query Network for Efficient Video Recognition [82.52760040577864]
Video recognition is a hot research topic, driven by the explosive growth of multimedia data on the Internet and mobile devices.
Most existing methods select the salient frames without awareness of the class-specific saliency scores.
We propose a novel Temporal Saliency Query (TSQ) mechanism, which introduces class-specific information to provide fine-grained cues for saliency measurement.
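A toy reduction of class-aware frame selection is sketched below: frames are scored against class-specific query embeddings and the top-k are kept. This is only a loose illustration of the idea, not the paper's TSQ mechanism.
```python
import torch

def select_salient_frames(frame_feats, class_queries, top_k=8):
    """Toy class-aware frame selection.

    `frame_feats`: (T, D) per-frame features, `class_queries`: (K, D)
    class-specific query embeddings. Each frame is scored by its best
    match to any class query, and the top-k frames are kept.
    """
    frames = torch.nn.functional.normalize(frame_feats, dim=-1)
    queries = torch.nn.functional.normalize(class_queries, dim=-1)
    saliency = (frames @ queries.t()).max(dim=1).values   # (T,)
    keep = saliency.topk(top_k).indices.sort().values     # preserve temporal order
    return frame_feats[keep], keep
```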
arXiv Detail & Related papers (2022-07-21T09:23:34Z) - OS-MSL: One Stage Multimodal Sequential Link Framework for Scene
Segmentation and Classification [11.707994658605546]
We propose a general One Stage Multimodal Sequential Link Framework (OS-MSL) to distinguish and leverage the two-fold semantics.
We tailor a specific module called DiffCorrNet to explicitly extract the information of differences and correlations among shots.
arXiv Detail & Related papers (2022-07-04T07:59:34Z) - Semantic Representation and Dependency Learning for Multi-Label Image
Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category.
Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide the model.
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
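A toy sketch of the erasing idea is given below: the most attended spatial locations for a category are zeroed out so the remaining context must carry the prediction. The attention source, erase ratio, and thresholding are assumptions for illustration, not the paper's OE module.
```python
import torch

def object_erase(features, attention, erase_ratio=0.2):
    """Toy semantic-region erasing.

    `features`: (B, C, H, W) feature maps, `attention`: (B, 1, H, W)
    spatial attention for one category. The most attended locations are
    zeroed out so that the remaining context must support the prediction,
    implicitly encouraging dependencies among co-occurring categories.
    """
    b, _, h, w = attention.shape
    flat = attention.view(b, -1)
    k = max(1, int(erase_ratio * h * w))
    thresh = flat.topk(k, dim=1).values[:, -1:]         # per-sample cut-off
    keep = (flat < thresh).float().view(b, 1, h, w)     # 0 where erased
    return features * keep
```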
arXiv Detail & Related papers (2022-04-08T00:55:15Z) - Adversarial Bipartite Graph Learning for Video Domain Adaptation [50.68420708387015]
Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area.
Recent works on visual domain adaptation, which leverage adversarial learning to unify the source and target video representations, are not highly effective on videos.
This paper proposes an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions.
arXiv Detail & Related papers (2020-07-31T03:48:41Z)