Progressively Dual Prior Guided Few-shot Semantic Segmentation
- URL: http://arxiv.org/abs/2211.15467v1
- Date: Sun, 20 Nov 2022 16:19:47 GMT
- Title: Progressively Dual Prior Guided Few-shot Semantic Segmentation
- Authors: Qinglong Cao, Yuntian Chen, Xiwen Yao, Junwei Han
- Abstract summary: Few-shot semantic segmentation task aims at performing segmentation in query images with a few annotated support samples.
We propose a progressively dual prior guided few-shot semantic segmentation network.
- Score: 57.37506990980975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot semantic segmentation task aims at performing segmentation in query
images with a few annotated support samples. Currently, few-shot segmentation
methods mainly focus on leveraging foreground information without fully
utilizing the rich background information, which can cause wrong activation
of foreground-like background regions and poor adaptation to dramatic scene
changes between support-query image pairs. Meanwhile, the lack of a
detail-mining mechanism can yield coarse parsing results that miss semantic
components or edge areas, since prototypes have limited ability to cope with
large variance in object appearance. To tackle these problems, we propose a
progressively dual prior guided few-shot semantic segmentation network.
Specifically, a dual prior mask generation (DPMG) module is first designed to
suppress wrong activation in a foreground-background comparison manner by
treating the background as auxiliary refinement information. With the dual
prior masks refining the location of the foreground area, we further propose a progressive
semantic detail enrichment (PSDE) module which forces the parsing model to
capture hidden semantic details by iteratively erasing the high-confidence
foreground region and activating details in the remaining region with a
hierarchical structure. The collaboration of DPMG and PSDE forms a novel
few-shot segmentation network that can be learned in an end-to-end manner.
Comprehensive experiments on PASCAL-5i and MS COCO demonstrate that our
proposed algorithm achieves strong performance.
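The dual-prior idea of comparing each query pixel against both a foreground and a background reference can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation: the prototypes, feature shapes, and function names below are illustrative assumptions, and the real DPMG module operates on deep feature maps rather than toy vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb + 1e-8)

def dual_prior_mask(query_feats, fg_proto, bg_proto):
    """Mark a pixel as foreground only if it is closer to the foreground
    prototype than to the background prototype, suppressing
    foreground-like background activations."""
    return [1 if cosine(f, fg_proto) > cosine(f, bg_proto) else 0
            for f in query_feats]

# Three toy pixel features: two foreground-like, one background-like.
feats = [[1.0, 0.1], [0.9, 0.2], [0.2, 1.0]]
print(dual_prior_mask(feats, [1.0, 0.0], [0.0, 1.0]))  # → [1, 1, 0]
```

A pixel similar to both prototypes is kept only if the foreground similarity strictly wins, which is what lets the background act as refinement information rather than being ignored.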
Related papers
- MROVSeg: Breaking the Resolution Curse of Vision-Language Models in Open-Vocabulary Semantic Segmentation [33.67313662538398]
We propose a multi-resolution training framework for open-vocabulary semantic segmentation with a single pretrained CLIP backbone.
MROVSeg uses sliding windows to slice the high-resolution input into uniform patches, each matching the input size of the well-trained image encoder.
We demonstrate the superiority of MROVSeg on well-established open-vocabulary semantic segmentation benchmarks.
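The sliding-window slicing MROVSeg describes can be sketched as follows; the window size, stride, and function name are illustrative assumptions, not details from the paper, which operates on image tensors rather than bare offsets.

```python
def slice_windows(height, width, win, stride):
    """Return (top, left) offsets of uniform windows covering the image,
    clamping the final window so it stays inside the image bounds."""
    tops = list(range(0, max(height - win, 0) + 1, stride))
    lefts = list(range(0, max(width - win, 0) + 1, stride))
    # Append a clamped last window when the stride does not divide evenly.
    last_t, last_l = max(height - win, 0), max(width - win, 0)
    if tops[-1] != last_t:
        tops.append(last_t)
    if lefts[-1] != last_l:
        lefts.append(last_l)
    return [(t, l) for t in tops for l in lefts]

# A 1024x1024 input sliced into 512x512 patches matching a fixed encoder size:
print(slice_windows(1024, 1024, 512, 512))
# → [(0, 0), (0, 512), (512, 0), (512, 512)]
```

Each offset pair then indexes a crop that matches the pretrained encoder's expected input resolution.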
arXiv Detail & Related papers (2024-08-27T04:45:53Z) - Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z) - Beyond the Prototype: Divide-and-conquer Proxies for Few-shot
Segmentation [63.910211095033596]
Few-shot segmentation aims to segment unseen-class objects given only a handful of densely labeled samples.
We propose a simple yet versatile framework in the spirit of divide-and-conquer.
Our proposed approach, named divide-and-conquer proxies (DCP), allows appropriate and reliable support information to be developed for segmentation.
arXiv Detail & Related papers (2022-04-21T06:21:14Z) - AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation has some unique challenges, the most critical one among which lies in foreground-background imbalance.
We propose Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ has significantly improved the accuracy on three widely used aerial benchmarks, while being as fast as mainstream methods.
arXiv Detail & Related papers (2022-02-18T10:14:45Z) - SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive
Background Prototypes [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples.
Most advanced solutions exploit a metric-learning framework that performs segmentation by matching each pixel to a learned foreground prototype.
This framework suffers from biased classification because sample pairs are constructed with the foreground prototype only.
arXiv Detail & Related papers (2021-04-19T11:21:47Z) - Self-Guided and Cross-Guided Learning for Few-Shot Segmentation [12.899804391102435]
We propose a self-guided learning approach for few-shot segmentation.
An initial prediction is made for the annotated support image, and the covered and uncovered foreground regions are encoded into primary and auxiliary support vectors.
By aggregating both primary and auxiliary support vectors, better segmentation performance is obtained on query images.
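The split into primary and auxiliary support vectors can be sketched with masked average pooling over toy per-pixel features. All names and shapes here are illustrative assumptions rather than the paper's code, which pools deep feature maps.

```python
def masked_average(feats, mask):
    """Average the feature vectors selected by a binary mask."""
    chosen = [f for f, m in zip(feats, mask) if m]
    if not chosen:
        return [0.0] * len(feats[0])
    dim = len(chosen[0])
    return [sum(f[i] for f in chosen) / len(chosen) for i in range(dim)]

def support_vectors(feats, gt_mask, pred_mask):
    """Pool the ground-truth foreground separately over the region the
    initial prediction covered (primary) and the region it missed
    (auxiliary)."""
    covered = [g and p for g, p in zip(gt_mask, pred_mask)]
    uncovered = [g and not p for g, p in zip(gt_mask, pred_mask)]
    return masked_average(feats, covered), masked_average(feats, uncovered)

feats = [[2.0], [4.0], [6.0], [8.0]]
gt, pred = [1, 1, 1, 0], [1, 0, 1, 0]
print(support_vectors(feats, gt, pred))  # → ([4.0], [4.0])
```

The auxiliary vector summarizes exactly the foreground pixels the first pass missed, which is what gives the aggregated guidance its extra coverage.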
arXiv Detail & Related papers (2021-03-30T07:36:41Z) - SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from
Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z) - Unsupervised segmentation via semantic-apparent feature fusion [21.75371777263847]
This research proposes an unsupervised foreground segmentation method based on semantic-apparent feature fusion (SAFF).
Key regions of the foreground object can be accurately located via semantic features, while apparent features provide richer detailed expression.
By fusing semantic and apparent features, and by cascading modules for intra-image adaptive feature-weight learning and inter-image common feature learning, the method significantly outperforms baselines.
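The fusion step can be sketched as a weighted combination of the two per-pixel response maps. The fixed scalar weight below is an illustrative stand-in for SAFF's adaptive feature-weight learning, and the names are assumptions, not the paper's code.

```python
def fuse(semantic, apparent, w):
    """Convex combination of a semantic response map (good localization)
    and an apparent response map (rich detail), pixel by pixel."""
    assert 0.0 <= w <= 1.0
    return [w * s + (1.0 - w) * a for s, a in zip(semantic, apparent)]

sem = [1.0, 0.75, 0.25]  # coarse but well-localized semantic response
app = [0.5, 0.25, 0.75]  # detailed appearance response
print(fuse(sem, app, 0.5))  # → [0.75, 0.5, 0.5]
```

In the adaptive setting the weight would itself be learned per image or per pixel instead of being fixed.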
arXiv Detail & Related papers (2020-05-21T08:28:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.