APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation
- URL: http://arxiv.org/abs/2406.08372v2
- Date: Thu, 13 Jun 2024 03:10:17 GMT
- Title: APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation
- Authors: Weizhao He, Yang Zhang, Wei Zhuo, Linlin Shen, Jiaqi Yang, Songhe Deng, Liang Sun
- Abstract summary: We introduce APSeg, a novel auto-prompt network for cross-domain few-shot semantic segmentation (CD-FSS).
Our model outperforms the state-of-the-art CD-FSS method by 5.24% and 3.10% in average accuracy on 1-shot and 5-shot settings, respectively.
- Score: 33.90244697752314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot semantic segmentation (FSS) endeavors to segment unseen classes with only a few labeled samples. Current FSS methods are commonly built on the assumption that their training and application scenarios share similar domains, and their performances degrade significantly while applied to a distinct domain. To this end, we propose to leverage the cutting-edge foundation model, the Segment Anything Model (SAM), for generalization enhancement. The SAM however performs unsatisfactorily on domains that are distinct from its training data, which primarily comprise natural scene images, and it does not support automatic segmentation of specific semantics due to its interactive prompting mechanism. In our work, we introduce APSeg, a novel auto-prompt network for cross-domain few-shot semantic segmentation (CD-FSS), which is designed to be auto-prompted for guiding cross-domain segmentation. Specifically, we propose a Dual Prototype Anchor Transformation (DPAT) module that fuses pseudo query prototypes extracted based on cycle-consistency with support prototypes, allowing features to be transformed into a more stable domain-agnostic space. Additionally, a Meta Prompt Generator (MPG) module is introduced to automatically generate prompt embeddings, eliminating the need for manual visual prompts. We build an efficient model which can be applied directly to target domains without fine-tuning. Extensive experiments on four cross-domain datasets show that our model outperforms the state-of-the-art CD-FSS method by 5.24% and 3.10% in average accuracy on 1-shot and 5-shot settings, respectively.
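The abstract describes two mechanisms: DPAT fuses support prototypes with pseudo query prototypes mined via cycle-consistency, and MPG maps the resulting features to SAM prompt embeddings so that no manual clicks or boxes are needed. As a rough illustration of the prototype-extraction-and-fusion pattern that DPAT builds on, here is a minimal PyTorch sketch; the function names, the similarity-weighted pooling used as a stand-in for the paper's cycle-consistency step, and the fixed fusion weight are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: a generic prototype-fusion pattern of the kind DPAT
# builds on. Function names, the similarity-weighted stand-in for the paper's
# cycle-consistency step, and the fusion weight are assumptions, not APSeg code.
import torch
import torch.nn.functional as F


def masked_average_pooling(feats: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Pool a class prototype from support features under a binary mask.

    feats: (B, C, H, W) backbone features; mask: (B, 1, h, w) float mask in {0, 1}.
    Returns (B, C) prototype vectors.
    """
    mask = F.interpolate(mask, size=feats.shape[-2:], mode="bilinear", align_corners=False)
    return (feats * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)


def pseudo_query_prototype(support_proto: torch.Tensor, query_feats: torch.Tensor) -> torch.Tensor:
    """Pool a pseudo prototype from unlabeled query features, weighting each
    location by its similarity to the support prototype (a simple stand-in for
    the cycle-consistency selection described in the abstract)."""
    b, c, h, w = query_feats.shape
    sim = F.cosine_similarity(query_feats, support_proto[:, :, None, None], dim=1)  # (B, H, W)
    weights = torch.softmax(sim.view(b, -1), dim=-1).view(b, 1, h, w)
    return (query_feats * weights).sum(dim=(2, 3))


def fuse_prototypes(support_proto: torch.Tensor, pseudo_proto: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Blend support and pseudo query prototypes into a single anchor vector."""
    return alpha * support_proto + (1.0 - alpha) * pseudo_proto


if __name__ == "__main__":
    support_feats = torch.randn(2, 256, 32, 32)
    support_mask = (torch.rand(2, 1, 128, 128) > 0.5).float()
    query_feats = torch.randn(2, 256, 32, 32)

    proto_s = masked_average_pooling(support_feats, support_mask)
    proto_q = pseudo_query_prototype(proto_s, query_feats)
    anchor = fuse_prototypes(proto_s, proto_q)
    print(anchor.shape)  # torch.Size([2, 256])
```

In APSeg itself, fused features of this kind are further processed by the Meta Prompt Generator into prompt embeddings for SAM, which is what removes the need for manual visual prompts; that stage is not reproduced here.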
Related papers
- TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation [44.134340976905655]
This work proposes a task-adaptive prompt framework based on the Segment Anything Model (SAM).
It uses a unique generative approach to prompts alongside a comprehensive model structure and specialized prototype computation.
After task-specific and weighted guidance, the abundant feature information of SAM can be better learned for cross-domain few-shot segmentation.
arXiv Detail & Related papers (2024-09-09T07:43:58Z)
- Cross-Domain Few-Shot Semantic Segmentation via Doubly Matching Transformation [26.788260801305974]
Cross-Domain Few-shot Semantic Segmentation (CD-FSS) aims to train generalized models that can segment classes from different domains with a few labeled images.
Previous works have proven the effectiveness of feature transformation in addressing CD-FSS.
We propose a Doubly Matching Transformation-based Network (DMTNet) to solve the above issue.
arXiv Detail & Related papers (2024-05-24T06:47:43Z)
- Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation [40.667166043101076]
We propose a small adapter for rectifying diverse target domain styles to the source domain.
The adapter is trained to rectify the image features from diverse synthesized target domains to align with the source domain.
Our method achieves promising results on cross-domain few-shot semantic segmentation tasks.
arXiv Detail & Related papers (2024-04-16T07:07:40Z)
- RestNet: Boosting Cross-Domain Few-Shot Segmentation with Residual Transformation Network [4.232614032390374]
Cross-domain few-shot segmentation (CD-FSS) aims to achieve semantic segmentation in previously unseen domains with a limited number of annotated samples.
We propose a novel residual transformation network (RestNet) that facilitates knowledge transfer while retaining the intra-domain support-query feature information.
arXiv Detail & Related papers (2023-08-25T16:13:22Z)
- Cross-domain Few-shot Segmentation with Transductive Fine-tuning [29.81009103722184]
We propose to transductively fine-tune the base model on a set of query images under the few-shot setting.
Our method could consistently and significantly improve the performance of prototypical FSS models in all cross-domain tasks.
arXiv Detail & Related papers (2022-11-27T06:44:41Z)
- UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration [49.16591283724376]
We design UniDAformer, a unified domain adaptive panoptic segmentation transformer that is simple but can achieve domain adaptive instance segmentation and semantic segmentation simultaneously within a single network.
UniDAformer introduces Hierarchical Mask Calibration (HMC) that rectifies inaccurate predictions at the level of regions, superpixels and pixels via online self-training on the fly.
It has three unique features: 1) it enables unified domain adaptive panoptic adaptation; 2) it mitigates false predictions and improves domain adaptive panoptic segmentation effectively; 3) it is end-to-end trainable with a much simpler training and inference pipeline.
arXiv Detail & Related papers (2022-06-30T07:32:23Z)
- Semantic-Aware Domain Generalized Segmentation [67.49163582961877]
Deep models trained on a source domain lack generalization when evaluated on unseen target domains with different data distributions.
We propose a framework including two novel modules: Semantic-Aware Normalization (SAN) and Semantic-Aware Whitening (SAW).
Our approach shows significant improvements over existing state-of-the-art on various backbone networks.
arXiv Detail & Related papers (2022-04-02T09:09:59Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with a wide range of potential applications.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- Boosting Few-shot Semantic Segmentation with Transformers [81.43459055197435]
We propose a TRansformer-based Few-shot Semantic segmentation method (TRFS).
Our model consists of two modules: Global Enhancement Module (GEM) and Local Enhancement Module (LEM).
arXiv Detail & Related papers (2021-08-04T20:09:21Z)
- Cross-domain Contrastive Learning for Unsupervised Domain Adaptation [108.63914324182984]
Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a fully-labeled source domain to a different unlabeled target domain.
We build upon contrastive self-supervised learning to align features so as to reduce the domain discrepancy between training and testing sets (a generic sketch of this style of cross-domain alignment loss appears after this list).
arXiv Detail & Related papers (2021-06-10T06:32:30Z)
- Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation [91.58443042554903]
We propose an end-to-end Prototypical Cross-domain Self-Supervised Learning (PCS) framework for Few-shot Unsupervised Domain Adaptation (FUDA).
PCS not only performs cross-domain low-level feature alignment, but it also encodes and aligns semantic structures in the shared embedding space across domains.
Compared with state-of-the-art methods, PCS improves the mean classification accuracy over different domain pairs on FUDA by 10.5%, 3.5%, 9.0%, and 13.2% on Office, Office-Home, VisDA-2017, and DomainNet, respectively.
arXiv Detail & Related papers (2021-03-31T02:07:42Z)
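Several of the entries above (the cross-domain contrastive learning and PCS papers) align source and target features in a shared embedding space. The snippet below is a generic InfoNCE-style alignment loss, included only to make that idea concrete; the index-wise pairing of source and target samples and the temperature value are illustrative assumptions, not the losses used in those papers.

```python
# Generic InfoNCE-style cross-domain alignment loss, for illustration only.
# The assumption that source/target samples are paired index-by-index (e.g. via
# pseudo-labels or class prototypes) is ours, not taken from the papers above.
import torch
import torch.nn.functional as F


def cross_domain_info_nce(source_feats: torch.Tensor,
                          target_feats: torch.Tensor,
                          temperature: float = 0.1) -> torch.Tensor:
    """source_feats, target_feats: (N, D) batches of matched features.
    Positive pairs sit on the diagonal of the similarity matrix; every other
    cross-domain pair serves as a negative."""
    src = F.normalize(source_feats, dim=1)
    tgt = F.normalize(target_feats, dim=1)
    logits = src @ tgt.t() / temperature                   # (N, N) cosine similarities
    labels = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy pulls matched pairs together in both directions.
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))


if __name__ == "__main__":
    loss = cross_domain_info_nce(torch.randn(32, 128), torch.randn(32, 128))
    print(float(loss))
```

The symmetric form (averaging the source-to-target and target-to-source terms) is a common design choice because it aligns the matching in both directions.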
This list is automatically generated from the titles and abstracts of the papers on this site.