APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation
- URL: http://arxiv.org/abs/2406.08372v2
- Date: Thu, 13 Jun 2024 03:10:17 GMT
- Title: APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation
- Authors: Weizhao He, Yang Zhang, Wei Zhuo, Linlin Shen, Jiaqi Yang, Songhe Deng, Liang Sun
- Abstract summary: We introduce APSeg, a novel auto-prompt network for cross-domain few-shot semantic segmentation (CD-FSS).
Our model outperforms the state-of-the-art CD-FSS method by 5.24% and 3.10% in average accuracy on 1-shot and 5-shot settings, respectively.
- Score: 33.90244697752314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot semantic segmentation (FSS) endeavors to segment unseen classes with only a few labeled samples. Current FSS methods are commonly built on the assumption that their training and application scenarios share similar domains, and their performances degrade significantly while applied to a distinct domain. To this end, we propose to leverage the cutting-edge foundation model, the Segment Anything Model (SAM), for generalization enhancement. The SAM however performs unsatisfactorily on domains that are distinct from its training data, which primarily comprise natural scene images, and it does not support automatic segmentation of specific semantics due to its interactive prompting mechanism. In our work, we introduce APSeg, a novel auto-prompt network for cross-domain few-shot semantic segmentation (CD-FSS), which is designed to be auto-prompted for guiding cross-domain segmentation. Specifically, we propose a Dual Prototype Anchor Transformation (DPAT) module that fuses pseudo query prototypes extracted based on cycle-consistency with support prototypes, allowing features to be transformed into a more stable domain-agnostic space. Additionally, a Meta Prompt Generator (MPG) module is introduced to automatically generate prompt embeddings, eliminating the need for manual visual prompts. We build an efficient model which can be applied directly to target domains without fine-tuning. Extensive experiments on four cross-domain datasets show that our model outperforms the state-of-the-art CD-FSS method by 5.24% and 3.10% in average accuracy on 1-shot and 5-shot settings, respectively.
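The abstract describes two mechanisms: DPAT fuses support prototypes with pseudo query prototypes mined via cycle-consistency, and MPG maps the resulting features to SAM prompt embeddings so that no manual clicks or boxes are needed. As a rough illustration of the prototype-extraction-and-fusion pattern that DPAT builds on, here is a minimal PyTorch sketch; the function names, the similarity-weighted pooling used as a stand-in for the paper's cycle-consistency step, and the fixed fusion weight are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: a generic prototype-fusion pattern of the kind DPAT
# builds on. Function names, the similarity-weighted stand-in for the paper's
# cycle-consistency step, and the fusion weight are assumptions, not APSeg code.
import torch
import torch.nn.functional as F


def masked_average_pooling(feats: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Pool a class prototype from support features under a binary mask.

    feats: (B, C, H, W) backbone features; mask: (B, 1, h, w) float mask in {0, 1}.
    Returns (B, C) prototype vectors.
    """
    mask = F.interpolate(mask, size=feats.shape[-2:], mode="bilinear", align_corners=False)
    return (feats * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)


def pseudo_query_prototype(support_proto: torch.Tensor, query_feats: torch.Tensor) -> torch.Tensor:
    """Pool a pseudo prototype from unlabeled query features, weighting each
    location by its similarity to the support prototype (a simple stand-in for
    the cycle-consistency selection described in the abstract)."""
    b, c, h, w = query_feats.shape
    sim = F.cosine_similarity(query_feats, support_proto[:, :, None, None], dim=1)  # (B, H, W)
    weights = torch.softmax(sim.view(b, -1), dim=-1).view(b, 1, h, w)
    return (query_feats * weights).sum(dim=(2, 3))


def fuse_prototypes(support_proto: torch.Tensor, pseudo_proto: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Blend support and pseudo query prototypes into a single anchor vector."""
    return alpha * support_proto + (1.0 - alpha) * pseudo_proto


if __name__ == "__main__":
    support_feats = torch.randn(2, 256, 32, 32)
    support_mask = (torch.rand(2, 1, 128, 128) > 0.5).float()
    query_feats = torch.randn(2, 256, 32, 32)

    proto_s = masked_average_pooling(support_feats, support_mask)
    proto_q = pseudo_query_prototype(proto_s, query_feats)
    anchor = fuse_prototypes(proto_s, proto_q)
    print(anchor.shape)  # torch.Size([2, 256])
```

In APSeg itself, fused features of this kind are further processed by the Meta Prompt Generator into prompt embeddings for SAM, which is what removes the need for manual visual prompts; that stage is not reproduced here.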
Related papers
- TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation [44.134340976905655]
This work proposes a task-adaptive prompt framework based on the Segment Anything Model (SAM).
It uses a unique generative approach to prompts alongside a comprehensive model structure and specialized prototype computation.
After task-specific and weighted guidance, the abundant feature information of SAM can be better learned for cross-domain few-shot segmentation.
arXiv Detail & Related papers (2024-09-09T07:43:58Z)
- Cross-Domain Few-Shot Semantic Segmentation via Doubly Matching Transformation [26.788260801305974]
Cross-Domain Few-shot Semantic Segmentation (CD-FSS) aims to train generalized models that can segment classes from different domains with a few labeled images.
Previous works have proven the effectiveness of feature transformation in addressing CD-FSS.
We propose a Doubly Matching Transformation-based Network (DMTNet) to solve the above issue.
arXiv Detail & Related papers (2024-05-24T06:47:43Z)
- Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation [40.667166043101076]
We propose a small adapter for rectifying diverse target domain styles to the source domain.
The adapter is trained to rectify the image features from diverse synthesized target domains to align with the source domain.
Our method achieves promising results on cross-domain few-shot semantic segmentation tasks.
arXiv Detail & Related papers (2024-04-16T07:07:40Z)
- RestNet: Boosting Cross-Domain Few-Shot Segmentation with Residual Transformation Network [4.232614032390374]
Cross-domain few-shot segmentation (CD-FSS) aims to achieve semantic segmentation in previously unseen domains with a limited number of annotated samples.
We propose a novel residual transformation network (RestNet) that facilitates knowledge transfer while retaining the intra-domain support-query feature information.
arXiv Detail & Related papers (2023-08-25T16:13:22Z)
- Cross-domain Few-shot Segmentation with Transductive Fine-tuning [29.81009103722184]
We propose to transductively fine-tune the base model on a set of query images under the few-shot setting.
Our method could consistently and significantly improve the performance of prototypical FSS models in all cross-domain tasks.
arXiv Detail & Related papers (2022-11-27T06:44:41Z)
- UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration [49.16591283724376]
We design UniDAformer, a unified domain adaptive panoptic segmentation transformer that is simple but can achieve domain adaptive instance segmentation and semantic segmentation simultaneously within a single network.
UniDAformer introduces Hierarchical Mask Calibration (HMC) that rectifies inaccurate predictions at the level of regions, superpixels and pixels via online self-training on the fly.
It has three unique features: 1) it enables unified domain adaptive panoptic adaptation; 2) it mitigates false predictions and improves domain adaptive panoptic segmentation effectively; 3) it is end-to-end trainable with a much simpler training and inference pipeline.
arXiv Detail & Related papers (2022-06-30T07:32:23Z)
- Semantic-Aware Domain Generalized Segmentation [67.49163582961877]
Deep models trained on a source domain lack generalization when evaluated on unseen target domains with different data distributions.
We propose a framework including two novel modules: Semantic-Aware Normalization (SAN) and Semantic-Aware Whitening (SAW).
Our approach shows significant improvements over existing state-of-the-art on various backbone networks.
arXiv Detail & Related papers (2022-04-02T09:09:59Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with a wide range of potential applications.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- Boosting Few-shot Semantic Segmentation with Transformers [81.43459055197435]
We propose a TRansformer-based Few-shot Semantic segmentation method (TRFS).
Our model consists of two modules: Global Enhancement Module (GEM) and Local Enhancement Module (LEM).
arXiv Detail & Related papers (2021-08-04T20:09:21Z)
- Cross-domain Contrastive Learning for Unsupervised Domain Adaptation [108.63914324182984]
Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a fully-labeled source domain to a different unlabeled target domain.
We build upon contrastive self-supervised learning to align features so as to reduce the domain discrepancy between training and testing sets (a generic sketch of this style of cross-domain alignment loss appears after this list).
arXiv Detail & Related papers (2021-06-10T06:32:30Z)
- Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation [91.58443042554903]
We propose an end-to-end Prototypical Cross-domain Self-Supervised Learning (PCS) framework for Few-shot Unsupervised Domain Adaptation (FUDA).
PCS not only performs cross-domain low-level feature alignment, but it also encodes and aligns semantic structures in the shared embedding space across domains.
Compared with state-of-the-art methods, PCS improves the mean classification accuracy over different domain pairs on FUDA by 10.5%, 3.5%, 9.0%, and 13.2% on Office, Office-Home, VisDA-2017, and DomainNet, respectively.
arXiv Detail & Related papers (2021-03-31T02:07:42Z)
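Several of the entries above (the cross-domain contrastive learning and PCS papers) align source and target features in a shared embedding space. The snippet below is a generic InfoNCE-style alignment loss, included only to make that idea concrete; the index-wise pairing of source and target samples and the temperature value are illustrative assumptions, not the losses used in those papers.

```python
# Generic InfoNCE-style cross-domain alignment loss, for illustration only.
# The assumption that source/target samples are paired index-by-index (e.g. via
# pseudo-labels or class prototypes) is ours, not taken from the papers above.
import torch
import torch.nn.functional as F


def cross_domain_info_nce(source_feats: torch.Tensor,
                          target_feats: torch.Tensor,
                          temperature: float = 0.1) -> torch.Tensor:
    """source_feats, target_feats: (N, D) batches of matched features.
    Positive pairs sit on the diagonal of the similarity matrix; every other
    cross-domain pair serves as a negative."""
    src = F.normalize(source_feats, dim=1)
    tgt = F.normalize(target_feats, dim=1)
    logits = src @ tgt.t() / temperature                   # (N, N) cosine similarities
    labels = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy pulls matched pairs together in both directions.
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))


if __name__ == "__main__":
    loss = cross_domain_info_nce(torch.randn(32, 128), torch.randn(32, 128))
    print(float(loss))
```

The symmetric form (averaging the source-to-target and target-to-source terms) is a common design choice because it aligns the matching in both directions.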
This list is automatically generated from the titles and abstracts of the papers on this site.