Related papers: CMP: A Composable Meta Prompt for SAM-Based Cross-Domain Few-Shot Segmentation

URL: http://arxiv.org/abs/2507.16753v1
Date: Tue, 22 Jul 2025 16:42:23 GMT
Title: CMP: A Composable Meta Prompt for SAM-Based Cross-Domain Few-Shot Segmentation
Authors: Shuai Chen, Fanman Meng, Chunjin Yang, Haoran Wei, Chenhao Wu, Qingbo Wu, Hongliang Li
Abstract summary: Cross-Domain Few-Shot Segmentation (CD-FSS) remains challenging due to limited data and domain shifts. Recent foundation models like the Segment Anything Model (SAM) have shown remarkable zero-shot generalization capability in general segmentation tasks. We propose the Composable Meta-Prompt (CMP) framework that introduces three key modules: (i) the Reference Complement and Transformation (RCT) module for semantic expansion, (ii) the Composable Meta-Prompt Generation (CMPG) module for automated meta-prompt synthesis, and (iii) the Frequency-Aware Interaction (FAI) module for domain discrepancy mitigation.
Score: 20.489756120720568
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Cross-Domain Few-Shot Segmentation (CD-FSS) remains challenging due to limited data and domain shifts. Recent foundation models like the Segment Anything Model (SAM) have shown remarkable zero-shot generalization capability in general segmentation tasks, making them a promising solution for few-shot scenarios. However, adapting SAM to CD-FSS faces two critical challenges: reliance on manual prompts and limited cross-domain ability. Therefore, we propose the Composable Meta-Prompt (CMP) framework that introduces three key modules: (i) the Reference Complement and Transformation (RCT) module for semantic expansion, (ii) the Composable Meta-Prompt Generation (CMPG) module for automated meta-prompt synthesis, and (iii) the Frequency-Aware Interaction (FAI) module for domain discrepancy mitigation. Evaluations across four cross-domain datasets demonstrate CMP's state-of-the-art performance, achieving 71.8% and 74.5% mIoU in 1-shot and 5-shot scenarios respectively.

Related papers

CMaP-SAM: Contraction Mapping Prior for SAM-driven Few-shot Segmentation [21.466035540502226]
Few-shot segmentation (FSS) aims to segment new classes using few annotated images. Recent FSS methods have shown considerable improvements by leveraging the Segment Anything Model (SAM). We propose CMaP-SAM, a novel framework that introduces contraction mapping theory to optimize position priors for SAM-driven FSS.
arXiv Detail & Related papers (2025-04-07T13:19:16Z)
Let Synthetic Data Shine: Domain Reassembly and Soft-Fusion for Single Domain Generalization [68.41367635546183]
Single Domain Generalization aims to train models with consistent performance across diverse scenarios using data from a single source. We propose Discriminative Domain Reassembly and Soft-Fusion (DRSF), a training framework leveraging synthetic data to improve model generalization.
arXiv Detail & Related papers (2025-03-17T18:08:03Z)
FAMNet: Frequency-aware Matching Network for Cross-domain Few-shot Medical Image Segmentation [15.066227784509303]
Existing few-shot medical image segmentation (FSMIS) models fail to address a practical issue in medical imaging: the domain shift caused by different imaging techniques. We propose a Frequency-aware Matching Network (FAMNet), which includes two key components: a Frequency-aware Matching (FAM) module and a Multi-Spectral Fusion (MSF) module. Our FAMNet surpasses existing FSMIS models and Cross-domain Few-shot Semantic Segmentation models on three cross-domain datasets.
arXiv Detail & Related papers (2024-12-12T14:44:05Z)
TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation [40.49924427388922]
We propose a task-adaptive auto-visual prompt framework for Cross-domain Few-shot Segmentation (CD-FSS). We incorporate a Class Domain Task-Adaptive Auto-Prompt (CDTAP) module to enable class-domain feature extraction and generate high-quality, learnable visual prompts. Our model outperforms the state-of-the-art CD-FSS approach, achieving an average accuracy improvement of 1.3% in the 1-shot setting and 11.76% in the 5-shot setting.
arXiv Detail & Related papers (2024-09-09T07:43:58Z)
APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation [33.90244697752314]
We introduce APSeg, a novel auto-prompt network for cross-domain few-shot semantic segmentation (CD-FSS). Our model outperforms the state-of-the-art CD-FSS method by 5.24% and 3.10% in average accuracy on 1-shot and 5-shot settings, respectively.
arXiv Detail & Related papers (2024-06-12T16:20:58Z)
Cross-Domain Few-Shot Segmentation via Iterative Support-Query Correspondence Mining [81.09446228688559]
Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars. We propose a novel cross-domain fine-tuning strategy that addresses the challenging CD-FSS tasks.
arXiv Detail & Related papers (2024-01-16T14:45:41Z)
Collaborating Foundation Models for Domain Generalized Semantic Segmentation [23.359941294938142]
Domain Generalized Semantic Segmentation (DGSS) deals with training a model on a labeled source domain. We take an approach to DGSS and propose to use an assembly of CoLlaborative FOUndation models for Domain Generalized Semantic Segmentation (CLOUDS).
arXiv Detail & Related papers (2023-12-15T13:43:24Z)
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality. Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
CRCNet: Few-shot Segmentation with Cross-Reference and Region-Global Conditional Networks [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images. We propose a Cross-Reference and Local-Global Networks (CRCNet) for few-shot segmentation. Our network can better find the co-occurrent objects in the two images with a cross-reference mechanism.
arXiv Detail & Related papers (2022-08-23T06:46:18Z)
Boosting Few-shot Semantic Segmentation with Transformers [81.43459055197435]
We propose a TRansformer-based Few-shot Semantic segmentation method (TRFS). Our model consists of two modules: the Global Enhancement Module (GEM) and the Local Enhancement Module (LEM).
arXiv Detail & Related papers (2021-08-04T20:09:21Z)
Memorizing Comprehensively to Learn Adaptively: Unsupervised Cross-Domain Person Re-ID with Multi-level Memory [89.43986007948772]
Unlike the simple memory in previous works, we propose a novel multi-level memory network (MMN) to discover multi-level complementary information in the target domain.
arXiv Detail & Related papers (2020-01-13T09:48:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.