Multiple Stochastic Prompt Tuning for Practical Cross-Domain Few Shot Learning
- URL: http://arxiv.org/abs/2506.03926v1
- Date: Wed, 04 Jun 2025 13:18:04 GMT
- Title: Multiple Stochastic Prompt Tuning for Practical Cross-Domain Few Shot Learning
- Authors: Debarshi Brahma, Soma Biswas
- Abstract summary: We propose a cross-domain few-shot learning task, where a large-scale pre-trained model like CLIP can be easily deployed on a target dataset. The goal is to simultaneously classify all unseen classes under extreme domain shifts, by utilizing only a few labeled samples per class. We propose a novel framework, termed MIST (MultIple STochastic Prompt tuning), where multiple prompts are utilized to handle significant domain and semantic shifts.
- Score: 14.85375816073596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose a practical cross-domain few-shot learning (pCDFSL) task, where a large-scale pre-trained model like CLIP can be easily deployed on a target dataset. The goal is to simultaneously classify all unseen classes under extreme domain shifts, by utilizing only a few labeled samples per class. The pCDFSL paradigm is source-free and moves beyond the artificially created episodic training and testing regimes followed by existing CDFSL frameworks, making it more challenging and relevant to real-world applications. Towards that goal, we propose a novel framework, termed MIST (MultIple STochastic Prompt tuning), where multiple stochastic prompts are utilized to handle significant domain and semantic shifts. Specifically, multiple prompts are learnt for each class, effectively capturing multiple peaks in the input data distribution. Furthermore, instead of representing the weights of the multiple prompts as point estimates, we model them as learnable Gaussian distributions with two different strategies, encouraging an efficient exploration of the prompt parameter space, which mitigates overfitting due to the few labeled training samples. Extensive experiments and comparisons with state-of-the-art methods on four CDFSL benchmarks adapted to this setting show the effectiveness of the proposed framework.
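The abstract's key mechanism, multiple prompts per class whose weights are modeled as learnable Gaussian distributions sampled via the reparameterization trick, can be illustrated with a short PyTorch sketch. This is a minimal reading of the abstract, not the authors' implementation: the prompt shapes, the `text_encoder` stand-in, and the max-over-prompts aggregation are all assumptions.

```python
# Minimal sketch of stochastic prompt tuning in the spirit of MIST.
# Assumptions (not from the paper's code): prompt length, number of
# prompts per class, and the encoder stand-in are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticPrompts(nn.Module):
    """K learnable prompts per class; each prompt's weights are a
    Gaussian (mean, log-variance) sampled via reparameterization."""
    def __init__(self, num_classes, k_prompts=4, prompt_len=8, dim=512):
        super().__init__()
        shape = (num_classes, k_prompts, prompt_len, dim)
        self.mu = nn.Parameter(torch.randn(shape) * 0.02)
        self.log_var = nn.Parameter(torch.full(shape, -4.0))

    def sample(self):
        # Reparameterization trick: prompt = mu + sigma * eps,
        # keeping the sample differentiable w.r.t. mu and log_var.
        eps = torch.randn_like(self.mu)
        return self.mu + torch.exp(0.5 * self.log_var) * eps

def mist_style_logits(image_feats, prompts, text_encoder):
    """Score images against every class by taking the best match over
    that class's K sampled prompts (one plausible way to aggregate
    multiple peaks per class). `text_encoder` is a stand-in for a
    CLIP-like module mapping (N, prompt_len, dim) -> (N, dim)."""
    C, K = prompts.shape[:2]
    text_feats = text_encoder(prompts.flatten(0, 1))           # (C*K, dim)
    text_feats = F.normalize(text_feats, dim=-1).view(C, K, -1)
    image_feats = F.normalize(image_feats, dim=-1)             # (B, dim)
    sims = torch.einsum("bd,ckd->bck", image_feats, text_feats)
    return sims.max(dim=-1).values                             # (B, C)
```

At training time a fresh sample would be drawn each step, so the few labeled examples are matched against many nearby prompts rather than a single point estimate, which is how the abstract frames the mitigation of overfitting.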
Related papers
- Multi-Prompt Progressive Alignment for Multi-Source Unsupervised Domain Adaptation [73.40696661117408]
We propose a progressive alignment strategy for adapting CLIP to unlabeled downstream tasks. We name our approach MP2A and test it on three popular UDA benchmarks, namely ImageCLEF, Office-Home, and the most challenging DomainNet. Experiments showcase that MP2A achieves state-of-the-art performance when compared with most recent CLIP-based MS-UDA approaches.
arXiv Detail & Related papers (2025-07-31T09:42:42Z)
- Cross-Domain Few-Shot Learning with Coalescent Projections and Latent Space Reservation [6.178597284949811]
Coalescent Projection (CP) is an effective successor to soft prompts. Self-Supervised Transformations (SSTs) are proposed to prepare the network for encountering unseen samples from different domains.
arXiv Detail & Related papers (2025-07-21T05:01:27Z)
- Prompt Tuning Vision Language Models with Margin Regularizer for Few-Shot Learning under Distribution Shifts [13.21626568246313]
We analyze whether vision-language foundation models can be adapted to target datasets with very different distributions and classes. We propose a novel prompt-tuning method, PromptMargin, for adapting such large-scale VLMs directly on the few target samples. PromptMargin effectively tunes the text as well as visual prompts for this task, and has two main modules.
arXiv Detail & Related papers (2025-05-21T13:26:56Z)
- LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging [65.72891334156706]
We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the problem of multi-label few-shot classification. LC-Protonets generate one prototype per label combination, derived from the power set of labels present in the limited training items. Our method is applied to automatic audio tagging across diverse music datasets, covering various cultures and including both modern and traditional music.
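The one-prototype-per-label-combination idea reads directly as code. The sketch below is one plausible interpretation of the summary (an item contributes to every combination that is a subset of its own label set); the membership rule and the Euclidean distance are assumptions, not the paper's exact formulation.

```python
# Sketch of label-combination prototypes in the spirit of LC-Protonets.
from itertools import combinations
import torch

def lc_prototypes(support_emb, support_labels):
    """support_emb: (N, D) tensor; support_labels: list of frozensets
    of label ids. Returns one prototype per label combination drawn
    from the power set of each item's label set."""
    combos = set()
    for labels in support_labels:
        for r in range(1, len(labels) + 1):
            combos.update(frozenset(c) for c in combinations(labels, r))
    protos = {}
    for combo in combos:
        # An item supports a combination if the combination is a
        # subset of the item's own label set (assumed rule).
        members = [e for e, ls in zip(support_emb, support_labels)
                   if combo <= ls]
        protos[combo] = torch.stack(members).mean(dim=0)
    return protos

def predict_labels(query, protos):
    # Nearest prototype; the matched combination is the predicted
    # multi-label set for the query.
    return min(protos, key=lambda c: torch.dist(query, protos[c]))
```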
arXiv Detail & Related papers (2024-09-17T15:13:07Z)
- Learning New Tasks from a Few Examples with Soft-Label Prototypes [18.363177410917597]
We propose a novel few-shot learning approach based on soft-label prototypes (SLPs).
We focus on learning previously unseen NLP tasks from very few examples (4, 8, 16) per class.
We experimentally demonstrate that our approach achieves superior performance on the majority of tested tasks in this data-lean setting.
arXiv Detail & Related papers (2022-10-31T16:06:48Z)
- Few-shot Learning via Dependency Maximization and Instance Discriminant Analysis [21.8311401851523]
We study the few-shot learning problem, where a model learns to recognize new objects with extremely few labeled data per category.
We propose a simple approach to exploit unlabeled data accompanying the few-shot task for improving few-shot performance.
arXiv Detail & Related papers (2021-09-07T02:19:01Z)
- Real-Time Visual Object Tracking via Few-Shot Learning [107.39695680340877]
Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL).
We propose a two-stage framework that is capable of employing a large variety of FSL algorithms while presenting faster adaptation speed.
Experiments on the major benchmarks (VOT2018, OTB2015, NFS, UAV123, TrackingNet, and GOT-10k) demonstrate a desirable performance gain at real-time speed.
arXiv Detail & Related papers (2021-03-18T10:02:03Z)
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
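As a rough illustration of jointly enforcing the two properties, the sketch below uses a common generic recipe: a consistency term that keeps embeddings stable across geometric transforms (invariance) plus an auxiliary head that predicts which transform was applied (equivariance). The specific losses, transform set, and weighting are assumptions, not necessarily the paper's formulation.

```python
# Generic invariance + equivariance objective over geometric transforms.
import torch
import torch.nn as nn
import torch.nn.functional as F

def inv_equiv_losses(encoder, equiv_head, images, transforms):
    """`encoder` maps (B, C, H, W) -> (B, D); `equiv_head` maps
    (B, D) -> (B, len(transforms)) logits over transform indices."""
    base = F.normalize(encoder(images), dim=-1)
    inv_loss, equiv_loss = 0.0, 0.0
    for t_idx, t in enumerate(transforms):
        z = encoder(t(images))
        # Invariance: pull the transformed embedding toward the original.
        inv_loss = inv_loss + (1 - F.cosine_similarity(
            F.normalize(z, dim=-1), base, dim=-1)).mean()
        # Equivariance: recover the transform index from the features.
        target = torch.full((images.size(0),), t_idx,
                            dtype=torch.long, device=images.device)
        equiv_loss = equiv_loss + F.cross_entropy(equiv_head(z), target)
    n = len(transforms)
    return inv_loss / n, equiv_loss / n

# Example transform set: the four 90-degree rotations.
transforms = [lambda x, k=k: torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
```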
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- Contrastive Prototype Learning with Augmented Embeddings for Few-Shot Learning [58.2091760793799]
We propose a novel contrastive prototype learning with augmented embeddings (CPLAE) model.
With a class prototype as an anchor, CPL aims to pull the query samples of the same class closer and those of different classes further away.
Extensive experiments on several benchmarks demonstrate that our proposed CPLAE achieves a new state of the art.
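The anchor-pull-push behavior described above is, in one common reading, a temperature-scaled softmax over query-prototype similarities. The sketch below covers only that contrastive prototype part (the augmented-embeddings component is omitted), and the temperature `tau` is an assumed hyperparameter.

```python
# Prototype-anchored contrastive loss in the spirit of CPL.
import torch
import torch.nn.functional as F

def contrastive_prototype_loss(query_emb, query_labels, prototypes, tau=0.1):
    """Each query is pulled toward its own class prototype and pushed
    away from the others via cross-entropy over scaled cosine
    similarities. query_emb: (B, D), prototypes: (C, D)."""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    logits = q @ p.t() / tau                  # (B, C) cosine / temperature
    return F.cross_entropy(logits, query_labels)
```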
arXiv Detail & Related papers (2021-01-23T13:22:44Z)
- Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances.
We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD.
MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z)
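The object-pyramid idea from the MPSR summary can be sketched in a few lines: crop each ground-truth object and resize the crop to several canonical scales to obtain extra positive samples. The scale set is illustrative, and the refinement branch that scores these positives at matching feature levels is omitted.

```python
# Sketch of multi-scale positive samples ("object pyramids") a la MPSR.
import torch
import torch.nn.functional as F

def object_pyramid(image, box, scales=(32, 64, 128, 256)):
    """image: (1, C, H, W) tensor; box: (x1, y1, x2, y2) ground-truth
    coordinates. Returns the object crop resized to each scale,
    yielding additional positive samples for few-shot detection."""
    x1, y1, x2, y2 = [int(v) for v in box]
    crop = image[:, :, y1:y2, x1:x2]
    return [F.interpolate(crop, size=(s, s), mode="bilinear",
                          align_corners=False) for s in scales]
```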