Prompt as Free Lunch: Enhancing Diversity in Source-Free Cross-domain Few-shot Learning through Semantic-Guided Prompting
- URL: http://arxiv.org/abs/2412.00767v1
- Date: Sun, 01 Dec 2024 11:00:38 GMT
- Title: Prompt as Free Lunch: Enhancing Diversity in Source-Free Cross-domain Few-shot Learning through Semantic-Guided Prompting
- Authors: Linhai Zhuo, Zheng Wang, Yuqian Fu, Tianwen Qian
- Abstract summary: The source-free cross-domain few-shot learning task aims to transfer pretrained models to target domains utilizing minimal samples.
We propose the SeGD-VPT framework, which is divided into two phases.
The first phase aims to increase feature diversity by adding diversity prompts to each support sample, thereby generating varied inputs and enhancing sample diversity.
- Score: 9.116108409344177
- Abstract: The source-free cross-domain few-shot learning (CD-FSL) task aims to transfer pretrained models to target domains utilizing minimal samples, eliminating the need for source domain data. Addressing this issue requires models to have robust generalization abilities and strong feature representation, aligning with the characteristics of large-scale pretrained models. However, large-scale models tend to lose representational ability in cross-domain scenarios due to limited sample diversity. Given the abundant diversity provided by the semantic modality, this paper leverages the textual modality to enhance training sample diversity with the CLIP model, meanwhile improving model transfer efficiency. Specifically, we propose the SeGD-VPT framework, which is divided into two phases. The first phase aims to increase feature diversity by adding diversity prompts to each support sample, thereby generating varied inputs and enhancing sample diversity. Furthermore, we use diverse descriptions of classes to guide semantically meaningful learning of diversity prompts, proposing random combinations and selections of texts to increase textual diversity. Additionally, deep prompt tuning is introduced to enhance the model's transfer capability. After the first phase of training, support samples with different diversity prompts are input into the CLIP backbone to generate enhanced features. The second phase then trains classifiers on the generated features. Extensive experimental results across several benchmarks verify that our method is comparable to SOTA source-utilized models and attains the best performance under the source-free CD-FSL setting.
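The Phase 1 mechanism lends itself to a short sketch. Below is a minimal, self-contained illustration of diversity prompts guided by class-description embeddings; the encoders are stand-in tensors and all module names, shapes, and hyperparameters are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiversityPrompts(nn.Module):
    """K learnable prompts; each turns one support sample into a distinct view."""
    def __init__(self, num_prompts=8, dim=512):
        super().__init__()
        self.prompts = nn.Parameter(0.02 * torch.randn(num_prompts, dim))

    def forward(self, feat):                  # feat: (B, dim) image features
        # Add every prompt to every sample -> (B, K, dim) diversified features.
        return feat.unsqueeze(1) + self.prompts.unsqueeze(0)

def semantic_guidance_loss(div_feats, text_feats):
    """Pull the k-th prompted feature toward the k-th class-description
    embedding (cosine distance), keeping prompts semantically meaningful."""
    div = F.normalize(div_feats, dim=-1)      # (B, K, D)
    txt = F.normalize(text_feats, dim=-1)     # (K, D)
    return (1 - (div * txt.unsqueeze(0)).sum(-1)).mean()

# Stand-ins for frozen CLIP outputs: 4 support features and 8 text embeddings
# built from randomly combined class descriptions (assumed, for illustration).
support_feats = torch.randn(4, 512)
text_feats = torch.randn(8, 512)

prompts = DiversityPrompts(num_prompts=8, dim=512)
loss = semantic_guidance_loss(prompts(support_feats), text_feats)
loss.backward()  # Phase 1 updates the prompts; Phase 2 would train a
                 # classifier on the 4 x 8 generated features.
```

In the full framework the prompted samples pass through a deep-prompt-tuned CLIP backbone, and the enlarged feature set replaces the raw support set for classifier training.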
Related papers
- DEUCE: Dual-diversity Enhancement and Uncertainty-awareness for Cold-start Active Learning [54.35107462768146]
Cold-start active learning (CSAL) selects valuable instances from an unlabeled dataset for manual annotation.
Existing CSAL methods overlook weak classes and hard representative examples, resulting in biased learning.
This paper proposes a novel dual-diversity enhancing and uncertainty-aware framework for CSAL.
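As a rough illustration of pairing diversity with uncertainty in cold-start selection, the sketch below clusters unlabeled features and picks the highest-entropy point per cluster. This is a generic pattern in that spirit, not DEUCE's actual dual-diversity scheme.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_cold_start(features, probs, budget):
    """One high-uncertainty pick per cluster: clusters supply diversity,
    predictive entropy supplies uncertainty (a generic CSAL heuristic)."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    labels = KMeans(n_clusters=budget, n_init=10).fit_predict(features)
    picks = []
    for c in range(budget):
        idx = np.where(labels == c)[0]
        picks.append(int(idx[np.argmax(entropy[idx])]))
    return picks

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 32))            # stand-in embeddings
p = rng.dirichlet(np.ones(5), size=200)       # stand-in class probabilities
print(select_cold_start(feats, p, budget=10))
```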
arXiv Detail & Related papers (2025-02-01T04:00:03Z) - Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models [14.037826400805741]
We propose a novel framework, Synthesize-Partition-Adapt (SPA), that leverages the abundant synthetic data available in many domains to elicit diverse responses from foundation models.
By leveraging signal provided by data attribution methods such as influence functions, SPA partitions data into subsets, each targeting unique aspects of the data, and trains multiple model adaptations optimized for these subsets.
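A minimal sketch of the partition step, assuming a scalar attribution score per synthetic example (SPA uses influence functions; a random proxy stands in here):

```python
import numpy as np

def partition_by_attribution(scores, num_subsets):
    """Split synthetic examples into quantile bins of an attribution score;
    each bin would then fine-tune its own adaptation of the foundation model."""
    edges = np.quantile(scores, np.linspace(0, 1, num_subsets + 1))[1:-1]
    bins = np.digitize(scores, edges)
    return [np.where(bins == k)[0] for k in range(num_subsets)]

scores = np.random.default_rng(1).normal(size=1000)  # proxy attribution scores
subsets = partition_by_attribution(scores, num_subsets=4)
print([len(s) for s in subsets])  # roughly equal-sized, score-homogeneous bins
```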
arXiv Detail & Related papers (2024-11-11T05:13:21Z) - Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density [70.14884528360199]
We introduce an approach to bias deep generative models, such as GANs and diffusion models, towards generating data with enhanced fidelity or increased diversity.
Our approach involves manipulating the distribution of training and generated data through a novel metric for individual samples, named pseudo density.
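The paper defines its own metric, but a kNN-style density proxy conveys the idea: score each sample by how crowded its feature neighborhood is, then reweight training toward high density (fidelity) or low density (diversity). A sketch under that assumption:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def pseudo_density(features, k=10):
    """Per-sample density proxy: inverse mean distance to the k nearest
    neighbors in feature space (a common kNN estimate, not the paper's)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    dists, _ = nn.kneighbors(features)
    return 1.0 / (dists[:, 1:].mean(axis=1) + 1e-12)  # drop self-distance

feats = np.random.default_rng(2).normal(size=(500, 64))
density = pseudo_density(feats)
# Up-weighting high-density samples biases a generator toward fidelity;
# up-weighting low-density samples biases it toward diversity.
weights_fidelity = density / density.sum()
weights_diversity = (1.0 / density) / (1.0 / density).sum()
```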
arXiv Detail & Related papers (2024-07-11T16:46:04Z) - Building a Winning Team: Selecting Source Model Ensembles using a
Submodular Transferability Estimation Approach [20.86345962679122]
Estimating the transferability of publicly available pretrained models to a target task has assumed an important place for transfer learning tasks.
We propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task.
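Submodular selection can be made concrete with a facility-location surrogate: greedily add the model whose affinity to target samples most improves coverage. The affinity matrix below is a random placeholder; OSBORN derives its terms from optimal transport.

```python
import numpy as np

def greedy_facility_location(sim, k):
    """Greedy maximization of the submodular score F(S) = sum_t max_{m in S}
    sim[m, t]: pick k models that jointly 'cover' the target data best."""
    chosen, best_cover = [], np.zeros(sim.shape[1])
    for _ in range(k):
        gains = np.maximum(sim, best_cover).sum(axis=1) - best_cover.sum()
        gains[chosen] = -np.inf                 # no repeats
        m = int(np.argmax(gains))
        chosen.append(m)
        best_cover = np.maximum(best_cover, sim[m])
    return chosen

sim = np.random.default_rng(3).random((12, 100))  # 12 candidate models
print(greedy_facility_location(sim, k=3))
```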
arXiv Detail & Related papers (2023-09-05T17:57:31Z) - Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations [61.132408427908175]
Zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
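One way to read "semantic variations in the CLIP space" is as controlled perturbations of the target text feature that stay close to the original. The sketch below does exactly that; the noise scale and similarity threshold are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn.functional as F

def semantic_variations(target_feat, num_vars=16, noise=0.02, min_sim=0.8):
    """Perturb a target text feature and keep candidates still close to the
    original: diverse but on-semantics adaptation directions."""
    t = F.normalize(target_feat, dim=-1)
    cands = F.normalize(t + noise * torch.randn(num_vars, t.numel()), dim=-1)
    keep = (cands @ t) >= min_sim
    return cands[keep]

target = torch.randn(512)           # stand-in CLIP text feature
variations = semantic_variations(target)
# Each kept variation steers adaptation slightly differently, counteracting
# the diversity collapse of chasing a single representative text feature.
print(variations.shape)
```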
arXiv Detail & Related papers (2023-08-21T08:12:28Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed test sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
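A common form of consistency regularization, sketched below with a toy model: the prediction on a weakly perturbed target image serves as a soft target for a strongly perturbed one. This is a generic recipe in the paper's spirit, not its exact objective.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, weak_view, strong_view):
    """KL between predictions on two views of the same target image; the
    weak view supplies a detached soft target for the strong view."""
    with torch.no_grad():
        p_weak = F.softmax(model(weak_view), dim=1)
    log_p_strong = F.log_softmax(model(strong_view), dim=1)
    return F.kl_div(log_p_strong, p_weak, reduction="batchmean")

model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 10))  # toy classifier
x = torch.randn(8, 3, 32, 32)
loss = consistency_loss(model, x, x + 0.1 * torch.randn_like(x))
loss.backward()
```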
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - UniDiff: Advancing Vision-Language Models with Generative and
Discriminative Learning [86.91893533388628]
This paper presents UniDiff, a unified multi-modal model that integrates image-text contrastive learning (ITC), text-conditioned image synthesis learning (IS), and reciprocal semantic consistency modeling (RSC).
UniDiff demonstrates versatility in both multi-modal understanding and generative tasks.
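Of the three objectives, ITC is the most standard and is sketched below in CLIP style; the IS and RSC terms (diffusion-based synthesis and consistency) are left as placeholders, and the weighting is purely illustrative.

```python
import torch
import torch.nn.functional as F

def itc_loss(img_feats, txt_feats, temperature=0.07):
    """CLIP-style symmetric image-text contrastive loss (the ITC component)."""
    img = F.normalize(img_feats, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    logits = img @ txt.t() / temperature        # positives on the diagonal
    targets = torch.arange(len(logits))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

img_f, txt_f = torch.randn(16, 256), torch.randn(16, 256)  # stand-in features
is_term, rsc_term = torch.tensor(0.0), torch.tensor(0.0)   # placeholders
total = itc_loss(img_f, txt_f) + 1.0 * is_term + 1.0 * rsc_term
```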
arXiv Detail & Related papers (2023-06-01T15:39:38Z) - Diverse Text Generation via Variational Encoder-Decoder Models with
Gaussian Process Priors [21.71928935339393]
We present a novel latent structured variable model to generate high-quality texts.
Specifically, we introduce a function to map deterministic encoder hidden states into random context variables.
To address the learning challenge of Gaussian processes, we propose an efficient variational inference approach.
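The mapping from deterministic hidden states to random context variables can be sketched with the reparameterization trick; note the standard-normal prior below is a simplification of the paper's Gaussian-process prior.

```python
import torch
import torch.nn as nn

class LatentContext(nn.Module):
    """Map a deterministic encoder hidden state to a random context variable.
    A standard-normal prior stands in for the paper's GP prior."""
    def __init__(self, hidden_dim, latent_dim):
        super().__init__()
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, h):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
        return z, kl   # z feeds the decoder; kl joins the ELBO

h = torch.randn(8, 128)                 # stand-in encoder hidden states
z, kl = LatentContext(128, 32)(h)
```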
arXiv Detail & Related papers (2022-04-04T04:09:15Z) - Multi-scale and Cross-scale Contrastive Learning for Semantic
Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
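A reduced sketch of the idea: project features from two scales into a shared space and contrast matched locations. The paper's constraint is supervised by class labels; the InfoNCE stand-in below is a simplified, label-free variant.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cross_scale_info_nce(f_hi, f_lo, proj, temperature=0.1):
    """Align each feature with its counterpart from another scale after
    mapping both into a common space via a shared projection head."""
    a = F.normalize(proj(f_hi), dim=-1)        # (N, D)
    b = F.normalize(proj(f_lo), dim=-1)        # (N, D)
    logits = a @ b.t() / temperature           # positives on the diagonal
    return F.cross_entropy(logits, torch.arange(len(a)))

proj = nn.Linear(256, 128)                     # shared projection head
f_hi = torch.randn(64, 256)                    # e.g., pooled 1/4-scale features
f_lo = torch.randn(64, 256)                    # matching 1/8-scale features
loss = cross_scale_info_nce(f_hi, f_lo, proj)
```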
arXiv Detail & Related papers (2022-03-25T01:24:24Z) - Exploring Complementary Strengths of Invariant and Equivariant
Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
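The two objectives can be sketched with rotations as the transformation set: an invariance term aligns features across a transform, while an equivariance head must recover which transform was applied. The architecture and loss forms below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed components: a toy feature extractor and a head that predicts which
# of 4 transforms was applied (rotations by 0/90/180/270 degrees here).
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
transform_head = nn.Linear(128, 4)

def equi_inv_losses(x):
    k = torch.randint(0, 4, (x.size(0),))           # sampled rotation index
    x_t = torch.stack([torch.rot90(xi, int(ki), dims=(1, 2))
                       for xi, ki in zip(x, k)])
    f, f_t = backbone(x), backbone(x_t)
    inv = 1 - F.cosine_similarity(f, f_t, dim=-1).mean()  # invariance term
    equi = F.cross_entropy(transform_head(f_t), k)        # equivariance term
    return inv, equi

inv, equi = equi_inv_losses(torch.randn(8, 3, 32, 32))
(inv + equi).backward()
```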
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences.