Diversity-Aware Meta Visual Prompting
- URL: http://arxiv.org/abs/2303.08138v1
- Date: Tue, 14 Mar 2023 17:59:59 GMT
- Title: Diversity-Aware Meta Visual Prompting
- Authors: Qidong Huang and Xiaoyi Dong and Dongdong Chen and Weiming Zhang and
Feifei Wang and Gang Hua and Nenghai Yu
- Abstract summary: We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient prompting method for transferring pre-trained models to downstream tasks with a frozen backbone.
We cluster the downstream dataset into small subsets in a diversity-adaptive way, with each subset having its own separately optimized prompt.
All the prompts are initialized with a meta-prompt, which is learned across several datasets.
- Score: 111.75306320834629
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient and
effective prompting method for transferring pre-trained models to downstream
tasks with a frozen backbone. A challenging issue in visual prompting is that
image datasets sometimes have a large data diversity whereas a per-dataset
generic prompt can hardly handle the complex distribution shift toward the
original pretraining data distribution properly. To address this issue, we
propose a dataset Diversity-Aware prompting strategy whose initialization is
realized by a Meta-prompt. Specifically, we cluster the downstream dataset into
small homogeneous subsets in a diversity-adaptive way, with each subset having its
own prompt optimized separately. Such a divide-and-conquer design reduces the
optimization difficulty greatly and significantly boosts the prompting
performance. Furthermore, all the prompts are initialized with a meta-prompt,
which is learned across several datasets. It is a bootstrapped paradigm, with
the key observation that the prompting knowledge learned from previous datasets
could help the prompt to converge faster and perform better on a new dataset.
During inference, we dynamically select a proper prompt for each input, based
on the feature distance between the input and each subset. Through extensive
experiments, our DAM-VP demonstrates superior efficiency and effectiveness,
clearly surpassing previous prompting methods in a series of downstream
datasets for different pretraining models. Our code is available at:
https://github.com/shikiw/DAM-VP.
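The pipeline described in the abstract (cluster the dataset into subsets, give each subset a prompt initialized from a shared meta-prompt, then pick a prompt per input by feature distance) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the greedy threshold clustering, the Euclidean distance, and the names `cluster_by_threshold`, `init_prompts`, and `select_prompt` are all assumptions made for this example, and per-subset prompt optimization is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_by_threshold(features, threshold):
    """Greedy clustering (an assumed stand-in for the paper's method).

    A feature joins the nearest existing cluster if it lies within
    `threshold` of that cluster's prototype (the running mean);
    otherwise it starts a new cluster. More diverse data therefore
    yields more clusters. Returns the list of prototypes.
    """
    members = []     # members[k] holds the features assigned to cluster k
    prototypes = []  # prototypes[k] is the mean of members[k]
    for f in features:
        if prototypes:
            dists = [euclidean(f, p) for p in prototypes]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] <= threshold:
                members[k].append(f)
                n = len(members[k])
                prototypes[k] = [
                    sum(m[d] for m in members[k]) / n for d in range(len(f))
                ]
                continue
        members.append([f])
        prototypes.append(list(f))
    return prototypes


def init_prompts(meta_prompt, num_subsets):
    """Give every subset its own copy of the meta-prompt as a starting point."""
    return [list(meta_prompt) for _ in range(num_subsets)]


def select_prompt(feature, prototypes, prompts):
    """Pick the prompt of the subset whose prototype is nearest to the input."""
    k = min(range(len(prototypes)), key=lambda i: euclidean(feature, prototypes[i]))
    return k, prompts[k]


# Toy 2-D "features"; in practice these would come from the frozen backbone.
features = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
prototypes = cluster_by_threshold(features, threshold=1.0)
prompts = init_prompts(meta_prompt=[0.0, 0.0, 0.0], num_subsets=len(prototypes))
index, prompt = select_prompt([4.8, 5.2], prototypes, prompts)
print(len(prototypes), index)
```

In this sketch the number of prompts is not fixed in advance: it follows from how spread out the features are relative to `threshold`, which is one simple way to make the split depend on dataset diversity.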
Related papers
- Contrastive Transformer Learning with Proximity Data Generation for
Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and a shortage of annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning [47.02160072880698]
We introduce a self-evolving mechanism that allows the model itself to actively sample subsets that are equally or even more effective.
The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets.
Extensive experiments across three datasets and benchmarks demonstrate the effectiveness of DiverseEvol.
arXiv Detail & Related papers (2023-11-14T14:10:40Z)
- Distribution-Aware Prompt Tuning for Vision-Language Models [20.02599087680773]
A key to prompt tuning is the feature space alignment between two modalities via learnable vectors with model parameters fixed.
Inspired by this observation, we proposed distribution-aware prompt tuning (DAPT) for vision-language models.
Our experiments on 11 benchmark datasets demonstrate that our method significantly improves generalizability.
arXiv Detail & Related papers (2023-09-06T23:49:11Z)
- Large Language Model as Attributed Training Data Generator: A Tale of
Diversity and Bias [92.41919689753051]
Large language models (LLMs) have been recently leveraged as training data generators for various natural language processing (NLP) tasks.
We investigate training data generation with diversely attributed prompts, which have the potential to yield diverse and attributed generated data.
We show that attributed prompts outperform simple class-conditional prompts in terms of the resulting model's performance.
arXiv Detail & Related papers (2023-06-28T03:31:31Z)
- Boosted Prompt Ensembles for Large Language Models [38.402161594793775]
Methods such as chain-of-thought prompting and self-consistency have pushed the frontier of language model reasoning performance with no additional training.
We propose a prompt ensembling method for large language models, which uses a small dataset to construct a set of few-shot prompts that together comprise a "boosted prompt ensemble".
We show that this outperforms single-prompt output-space ensembles and bagged prompt-space ensembles on the GSM8k and AQuA datasets.
arXiv Detail & Related papers (2023-04-12T16:47:15Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet so that the prompt texts are formed from more than just the closed-set label words, and thus prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.