Neural Prompt Search
- URL: http://arxiv.org/abs/2206.04673v1
- Date: Thu, 9 Jun 2022 17:59:58 GMT
- Title: Neural Prompt Search
- Authors: Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
- Abstract summary: We propose Neural prOmpt seArcH (NOAH), a novel approach that learns, for large vision models, the optimal design of prompt modules.
NOAH finds this design through a neural architecture search algorithm, specifically for each downstream dataset.
- Score: 38.68910532066619
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The size of vision models has grown exponentially over the last few years,
especially after the emergence of Vision Transformer. This has motivated the
development of parameter-efficient tuning methods, such as learning adapter
layers or visual prompt tokens, which allow a tiny portion of model parameters
to be trained whereas the vast majority obtained from pre-training are frozen.
However, designing a proper tuning method is non-trivial: one might need to try
out a lengthy list of design choices, not to mention that each downstream
dataset often requires custom designs. In this paper, we view the existing
parameter-efficient tuning methods as "prompt modules" and propose Neural
prOmpt seArcH (NOAH), a novel approach that learns, for large vision models,
the optimal design of prompt modules through a neural architecture search
algorithm, specifically for each downstream dataset. By conducting extensive
experiments on over 20 vision datasets, we demonstrate that NOAH (i) is
superior to individual prompt modules, (ii) has a good few-shot learning
ability, and (iii) is domain-generalizable. The code and models are available
at https://github.com/Davidzhangyuanhan/NOAH.
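As a rough illustration of the idea (a minimal sketch, not the official NOAH implementation; the candidate modules, dimensions, and DARTS-style soft weighting below are assumptions), parameter-efficient tuning methods can be wrapped as candidate "prompt modules" whose choice is itself searchable:

```python
# Minimal sketch: treat parameter-efficient tuning methods as candidate
# "prompt modules" attached to a frozen backbone layer, with a searchable
# mixture over candidates in the spirit of neural architecture search.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, dim, bottleneck=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, bottleneck), nn.GELU(),
                                 nn.Linear(bottleneck, dim))

    def forward(self, x):
        return x + self.net(x)


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update."""
    def __init__(self, linear, rank=4):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad = False            # pre-trained weights stay frozen
        self.A = nn.Linear(linear.in_features, rank, bias=False)
        self.B = nn.Linear(rank, linear.out_features, bias=False)
        nn.init.zeros_(self.B.weight)          # start equal to the frozen layer

    def forward(self, x):
        return self.linear(x) + self.B(self.A(x))


class SearchablePromptModule(nn.Module):
    """Soft mixture of candidate prompt modules; the architecture weights
    `alpha` are learned, after which the strongest candidate can be kept."""
    def __init__(self, candidates):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)
        self.alpha = nn.Parameter(torch.zeros(len(candidates)))

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)
        return sum(w * m(x) for w, m in zip(weights, self.candidates))


if __name__ == "__main__":
    dim = 768                                   # e.g., ViT-B hidden size
    frozen_fc = nn.Linear(dim, dim)
    block = SearchablePromptModule([Adapter(dim), LoRALinear(frozen_fc)])
    out = block(torch.randn(2, 197, dim))       # (batch, tokens, dim)
    print(out.shape)
```

During tuning, only the candidate modules and the architecture weights would be trained while the pre-trained backbone stays frozen; the actual search space and algorithm are described in the paper and the repository linked above.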
Related papers
- On the Steganographic Capacity of Selected Learning Models [1.0640226829362012]
We consider the question of the steganographic capacity of learning models.
For a wide range of models, we determine the number of low-order bits that can be overwritten.
Of the models tested, the steganographic capacity ranges from 7.04 KB for our LR experiments to 44.74 MB for InceptionV3.
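For intuition only (this is not the paper's code; the one-byte-per-weight layout below is an arbitrary assumption), overwriting low-order bits of float32 weights could look like this:

```python
# Illustrative sketch: hide a byte payload in the 8 low-order mantissa bits
# of float32 weights and read it back; each weight changes only slightly.
import numpy as np


def embed_payload(weights: np.ndarray, payload: bytes) -> np.ndarray:
    """Overwrite the low-order byte of each float32 weight with payload data."""
    flat = weights.astype(np.float32).ravel().copy()
    data = np.frombuffer(payload, dtype=np.uint8).astype(np.uint32)
    assert data.size <= flat.size, "payload too large for this weight tensor"
    as_int = flat.view(np.uint32)
    as_int[:data.size] = (as_int[:data.size] & ~np.uint32(0xFF)) | data
    return as_int.view(np.float32).reshape(weights.shape)


def extract_payload(weights: np.ndarray, n_bytes: int) -> bytes:
    """Read the hidden bytes back out of the low-order bits."""
    as_int = weights.astype(np.float32).ravel().view(np.uint32)
    return bytes((as_int[:n_bytes] & np.uint32(0xFF)).astype(np.uint8).tolist())


if __name__ == "__main__":
    w = np.random.randn(1000).astype(np.float32)
    secret = b"hidden message"
    stego_w = embed_payload(w, secret)
    assert extract_payload(stego_w, len(secret)) == secret
    print("max weight change:", np.abs(stego_w - w).max())
```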
arXiv Detail & Related papers (2023-08-29T10:41:34Z)
- Visual Tuning [143.43997336384126]
Fine-tuning visual models has been widely shown to deliver promising performance on many downstream visual tasks.
Recent advances can achieve performance superior to fully fine-tuning the whole set of pre-trained parameters.
This survey characterizes a large and thoughtful selection of recent works, providing a systematic and comprehensive overview of work and models.
arXiv Detail & Related papers (2023-05-10T11:26:36Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
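As a loose sketch of what such a representation model could look like (this is not NAR-Former itself; the operator tokenization, pooling, and prediction heads below are placeholders), an architecture can be encoded as a sequence of operator tokens from which holistic attributes are regressed:

```python
# Loose sketch: encode an architecture as a sequence of operator tokens and
# regress holistic attributes such as latency and accuracy from the encoding.
import torch
import torch.nn as nn


class ArchAttributePredictor(nn.Module):
    def __init__(self, n_op_types=16, dim=64, n_heads=4, n_layers=2):
        super().__init__()
        self.op_embed = nn.Embedding(n_op_types, dim)
        layer = nn.TransformerEncoderLayer(dim, n_heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.latency_head = nn.Linear(dim, 1)
        self.accuracy_head = nn.Linear(dim, 1)

    def forward(self, op_ids):                               # op_ids: (batch, n_ops)
        h = self.encoder(self.op_embed(op_ids)).mean(dim=1)  # pool over operators
        return self.latency_head(h), self.accuracy_head(h)


if __name__ == "__main__":
    ops = torch.randint(0, 16, (4, 10))              # 4 architectures, 10 operators each
    latency, accuracy = ArchAttributePredictor()(ops)
    print(latency.shape, accuracy.shape)             # (4, 1) and (4, 1)
```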
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Unified Vision and Language Prompt Learning [86.1530128487077]
We present a systematic study on two representative prompt tuning methods, namely text prompt tuning and visual prompt tuning.
A major finding is that text prompt tuning fails on data with high intra-class visual variances while visual prompt tuning cannot handle low inter-class variances.
To combine the best of both worlds, we propose a simple approach called Unified Prompt Tuning (UPT), which essentially learns a tiny neural network to jointly optimize prompts across different modalities.
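A hedged sketch of that idea (not the UPT authors' code; the shared network and dimensions are guesses at the general shape) is a tiny module that emits both text-side and visual-side prompt tokens from shared parameters:

```python
# Rough sketch: one small shared network produces prompt tokens for both the
# text encoder and the vision encoder, so the prompts are optimized jointly.
import torch
import torch.nn as nn


class UnifiedPromptGenerator(nn.Module):
    def __init__(self, n_prompts=4, text_dim=512, vision_dim=768, hidden=128):
        super().__init__()
        # Shared learnable "seed" prompts and a tiny shared transform.
        self.seed = nn.Parameter(torch.randn(n_prompts, hidden) * 0.02)
        self.shared = nn.Sequential(nn.Linear(hidden, hidden), nn.GELU())
        # Modality-specific heads project into each encoder's token space.
        self.to_text = nn.Linear(hidden, text_dim)
        self.to_vision = nn.Linear(hidden, vision_dim)

    def forward(self):
        h = self.shared(self.seed)
        return self.to_text(h), self.to_vision(h)


if __name__ == "__main__":
    text_prompts, vision_prompts = UnifiedPromptGenerator()()
    # These would be prepended to the inputs of frozen text/image encoders.
    print(text_prompts.shape, vision_prompts.shape)  # (4, 512) and (4, 768)
```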
arXiv Detail & Related papers (2022-10-13T17:50:24Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of prompted losses.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
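As a minimal sketch of the data-construction step only (not the paper's pipeline; the toy models and stand-in losses below are made up), checkpoints can be flattened into parameter vectors paired with the loss they achieve:

```python
# Minimal sketch: flatten saved checkpoints into parameter vectors paired
# with their loss, the kind of dataset a generative model could be trained on.
import torch
import torch.nn as nn


def flatten_checkpoint(state_dict):
    """Concatenate every tensor in a checkpoint into one 1-D parameter vector."""
    return torch.cat([v.flatten() for v in state_dict.values()])


if __name__ == "__main__":
    records = []
    for run in range(3):                            # stand-ins for saved training runs
        model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
        loss = torch.rand(())                       # stand-in for the run's final loss
        records.append({"theta": flatten_checkpoint(model.state_dict()),
                        "loss": loss})
    print(len(records), records[0]["theta"].shape)  # 3 records, one flat vector each
```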
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Pro-tuning: Unified Prompt Tuning for Vision Tasks [133.12978197265596]
Fine-tuning is the de facto approach to leveraging pre-trained vision models to perform downstream tasks.
In this work, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt frozen vision models to various downstream vision tasks.
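A simplified sketch of prompt tuning on a frozen vision backbone (not the Pro-tuning implementation; the token counts, pooling, and classification head below are illustrative): only a few prompt tokens and a lightweight head are trained per downstream task.

```python
# Simplified sketch: prepend learnable prompt tokens to the input of a frozen
# transformer backbone and train only the prompts plus a task-specific head.
import torch
import torch.nn as nn


class PromptedFrozenEncoder(nn.Module):
    def __init__(self, encoder, dim=768, n_prompts=10, n_classes=100):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False                  # backbone stays frozen
        self.prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        self.head = nn.Linear(dim, n_classes)        # task-specific head

    def forward(self, tokens):                       # tokens: (batch, n_tokens, dim)
        prompts = self.prompts.unsqueeze(0).expand(tokens.size(0), -1, -1)
        h = self.encoder(torch.cat([prompts, tokens], dim=1))
        return self.head(h.mean(dim=1))


if __name__ == "__main__":
    layer = nn.TransformerEncoderLayer(768, 8, 3072, batch_first=True)
    backbone = nn.TransformerEncoder(layer, num_layers=2)
    model = PromptedFrozenEncoder(backbone)
    logits = model(torch.randn(2, 196, 768))         # 2 images of 196 patch tokens
    print(logits.shape)                              # (2, 100)
```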
arXiv Detail & Related papers (2022-07-28T21:09:31Z)
- Re-parameterizing Your Optimizers rather than Architectures [119.08740698936633]
We propose a novel paradigm of incorporating model-specific prior knowledge into optimizers and using them to train generic (simple) models.
As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters.
For the model, we focus on a VGG-style plain architecture and showcase that such a simple model trained with a re-parameterized optimizer, referred to as RepOpt-VGG, performs on par with recent well-designed models.
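A minimal sketch of that gradient-modification idea (not the authors' RepOptimizers code; the scaling constants below are placeholders for the model-specific hyper-parameters): rescale each parameter's gradient by a fixed multiplier before the optimizer update.

```python
# Minimal sketch: fold prior knowledge into training by multiplying each
# named parameter's gradient by a model-specific constant before stepping.
import torch
import torch.nn as nn


def gradient_reparam_step(model, scales, optimizer):
    """Scale selected gradients by per-parameter constants, then step."""
    for name, p in model.named_parameters():
        if p.grad is not None and name in scales:
            p.grad.mul_(scales[name])
    optimizer.step()


if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
    scales = {"0.weight": 2.0, "0.bias": 2.0}   # hypothetical hyper-parameters
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

    x, y = torch.randn(32, 8), torch.randn(32, 1)
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    gradient_reparam_step(model, scales, opt)
    print("loss before update:", loss.item())
```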
arXiv Detail & Related papers (2022-05-30T16:55:59Z)
- Towards Disentangling Information Paths with Coded ResNeXt [11.884259630414515]
We take a novel approach to enhance the transparency of the function of the whole network.
We propose a neural network architecture for classification, in which the information that is relevant to each class flows through specific paths.
arXiv Detail & Related papers (2022-02-10T21:45:49Z)
- Tidying Deep Saliency Prediction Architectures [6.613005108411055]
In this paper, we identify four key components of saliency models, i.e., input features, multi-level integration, readout architecture, and loss functions.
We propose two novel end-to-end architectures, SimpleNet and MDNSal, which are neater, more minimal, and more interpretable, and achieve state-of-the-art performance on public saliency benchmarks.
arXiv Detail & Related papers (2020-03-10T19:34:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.