BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
- URL: http://arxiv.org/abs/2303.14773v2
- Date: Sat, 8 Jul 2023 12:13:50 GMT
- Title: BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
- Authors: Changdae Oh, Hyeji Hwang, Hee-young Lee, YongTaek Lim, Geunyoung Jung,
Jiyoung Jung, Hosik Choi, Kyungwoo Song
- Abstract summary: Experiments on 16 datasets demonstrate that BlackVIP enables robust adaptation to diverse domains without accessing PTMs' parameters, with minimal memory requirements.
- Score: 10.351343954359677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the surge of large-scale pre-trained models (PTMs), fine-tuning these
models for numerous downstream tasks has become a crucial problem. Consequently,
parameter-efficient transfer learning (PETL) of large models has attracted considerable
attention. While recent PETL methods showcase impressive performance, they rely
on two optimistic assumptions: 1) the entire parameter set of a PTM is available,
and 2) sufficient memory capacity for fine-tuning is available.
However, in most real-world applications, PTMs are served as black-box APIs or
proprietary software without explicit parameter access. Moreover, the large memory
requirements of modern PTMs are hard to meet. In this work, we
propose black-box visual prompting (BlackVIP), which efficiently adapts
PTMs without knowledge of their architectures or parameters. BlackVIP has
two components: 1) the Coordinator and 2) simultaneous perturbation stochastic
approximation with gradient correction (SPSA-GC). The Coordinator designs
input-dependent, image-shaped visual prompts, which improve few-shot adaptation
and robustness under distribution/location shift. SPSA-GC efficiently estimates
the gradient of the target model to update the Coordinator. Extensive experiments on
16 datasets demonstrate that BlackVIP enables robust adaptation to diverse
domains without accessing PTMs' parameters, with minimal memory requirements.
Code: https://github.com/changdaeoh/BlackVIP
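Since the abstract only names the two components, here is a minimal, hypothetical sketch of how they could fit together: a Coordinator that turns a frozen encoder's features into an image-shaped prompt, and an SPSA step with a Nesterov-style gradient correction. The module shapes, decoder architecture, and step sizes (`a`, `c`, `beta`) are illustrative assumptions rather than the paper's exact design, and SPSA's decaying step-size schedules are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Coordinator(nn.Module):
    """Input-dependent, image-shaped prompt generator (illustrative):
    a frozen feature extractor plus a tiny trainable decoder."""
    def __init__(self, encoder, feat_dim=384, patch=32):
        super().__init__()
        self.encoder = encoder.eval()              # frozen, e.g. a self-supervised ViT
        for p in self.encoder.parameters():
            p.requires_grad_(False)
        self.patch = patch
        self.decoder = nn.Sequential(              # the only trainable part
            nn.Linear(feat_dim, 3 * patch * patch), nn.Tanh())

    def forward(self, x):                          # x: (B, 3, H, W) in [0, 1]
        prompt = self.decoder(self.encoder(x)).view(-1, 3, self.patch, self.patch)
        prompt = F.interpolate(prompt, size=x.shape[-2:], mode="bilinear")
        return (x + prompt).clamp(0.0, 1.0)        # prompted image sent to the API

@torch.no_grad()
def spsa_gc_step(phi, m, loss_at, a=0.01, c=0.01, beta=0.9):
    """One SPSA step with a Nesterov-style correction. `loss_at(phi)` must
    query the black-box PTM and return a scalar loss; no backprop occurs."""
    look_ahead = phi + beta * m                    # evaluate at the look-ahead point
    delta = torch.randint(0, 2, phi.shape).float() * 2 - 1   # Rademacher +/-1
    g_hat = (loss_at(look_ahead + c * delta)
             - loss_at(look_ahead - c * delta)) / (2 * c) * delta
    m = beta * m - a * g_hat                       # gradient-corrected momentum
    return phi + m, m
```

In training, `phi` would hold the flattened decoder weights, copied back into the Coordinator before each API query; only loss values from the black-box PTM are ever used.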
Related papers
- Robust Adaptation of Foundation Models with Black-Box Visual Prompting [18.192496572620424]
This work proposes black-box visual prompting (BlackVIP) to efficiently adapt large-scale pre-trained models (PTMs).
BlackVIP has two components: 1) Coordinator and 2) simultaneous perturbation stochastic approximation with gradient correction (SPSA-GC).
Experiments on 19 datasets demonstrate that BlackVIP enables robust adaptation to diverse domains and tasks with minimal memory requirements.
arXiv Detail & Related papers (2024-07-04T02:35:00Z)
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision [52.80792724919329]
We introduce a novel framework named Adapter-X to improve fine-tuning in 2D image and 3D point cloud modalities.
It is the first to outperform full fine-tuning in both 2D image and 3D point cloud modalities with significantly fewer parameters, i.e., only 0.20% and 1.88% of the original trainable parameters for 2D and 3D classification tasks, respectively.
arXiv Detail & Related papers (2024-06-05T08:26:44Z)
- BitDelta: Your Fine-Tune May Only Be Worth One Bit [57.558376557639555]
Large Language Models (LLMs) are typically trained in two phases: pre-training on large internet-scale datasets, and fine-tuning for downstream tasks.
We introduce a simple method, BitDelta, which successfully quantizes the weight delta between a fine-tuned model and its base model down to 1 bit without compromising performance.
By enabling the use of a single high-precision base model accompanied by multiple 1-bit deltas, BitDelta dramatically reduces GPU memory requirements by more than 10x.
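To make the compression step concrete, here is a minimal sketch of the 1-bit idea under a simple assumption: each tensor's delta is replaced by its sign times a per-tensor scale initialized to the mean absolute delta. BitDelta additionally calibrates these scales by distillation, which the sketch omits.

```python
import torch

def quantize_delta(w_base, w_fine):
    """Compress one weight tensor's fine-tuning delta to 1 bit + a scale."""
    delta = w_fine - w_base
    sign = torch.sign(delta)          # 1 bit of information per weight
    scale = delta.abs().mean()        # simple per-tensor scale initialization
    return sign, scale

def apply_delta(w_base, sign, scale):
    # Serve many fine-tunes from one shared high-precision base model
    return w_base + scale * sign
```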
arXiv Detail & Related papers (2024-02-15T18:50:06Z)
- Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation [71.21346469382821]
We introduce collaborative black-box tuning (CBBT), which performs both textual prompt optimization and output feature adaptation for black-box models.
CBBT is extensively evaluated on eleven downstream benchmarks and achieves remarkable improvements over existing black-box VL adaptation methods.
arXiv Detail & Related papers (2023-12-26T06:31:28Z)
- Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models [62.838689691468666]
We propose Federated Black-Box Prompt Tuning (Fed-BBPT) to optimally harness each local dataset.
Fed-BBPT relies on a central server that helps local users collaboratively train a prompt generator through regular aggregation.
Compared with extensive fine-tuning, Fed-BBPT sidesteps the memory challenges of storing and fine-tuning PTMs on local machines.
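The aggregation step described above can be as simple as federated averaging of the clients' prompt-generator weights; the helper below is a hypothetical sketch, not Fed-BBPT's exact protocol.

```python
def fedavg(client_states):
    """Average client prompt-generator state dicts elementwise.
    Each client trains locally, querying its PTM only as a black box."""
    keys = client_states[0].keys()
    return {k: sum(s[k] for s in client_states) / len(client_states)
            for k in keys}
```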
arXiv Detail & Related papers (2023-10-04T19:30:49Z)
- DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning [14.975436239088312]
We propose DePT, which decomposes the soft prompt into a shorter soft prompt and a pair of low-rank matrices that are then optimized with two different learning rates.
We demonstrate that DePT outperforms state-of-the-art PEFT approaches, including the full fine-tuning baseline, in some scenarios.
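A minimal sketch of this decomposition, assuming the low-rank pair updates the frozen word embeddings while the shorter prompt is prepended; the dimensions and learning-rate values are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class DecomposedPrompt(nn.Module):
    """Shorter soft prompt + low-rank offset to frozen word embeddings."""
    def __init__(self, d_model=768, short_len=40, seq_len=256, rank=8):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(short_len, d_model) * 0.02)
        self.A = nn.Parameter(torch.randn(seq_len, rank) * 0.02)  # low-rank pair
        self.B = nn.Parameter(torch.zeros(rank, d_model))

    def forward(self, word_embeds):              # (B, seq_len, d_model), frozen
        updated = word_embeds + self.A @ self.B  # low-rank embedding offset
        prompt = self.prompt.unsqueeze(0).expand(word_embeds.size(0), -1, -1)
        return torch.cat([prompt, updated], dim=1)

# Two learning rates for the two parts, as described above (values illustrative)
mod = DecomposedPrompt()
optimizer = torch.optim.AdamW([
    {"params": [mod.prompt], "lr": 3e-1},
    {"params": [mod.A, mod.B], "lr": 3e-4},
])
```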
arXiv Detail & Related papers (2023-09-11T00:02:05Z)
- Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer.
arXiv Detail & Related papers (2023-03-13T17:59:02Z)
- Black-Box Tuning for Language-Model-as-a-Service [85.2210372920386]
This work proposes Black-Box Tuning (BBT) to optimize PTMs through derivative-free algorithms.
In particular, we invoke CMA-ES to optimize the continuous prompt prepended to the input text by iteratively calling PTM inference APIs.
Our experimental results demonstrate that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3's in-context learning, but also surpasses gradient-based counterparts.
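A minimal sketch of this loop using the `cma` package: CMA-ES searches a low-dimensional subspace that a fixed random matrix projects up to the prompt space, so each candidate costs only API calls, never gradients. The dimensions and the stub loss are illustrative placeholders.

```python
import numpy as np
import cma  # pip install cma

PROMPT_LEN, HIDDEN, D_LOW = 50, 1024, 500          # illustrative sizes
A = np.random.randn(PROMPT_LEN * HIDDEN, D_LOW) / np.sqrt(D_LOW)  # fixed projection

def api_loss(z):
    """Placeholder: prepend the prompt A @ z to the input via the PTM's
    inference API and return the loss on a few labeled samples."""
    prompt = (A @ z).reshape(PROMPT_LEN, HIDDEN)
    return float(np.linalg.norm(prompt))            # stub so the sketch runs

es = cma.CMAEvolutionStrategy(D_LOW * [0.0], 0.5, {"popsize": 20, "maxiter": 50})
while not es.stop():
    candidates = es.ask()                           # sample low-dim prompts
    es.tell(candidates, [api_loss(z) for z in candidates])
```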
arXiv Detail & Related papers (2022-01-10T18:17:05Z)