Black-Box Tuning for Language-Model-as-a-Service
- URL: http://arxiv.org/abs/2201.03514v1
- Date: Mon, 10 Jan 2022 18:17:05 GMT
- Title: Black-Box Tuning for Language-Model-as-a-Service
- Authors: Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Abstract summary: This work proposes the Black-Box Tuning framework to optimize PTMs through derivative-free algorithms.
In particular, we invoke CMA-ES to optimize the continuous prompt prepended to the input text by iteratively calling PTM inference APIs.
Our experimental results demonstrate that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3's in-context learning, but also surpasses the gradient-based counterparts.
- Score: 85.2210372920386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service, allowing users to design task-specific prompts to query the PTMs through some black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), gradients of the PTMs are usually not available. Can we optimize the task prompts by only accessing the model inference APIs? Based on recent observations that large PTMs have a very low intrinsic dimensionality, this work proposes the Black-Box Tuning framework to optimize PTMs through derivative-free algorithms. In particular, we invoke CMA-ES to optimize the continuous prompt prepended to the input text by iteratively calling PTM inference APIs. Our experimental results demonstrate that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3's in-context learning, but also surpasses the gradient-based counterparts, namely prompt tuning and full model tuning.
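To make the procedure concrete, below is a minimal sketch of the optimization loop the abstract describes: a low-dimensional vector is projected into the continuous prompt space through a fixed random matrix (exploiting the low intrinsic dimensionality) and refined with CMA-ES using only loss values returned by the inference API. It uses the open-source `cma` (pycma) package; the dimensions, population size, and the `query_ptm_loss` call are illustrative assumptions, not the paper's released implementation.

```python
# Sketch of black-box prompt tuning with CMA-ES, assuming a PTM inference API
# that returns a loss for a given continuous prompt on a few labeled samples.
import numpy as np
import cma

D_PROMPT = 50 * 1024   # assumed: 50 prompt tokens x 1024-dim embeddings
D_INTRINSIC = 500      # assumed low-dimensional subspace searched by CMA-ES

# Fixed random projection from the low-dimensional subspace to prompt space.
rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(D_PROMPT, D_INTRINSIC))

def query_ptm_loss(prompt_embedding: np.ndarray) -> float:
    """Hypothetical black-box call: send the continuous prompt plus a few
    labeled samples to the PTM inference API and return a scalar loss."""
    raise NotImplementedError("replace with a real LMaaS inference call")

def objective(z: np.ndarray) -> float:
    # Project the candidate z into the prompt embedding space and score it
    # with a single forward pass through the black-box API.
    return query_ptm_loss(A @ z)

# CMA-ES over the low-dimensional vector z; only function values are used,
# so no gradients of the PTM are ever required.
es = cma.CMAEvolutionStrategy(np.zeros(D_INTRINSIC), 1.0, {"popsize": 20})
while not es.stop():
    candidates = es.ask()                        # sample a population of z's
    losses = [objective(z) for z in candidates]  # one API call per candidate
    es.tell(candidates, losses)                  # update the search distribution
best_prompt = A @ es.result.xbest                # final continuous prompt
```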
Related papers
- Robust Adaptation of Foundation Models with Black-Box Visual Prompting [18.192496572620424]
This work proposes black-box visual prompting (BlackVIP) to efficiently adapt large-scale pre-trained models (PTMs).
BlackVIP has two components: 1) Coordinator and 2) simultaneous perturbation stochastic approximation with gradient correction (SPSA-GC).
Experiments on 19 datasets demonstrate that BlackVIP enables robust adaptation to diverse domains and tasks with minimal memory requirements.
arXiv Detail & Related papers (2024-07-04T02:35:00Z) - CPT: Consistent Proxy Tuning for Black-box Optimization [63.06335358432746]
Proxy-tuning provides a test-time output adjustment for tuning black-box language models.
We introduce Consistent Proxy Tuning (CPT), a simple yet effective black-box tuning method.
CPT exploits the frozen large black-box model and another frozen small white-box model, ensuring consistency between training-stage optimization objective and test-time proxies.
arXiv Detail & Related papers (2024-07-01T10:23:14Z) - Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation [71.21346469382821]
We introduce collaborative black-box tuning (CBBT) for both textual prompt optimization and output feature adaptation for black-box models.
CBBT is extensively evaluated on eleven downstream benchmarks and achieves remarkable improvements compared to existing black-box VL adaptation methods.
arXiv Detail & Related papers (2023-12-26T06:31:28Z) - Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation [42.05617728412819]
We show how to optimize few-shot text classification without accessing the gradients of the large-scale language models.
Our approach, dubbed BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners.
arXiv Detail & Related papers (2023-05-23T07:54:34Z) - Make Prompt-based Black-Box Tuning Colorful: Boosting Model Generalization from Three Orthogonal Perspectives [28.138689389803034]
Large language models (LLMs) have shown increasing power on various natural language processing (NLP) tasks, but their gradients and hidden representations are often inaccessible when the models are served through APIs.
Black-box tuning has been proposed to address this problem by optimizing task-specific prompts without accessing the gradients and hidden representations.
We describe BBT-RGB, a suite of straightforward and complementary techniques for enhancing the efficiency and performance of black-box optimization.
arXiv Detail & Related papers (2023-05-14T07:33:59Z) - Decoder Tuning: Efficient Language Understanding as Decoding [84.68266271483022]
We present Decoder Tuning (DecT), which in contrast optimizes task-specific decoder networks on the output side.
By gradient-based optimization, DecT can be trained within several seconds and requires only one PTM query per sample.
We conduct extensive natural language understanding experiments and show that DecT significantly outperforms state-of-the-art algorithms with a $200\times$ speed-up.
arXiv Detail & Related papers (2022-12-16T11:15:39Z) - BBTv2: Pure Black-Box Optimization Can Be Comparable to Gradient Descent for Few-Shot Learning [83.26610968655815]
Black-Box Tuning is a derivative-free approach to optimize continuous prompt tokens prepended to the input of language models.
We present BBTv2, a pure black-box optimization approach that can drive language models to achieve comparable results to gradient-based optimization.
arXiv Detail & Related papers (2022-05-23T11:10:19Z) - Prompt Tuning for Discriminative Pre-trained Language Models [96.04765512463415]
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.
It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned.
We present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem.
arXiv Detail & Related papers (2022-05-23T10:11:50Z)