Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation
- URL: http://arxiv.org/abs/2305.13785v2
- Date: Fri, 20 Oct 2023 08:44:21 GMT
- Title: Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation
- Authors: Danqing Luo, Chen Zhang, Jiahui Xu, Bin Wang, Yiming Chen, Yan Zhang,
Haizhou Li
- Abstract summary: We show how to optimize few-shot text classification without accessing the gradients of the large-scale language models.
Our approach, dubbed BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners.
- Score: 42.05617728412819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training or finetuning large-scale language models (LLMs) such as GPT-3
requires substantial computational resources, motivating recent efforts to
explore parameter-efficient adaptation to downstream tasks. One practical area
of research is to treat these models as black boxes and interact with them
through their inference APIs. In this paper, we investigate how to optimize
few-shot text classification without accessing the gradients of the LLMs. To
achieve this, we treat the black-box model as a feature extractor and train a
classifier with the augmented text data. Data augmentation is performed using
prompt-based finetuning on an auxiliary language model with a much smaller
parameter size than the black-box model. Through extensive experiments on eight
text classification datasets, we show that our approach, dubbed BT-Classifier,
significantly outperforms state-of-the-art black-box few-shot learners and
performs on par with methods that rely on full-model tuning.
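As a concrete illustration, here is a minimal, runnable sketch of the workflow the abstract describes: the black-box model serves only as a frozen feature extractor, augmented text enlarges the few-shot training set, and only a lightweight classifier head is trained, so no LLM gradients are needed. The `black_box_embed` and `augment` functions are illustrative stand-ins, not BT-Classifier's actual components.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def black_box_embed(texts, dim=64):
    # Stand-in for a black-box LLM inference API that returns one
    # feature vector per text; a deterministic stub so the sketch runs.
    seeds = [abs(hash(t)) % (2**32) for t in texts]
    return np.stack([np.random.RandomState(s).randn(dim) for s in seeds])

def augment(texts, labels, n_new=2):
    # Placeholder for prompt-based augmentation with a smaller,
    # tunable auxiliary LM; here we just emit trivial surface variants.
    new_texts, new_labels = [], []
    for t, y in zip(texts, labels):
        for i in range(n_new):
            new_texts.append(f"{t} (variant {i})")
            new_labels.append(y)
    return new_texts, new_labels

seed_texts = ["a gripping, well-acted film", "dull and far too long"]
seed_labels = [1, 0]
aug_texts, aug_labels = augment(seed_texts, seed_labels)

# The black-box model is used purely as a frozen feature extractor.
X = black_box_embed(seed_texts + aug_texts)
y = np.array(seed_labels + aug_labels)

# Only this small head is trained; the LLM itself is never updated.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(black_box_embed(["an absolute delight"])))
```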
Related papers
- Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning [50.26965628047682]
Adapting pre-trained models to open classes is a challenging problem in machine learning.
In this paper, we consider combining the advantages of both and come up with a test-time prompt tuning approach.
Our proposed method outperforms all comparison methods on average across both base and new classes.
arXiv Detail & Related papers (2024-08-29T12:34:01Z)
- CrossTune: Black-Box Few-Shot Classification with Label Enhancement [40.88968135459357]
We introduce a label-enhanced cross-attention network called CrossTune to study black-box language model adaptation without prompt search.
Our proposed approach outperforms the previous state-of-the-art gradient-free black-box tuning method by 5.7% on average.
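A hedged sketch of the central idea, label-enhanced cross-attention, where text features attend over label-description embeddings; the module shape is an assumption, not CrossTune's exact design.

```python
import torch
import torch.nn as nn

class LabelCrossAttention(nn.Module):
    # Queries come from the input text, keys/values from label
    # descriptions, so each token is contextualized by the label set.
    def __init__(self, dim=256, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, text_feats, label_embs):
        # text_feats: (batch, seq_len, dim); label_embs: (batch, n_labels, dim)
        out, _ = self.attn(text_feats, label_embs, label_embs)
        return out
```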
arXiv Detail & Related papers (2024-03-19T05:52:56Z)
- Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning [13.211063836237468]
We introduce Model augmented fine-tuning (Mafin) -- a novel approach for fine-tuning a black-box embedding model by augmenting it with a trainable embedding model.
Our results demonstrate that Mafin significantly enhances the performance of the black-box embeddings by only requiring the training of a small augmented model.
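The combination scheme below is one plausible reading of "augmenting a black-box embedding with a trainable model": frozen API embeddings are concatenated with the output of a small tunable encoder and projected. Mafin's exact rule may differ.

```python
import torch
import torch.nn as nn

class MafinStyleEmbedder(nn.Module):
    def __init__(self, black_box_dim=1536, small_dim=384, out_dim=512):
        super().__init__()
        # Small trainable encoder (a linear stand-in for a real model).
        self.small_encoder = nn.Linear(black_box_dim, small_dim)
        self.proj = nn.Linear(black_box_dim + small_dim, out_dim)

    def forward(self, black_box_emb):
        frozen = black_box_emb.detach()  # API embedding gets no gradients
        small = self.small_encoder(frozen)
        return self.proj(torch.cat([frozen, small], dim=-1))
```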
arXiv Detail & Related papers (2024-02-19T14:33:24Z)
- Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models [121.0693322732454]
This paper proposes CraFT, an approach for fine-tuning black-box vision-language models on downstream tasks.
CraFT comprises two modules, a prompt generation module for learning text prompts and a prediction refinement module for enhancing output predictions in residual style.
Experiments on few-shot classification over 15 datasets demonstrate the superiority of CraFT.
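The prediction-refinement module invites a residual-style sketch: black-box logits plus a small learned correction. The exact architecture is an assumption; see the paper for CraFT's actual modules.

```python
import torch
import torch.nn as nn

class ResidualRefiner(nn.Module):
    def __init__(self, num_classes, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, black_box_logits):
        # Residual style: refined = original + learned correction.
        return black_box_logits + self.mlp(black_box_logits)
```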
arXiv Detail & Related papers (2024-02-06T14:53:19Z)
- Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning [13.964106147449051]
Existing solutions concentrate on fine-tuning the pre-trained models on conventional image datasets.
We propose a novel and effective framework based on learning Visual Prompts (VPT) in pre-trained Vision Transformers (ViT).
We demonstrate that our proxies, enriched with semantic information, offer superior representative capability.
arXiv Detail & Related papers (2024-02-04T04:42:05Z)
- Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation [71.21346469382821]
We introduce collaborative black-box tuning (CBBT) for both textual prompt optimization and output feature adaptation for black-box models.
CBBT is extensively evaluated on eleven downstream benchmarks and achieves remarkable improvements compared to existing black-box VL adaptation methods.
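One standard way to obtain "effective gradient approximation" for a black-box objective is a zeroth-order estimator such as SPSA; the sketch below is illustrative and not necessarily the estimator CBBT uses.

```python
import numpy as np

def spsa_gradient(loss_fn, prompt_vec, c=1e-2, n_samples=4):
    # Simultaneous-perturbation estimate: probe the black-box loss at
    # prompt_vec +/- c*delta and average the finite-difference signals.
    grad = np.zeros_like(prompt_vec)
    for _ in range(n_samples):
        delta = np.random.choice([-1.0, 1.0], size=prompt_vec.shape)
        diff = loss_fn(prompt_vec + c * delta) - loss_fn(prompt_vec - c * delta)
        grad += diff / (2 * c) * delta  # for +/-1 entries, 1/delta == delta
    return grad / n_samples
```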
arXiv Detail & Related papers (2023-12-26T06:31:28Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning [7.543506531838883]
This paper proposes LM-CPPF, Contrastive Paraphrasing-guided Prompt-based Fine-tuning of Language Models.
Our experiments on multiple text classification benchmarks show that this augmentation method outperforms other methods.
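In the spirit of contrastive paraphrasing-guided fine-tuning, a SimCLR-style loss over text/paraphrase pairs might look as follows; LM-CPPF's actual objective may differ.

```python
import torch
import torch.nn.functional as F

def paraphrase_contrastive_loss(z_text, z_para, temperature=0.1):
    # z_text[i] and z_para[i] encode an example and its paraphrase;
    # matching pairs are positives, all other rows are negatives.
    z1 = F.normalize(z_text, dim=-1)
    z2 = F.normalize(z_para, dim=-1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0))  # positives on the diagonal
    return F.cross_entropy(logits, targets)
```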
arXiv Detail & Related papers (2023-05-29T15:59:51Z)
- Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x speed-up in inference while retaining comparable performance.
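The underlying kNN-LM recipe interpolates the base model's next-token distribution with one induced by nearest neighbors in the datastore; the brute-force search and variable names below are illustrative.

```python
import numpy as np

def knn_lm_step(context_vec, keys, next_tokens, model_probs,
                k=8, lam=0.25, temperature=1.0):
    # keys: (N, d) stored context vectors; next_tokens: (N,) token ids.
    dists = np.linalg.norm(keys - context_vec, axis=1)
    nn_idx = np.argsort(dists)[:k]
    # Softmax over negative distances of the k nearest neighbors.
    w = np.exp(-dists[nn_idx] / temperature)
    w /= w.sum()
    knn_probs = np.zeros_like(model_probs)
    for weight, token in zip(w, next_tokens[nn_idx]):
        knn_probs[token] += weight
    # Interpolate the two predictive distributions.
    return lam * knn_probs + (1 - lam) * model_probs
```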
arXiv Detail & Related papers (2021-09-09T12:32:28Z)