LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive
Prompt-Based Few-Shot Fine-Tuning
- URL: http://arxiv.org/abs/2305.18169v3
- Date: Wed, 5 Jul 2023 09:15:55 GMT
- Title: LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive
Prompt-Based Few-Shot Fine-Tuning
- Authors: Amirhossein Abaskohi, Sascha Rothe, Yadollah Yaghoobzadeh
- Abstract summary: This paper proposes LM-CPPF, Contrastive Paraphrasing-guided Prompt-based Fine-tuning of Language Models.
Our experiments on multiple text classification benchmarks show that this augmentation method outperforms other methods.
- Score: 7.543506531838883
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, there has been significant progress in developing
pre-trained language models for NLP. However, these models often struggle when
fine-tuned on small datasets. To address this issue, researchers have proposed
various adaptation approaches. Prompt-based tuning is arguably the most common
way, especially for larger models. Previous research shows that adding
contrastive learning to prompt-based fine-tuning is effective as it helps the
model generate embeddings that are more distinguishable between classes, and it
can also be more sample-efficient as the model learns from positive and
negative examples simultaneously. One of the most important components of
contrastive learning is data augmentation, but unlike computer vision,
effective data augmentation for NLP is still challenging. This paper proposes
LM-CPPF, Contrastive Paraphrasing-guided Prompt-based Fine-tuning of Language
Models, which leverages prompt-based few-shot paraphrasing using generative
language models, especially large language models such as GPT-3 and OPT-175B,
for data augmentation. Our experiments on multiple text classification
benchmarks show that this augmentation method outperforms other methods, such
as easy data augmentation, back translation, and multiple templates.
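To make the recipe concrete, the following is a minimal, illustrative sketch (not the authors' released code) of the two ingredients the abstract describes: a few-shot paraphrasing prompt for a generative language model that produces augmented copies of a training sentence, and a supervised contrastive loss that pulls together embeddings of examples sharing a label (including their paraphrases) and pushes apart the rest. The prompt wording, temperature, and toy data below are assumptions made for illustration only.

```python
# Sketch of paraphrasing-guided augmentation + supervised contrastive loss.
# Assumptions: prompt format, temperature, and toy inputs are illustrative,
# not the exact setup used in LM-CPPF.

import torch
import torch.nn.functional as F


def build_paraphrase_prompt(sentence, demonstrations):
    """Compose a few-shot prompt asking a generative LM (e.g., GPT-3/OPT) to paraphrase."""
    lines = ["Paraphrase the sentence while preserving its meaning.", ""]
    for src, para in demonstrations:  # a handful of (original, paraphrase) pairs
        lines += [f"Sentence: {src}", f"Paraphrase: {para}", ""]
    lines += [f"Sentence: {sentence}", "Paraphrase:"]
    return "\n".join(lines)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """SupCon-style loss over prompt-based embeddings: examples with the same
    class label (including paraphrase-augmented copies) act as positives."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature                           # pairwise similarities
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    pos_mask.fill_diagonal_(0)                            # exclude self-pairs
    logits_mask = torch.ones_like(pos_mask).fill_diagonal_(0)
    exp_sim = torch.exp(sim) * logits_mask
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)
    mean_log_prob_pos = (pos_mask * log_prob).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)
    return -mean_log_prob_pos.mean()


if __name__ == "__main__":
    demos = [("the movie was great", "the film was excellent")]
    print(build_paraphrase_prompt("the service was terrible", demos))

    # Toy check: 4 originals + 4 paraphrases sharing the same labels.
    emb = torch.randn(8, 16)
    lab = torch.tensor([0, 1, 2, 3, 0, 1, 2, 3])
    print(supervised_contrastive_loss(emb, lab))
```

In practice, the prompt returned by build_paraphrase_prompt would be sent to a generative LM, and the resulting paraphrases would be verbalized with the same classification template as the originals before computing the contrastive loss.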
Related papers
- CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning [4.004641316826348]
We introduce a novel language-image Contrastive Learning method with an Efficient large language model and prompt Fine-Tuning (CLEFT).
Our method demonstrates state-of-the-art performance on multiple chest X-ray and mammography datasets.
The proposed parameter-efficient framework reduces the total trainable model size by 39% and shrinks the trainable language model to only 4% of the size of the current BERT encoder.
arXiv Detail & Related papers (2024-07-30T17:57:32Z) - Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z) - Joint Adaptive Representations for Image-Language Learning [59.40890927221377]
We propose a recipe for image-language learning, which produces effective models, outperforming bigger and more expensive ones, often trained on orders of magnitude larger datasets.
Our key finding is the joint learning of a compact vision and language representation, which adaptively and iteratively fuses the multi-modal features.
With only 40M training examples and 39 GFLOPs, our lightweight model outperforms state-of-the-art models that are many times larger, use 2-20x more FLOPs, and are trained on bigger datasets, some with close to 1B training examples.
arXiv Detail & Related papers (2023-05-31T15:02:02Z) - Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data
Augmentation [42.05617728412819]
We show how to optimize few-shot text classification without accessing the gradients of the large-scale language models.
Our approach, dubbed BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners.
arXiv Detail & Related papers (2023-05-23T07:54:34Z) - AugGPT: Leveraging ChatGPT for Text Data Augmentation [59.76140039943385]
We propose a text data augmentation approach based on ChatGPT (named AugGPT)
AugGPT rephrases each sentence in the training samples into multiple conceptually similar but semantically different samples.
Experiment results on few-shot learning text classification tasks show the superior performance of the proposed AugGPT approach.
arXiv Detail & Related papers (2023-02-25T06:58:16Z) - Few-shot Text Classification with Dual Contrastive Consistency [31.141350717029358]
In this paper, we explore how to utilize a pre-trained language model to perform few-shot text classification.
We adopt supervised contrastive learning on few labeled data and consistency-regularization on vast unlabeled data.
arXiv Detail & Related papers (2022-09-29T19:26:23Z) - Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z) - Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x speed-up in inference while retaining comparable performance.
arXiv Detail & Related papers (2021-09-09T12:32:28Z) - GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation [9.501648136713694]
Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts.
This paper proposes a novel data augmentation technique that leverages large-scale language models to generate realistic text samples.
arXiv Detail & Related papers (2021-04-18T11:39:33Z) - SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)