Data Augmentation for Intent Classification with Off-the-shelf Large Language Models
- URL: http://arxiv.org/abs/2204.01959v1
- Date: Tue, 5 Apr 2022 03:29:26 GMT
- Title: Data Augmentation for Intent Classification with Off-the-shelf Large Language Models
- Authors: Gaurav Sahu, Pau Rodriguez, Issam H. Laradji, Parmida Atighehchian, David Vazquez, Dzmitry Bahdanau
- Abstract summary: We propose a prompting-based approach to generate labelled training data for intent classification with off-the-shelf language models.
We evaluate the proposed method in a few-shot setting on four diverse intent classification tasks.
- Score: 13.895236210726202
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data augmentation is a widely employed technique to alleviate the problem of
data scarcity. In this work, we propose a prompting-based approach to generate
labelled training data for intent classification with off-the-shelf language
models (LMs) such as GPT-3. An advantage of this method is that no
task-specific LM-fine-tuning for data generation is required; hence the method
requires no hyper-parameter tuning and is applicable even when the available
training data is very scarce. We evaluate the proposed method in a few-shot
setting on four diverse intent classification tasks. We find that GPT-generated
data significantly boosts the performance of intent classifiers when the intents
under consideration are sufficiently distinct from each other. In tasks with
semantically close intents, we observe that the generated data is less helpful.
Our analysis shows that this is because GPT often generates utterances that
belong to a closely-related intent instead of the desired one. We present
preliminary evidence that a prompting-based GPT classifier could be helpful in
filtering the generated data to enhance its quality.
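To make the approach concrete, below is a minimal sketch of the few-shot prompting loop the abstract describes. The prompt template, the sampling loop, and the `lm_complete` helper (a stand-in for any off-the-shelf completion API such as GPT-3's) are illustrative assumptions, not the paper's exact protocol.
```python
# Minimal sketch (not the paper's exact protocol): few-shot prompting an
# off-the-shelf LM to generate labelled utterances for one intent.
# `lm_complete` is an assumed stand-in for any completion API call.
import random

def build_prompt(intent: str, seed_utterances: list[str], k: int = 10) -> str:
    """Few-shot prompt: show seed examples of one intent, elicit one more."""
    shots = random.sample(seed_utterances, min(k, len(seed_utterances)))
    examples = "\n".join(f"Example: {u}" for u in shots)
    return (
        f"The following are user requests with the intent '{intent}'.\n"
        f"{examples}\nExample:"
    )

def generate_utterances(lm_complete, intent: str,
                        seed_utterances: list[str], n: int = 20) -> list[str]:
    """Collect n novel utterances for the given intent via repeated prompting."""
    generated: set[str] = set()
    while len(generated) < n:
        completion = lm_complete(build_prompt(intent, seed_utterances))
        text = completion.strip()
        first_line = text.splitlines()[0].strip() if text else ""
        if first_line and first_line not in seed_utterances:
            generated.add(first_line)
    return sorted(generated)
```
The filtering idea from the abstract's last sentence could be sketched analogously: prompt the same LM to classify each generated utterance and discard those whose predicted intent differs from the target intent.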
Related papers
- Zero-Shot Stance Detection using Contextual Data Generation with LLMs [0.04096453902709291]
We propose Dynamic Model Adaptation with Contextual Data Generation (DyMoAdapt).
In this approach, we aim to fine-tune an existing model at test time.
We achieve this by generating new topic-specific data using GPT-3.
This method could enhance performance by allowing the adaptation of the model to new topics.
arXiv Detail & Related papers (2024-05-19T17:58:26Z)
- Text generation for dataset augmentation in security classification tasks [55.70844429868403]
This study evaluates the application of natural language text generators to fill data gaps in multiple security-related text classification tasks.
We find substantial benefits for GPT-3 data augmentation strategies in situations with severe limitations on known positive-class samples.
arXiv Detail & Related papers (2023-10-22T22:25:14Z)
- Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training [20.98770732015944]
Few-shot intent detection involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data.
We show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected.
To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance.
arXiv Detail & Related papers (2023-06-08T15:26:52Z)
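Sequential self-distillation, as mentioned in the entry above, retrains the model on its own softened predictions across generations; below is a hedged sketch of a standard distillation loss for one such step, where the temperature `T` and mixing weight `alpha` are illustrative assumptions rather than the paper's settings.
```python
# Hedged sketch of one self-distillation step: the student is trained on a
# mix of hard labels and the previous generation's softened predictions.
# Hyper-parameters (T, alpha) are illustrative assumptions.
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, labels,
                           T: float = 2.0, alpha: float = 0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                      # rescale gradients for the temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```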
- Going beyond research datasets: Novel intent discovery in the industry setting [60.90117614762879]
This paper proposes methods to improve the intent discovery pipeline deployed in a large e-commerce platform.
We show the benefit of pre-training language models on in-domain data: both self-supervised and with weak supervision.
We also devise a method, which we call Conv, to exploit the conversational structure (i.e., question and answer) of real-life datasets during fine-tuning for clustering tasks.
arXiv Detail & Related papers (2023-05-09T14:21:29Z)
- ZeroShotDataAug: Generating and Augmenting Training Data with ChatGPT [2.320417845168326]
We investigate the use of data obtained from prompting a large generative language model, ChatGPT, to generate synthetic training data with the aim of augmenting data in low resource scenarios.
We show that with appropriate task-specific ChatGPT prompts, we outperform the most popular existing approaches for such data augmentation.
arXiv Detail & Related papers (2023-04-27T17:07:29Z)
- AugGPT: Leveraging ChatGPT for Text Data Augmentation [59.76140039943385]
We propose a text data augmentation approach based on ChatGPT (named AugGPT).
AugGPT rephrases each sentence in the training samples into multiple conceptually similar but semantically different samples.
Experiment results on few-shot learning text classification tasks show the superior performance of the proposed AugGPT approach.
arXiv Detail & Related papers (2023-02-25T06:58:16Z)
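As a toy illustration of the rephrasing step AugGPT performs, the sketch below prompts a chat model for paraphrases; the prompt wording and the `chat_complete` helper are assumptions, not the paper's implementation.
```python
# Illustrative only: ask a chat LM to rephrase one training sentence into
# several variants. `chat_complete` is an assumed stand-in for a chat API.
def rephrase(chat_complete, sentence: str, n: int = 5) -> list[str]:
    prompt = (
        f"Rephrase the following sentence {n} times, preserving its meaning "
        f"but varying the wording:\n{sentence}"
    )
    reply = chat_complete(prompt)
    # assume one rephrasing per line; strip leading list markers like "1." or "-"
    return [line.lstrip("0123456789.-) ").strip()
            for line in reply.splitlines() if line.strip()][:n]
```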
- Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information [100.03188187735624]
We introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model.
Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents.
Our method is thus able to leverage the expressive power of large language models to produce diverse training data.
arXiv Detail & Related papers (2023-02-10T07:37:49Z)
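PVI measures how many bits of information an input contributes toward predicting its label; a hedged sketch of threshold-based filtering under that definition follows, where `log_p_with` and `log_p_null` are assumed log-probabilities of the gold intent from models fine-tuned with and without access to the utterance.
```python
# Hedged sketch of PVI-based filtering. log_p_with / log_p_null are assumed
# natural-log probabilities of the gold intent under models fine-tuned with
# and without access to the input utterance, respectively.
import math

def pvi(log_p_with: float, log_p_null: float) -> float:
    # PVI(x -> y) = -log2 p_null(y) + log2 p_x(y): bits of information
    # about the label y contributed by the utterance x.
    return (log_p_with - log_p_null) / math.log(2)

def filter_by_pvi(datapoints, threshold: float):
    """Keep synthesized datapoints whose PVI exceeds the threshold."""
    return [d for d in datapoints
            if pvi(d["log_p_with"], d["log_p_null"]) > threshold]
```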
- Training Dynamic based data filtering may not work for NLP datasets [0.0]
We study the applicability of the Area Under the Margin (AUM) metric to identify mislabelled examples in NLP datasets.
We find that the AUM metric can filter out mislabelled samples in NLP datasets, but doing so also removes a significant number of correctly labelled points.
arXiv Detail & Related papers (2021-09-19T18:50:45Z)
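The AUM statistic referenced above averages, over training epochs, the margin between the assigned label's logit and the largest competing logit; the sketch below follows that textbook definition, with `logits_per_epoch` an assumed per-example logit history recorded during training.
```python
# Hedged sketch of the Area Under the Margin (AUM) statistic: the average,
# across training epochs, of the assigned label's logit minus the largest
# competing logit. Low or negative AUM flags likely mislabelling.
import numpy as np

def aum(logits_per_epoch: np.ndarray, label: int) -> float:
    """logits_per_epoch: shape (num_epochs, num_classes) for one example."""
    margins = [
        logits[label] - np.delete(logits, label).max()
        for logits in logits_per_epoch
    ]
    return float(np.mean(margins))
```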
- CADDA: Class-wise Automatic Differentiable Data Augmentation for EEG Signals [92.60744099084157]
We propose differentiable data augmentation amenable to gradient-based learning.
We demonstrate the relevance of our approach on the clinically relevant sleep staging classification task.
arXiv Detail & Related papers (2021-06-25T15:28:48Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
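To illustrate what training on "linearized labeled sentences" can look like, the sketch below inlines tag tokens before the words they annotate, so an ordinary language model can be trained on, and later sample, labeled sequences; the exact tag placement is an assumption based on the paper's description.
```python
# Illustrative linearization in the spirit of DAGA: non-trivial tags are
# inlined as tokens before the words they label. Placement of tags is an
# assumption, not necessarily the paper's exact scheme.
def linearize(tokens: list[str], tags: list[str]) -> str:
    out = []
    for tok, tag in zip(tokens, tags):
        if tag != "O":          # only inline informative tags
            out.append(tag)
        out.append(tok)
    return " ".join(out)

print(linearize(["John", "lives", "in", "London"],
                ["B-PER", "O", "O", "B-LOC"]))
# -> "B-PER John lives in B-LOC London"
```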
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.