Selective In-Context Data Augmentation for Intent Detection using
Pointwise V-Information
- URL: http://arxiv.org/abs/2302.05096v1
- Date: Fri, 10 Feb 2023 07:37:49 GMT
- Title: Selective In-Context Data Augmentation for Intent Detection using
Pointwise V-Information
- Authors: Yen-Ting Lin, Alexandros Papangelis, Seokhwan Kim, Sungjin Lee,
Devamanyu Hazarika, Mahdi Namazifar, Di Jin, Yang Liu, Dilek Hakkani-Tur
- Abstract summary: We introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model.
Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents.
Our method is thus able to leverage the expressive power of large language models to produce diverse training data.
- Score: 100.03188187735624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work focuses on in-context data augmentation for intent detection.
Having found that augmentation via in-context prompting of large pre-trained
language models (PLMs) alone does not improve performance, we introduce a novel
approach based on PLMs and pointwise V-information (PVI), a metric that can
measure the usefulness of a datapoint for training a model. Our method first
fine-tunes a PLM on a small seed of training data and then synthesizes new
datapoints - utterances that correspond to given intents. It then employs
intent-aware filtering, based on PVI, to remove datapoints that are not helpful
to the downstream intent classifier. Our method is thus able to leverage the
expressive power of large language models to produce diverse training data.
Empirical results demonstrate that our method can produce synthetic training
data that achieve state-of-the-art performance on three challenging intent
detection datasets under few-shot settings (1.28% absolute improvement in
5-shot and 1.18% absolute in 10-shot, on average) and perform on par with the
state-of-the-art in full-shot settings (within 0.01% absolute, on average).
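Since the intent-aware filtering step is the core of the method, a minimal sketch of PVI-based filtering is included below. It assumes two intent classifiers have already been fine-tuned on the seed data, one on the real utterances and one on empty inputs (the null model in the PVI definition); the backbone, checkpoint paths, and the per-intent threshold choice are placeholders rather than details taken from the paper.
```python
import math
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: two intent classifiers fine-tuned on the small seed set ahead of
# time, one on the real (utterance, intent) pairs and one on (empty input,
# intent) pairs, i.e. the null model from the PVI definition. The backbone and
# checkpoint paths are placeholders.
tok = AutoTokenizer.from_pretrained("roberta-base")
clf_x = AutoModelForSequenceClassification.from_pretrained("path/to/seed-finetuned")
clf_null = AutoModelForSequenceClassification.from_pretrained("path/to/null-finetuned")

@torch.no_grad()
def log2_prob(model, text: str, intent_id: int) -> float:
    enc = tok(text, return_tensors="pt", truncation=True)
    logp = F.log_softmax(model(**enc).logits, dim=-1)[0, intent_id].item()
    return logp / math.log(2)

def pvi(utterance: str, intent_id: int) -> float:
    # PVI(x -> y) = -log2 g_null(y | empty input) + log2 g_x(y | x)
    return log2_prob(clf_x, utterance, intent_id) - log2_prob(clf_null, "", intent_id)

def filter_synthetic(candidates, thresholds):
    # Keep synthetic (utterance, intent) pairs whose PVI clears a per-intent
    # threshold; using the mean seed-set PVI per intent is an assumption here.
    return [(u, y) for u, y in candidates if pvi(u, y) > thresholds[y]]
```
The retained utterances are then used as additional training data for the downstream intent classifier.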
Related papers
- How to Train Data-Efficient LLMs [56.41105687693619]
We study data-efficient approaches for pre-training large language models (LLMs).
In our comparison of 19 samplers, involving hundreds of evaluation tasks and pre-training runs, we find that Ask-LLM and Density sampling are the best methods in their respective categories.
arXiv Detail & Related papers (2024-02-15T02:27:57Z)
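A minimal sketch of the Ask-LLM idea mentioned above: each candidate training example is scored by prompting an instruction-tuned language model and reading off the probability it assigns to answering "yes". The model path and the prompt wording here are assumptions, not the paper's exact setup.
```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForCausalLM

# Placeholder model path; the prompt wording is an assumption.
tok = AutoTokenizer.from_pretrained("path/to/instruction-tuned-lm")
lm = AutoModelForCausalLM.from_pretrained("path/to/instruction-tuned-lm")

PROMPT = (
    "###\n{example}\n###\n"
    "Does the previous paragraph contain informative content that could help "
    "train a language model? Answer yes or no.\nAnswer:"
)

@torch.no_grad()
def ask_llm_score(example: str) -> float:
    """Probability mass the LM puts on ' yes' versus ' no' as the next token."""
    ids = tok(PROMPT.format(example=example), return_tensors="pt").input_ids
    next_logits = lm(ids).logits[0, -1]
    yes_id = tok(" yes", add_special_tokens=False).input_ids[0]
    no_id = tok(" no", add_special_tokens=False).input_ids[0]
    p = F.softmax(next_logits[[yes_id, no_id]], dim=-1)
    return p[0].item()

def ask_llm_sample(corpus, keep_ratio=0.2):
    # Keep the highest-scoring fraction of the corpus for pre-training.
    scored = sorted(corpus, key=ask_llm_score, reverse=True)
    return scored[: int(len(scored) * keep_ratio)]
```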
- DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection [72.25697820290502]
This work introduces a straightforward and efficient strategy to identify potential novel classes through zero-shot classification.
We refer to this approach as the self-training strategy, which enhances recall and accuracy for novel classes without requiring extra annotations, datasets, and re-training.
Empirical evaluations on three datasets, including LVIS, V3Det, and COCO, demonstrate significant improvements over the baseline performance.
arXiv Detail & Related papers (2023-10-02T17:52:24Z)
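A rough sketch of the zero-shot pseudo-labelling step described above, assuming region-proposal embeddings and class-name text embeddings from a CLIP-style encoder are already computed and L2-normalised; the similarity threshold is an assumption.
```python
import numpy as np

def pseudo_label_proposals(proposal_emb, text_emb, novel_names, thresh=0.3):
    """proposal_emb: (N, d) region features; text_emb: (C, d) for C novel class names."""
    sims = proposal_emb @ text_emb.T                 # cosine similarity (N, C)
    best = sims.argmax(axis=1)
    conf = sims.max(axis=1)
    keep = conf >= thresh                            # only confident matches survive
    return [(i, novel_names[best[i]], float(conf[i]))
            for i in np.flatnonzero(keep)]

# The resulting (proposal, class, score) triples can be added as extra training
# targets for the detector during self-training.
```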
- Approximating Human-Like Few-shot Learning with GPT-based Compression [55.699707962017975]
We seek to equip generative pre-trained models with human-like learning capabilities that enable data compression during inference.
We present a novel approach that utilizes the Generative Pre-trained Transformer (GPT) to approximate Kolmogorov complexity.
arXiv Detail & Related papers (2023-08-14T05:22:33Z)
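A hedged sketch of the compression view of few-shot classification: the negative log-likelihood a causal LM assigns to a query, conditioned on a class's examples, is treated as an approximate code length, and the query is assigned to the class that compresses it best. The model is a placeholder and the token bookkeeping is deliberately simple.
```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")            # placeholder compressor LM
lm = AutoModelForCausalLM.from_pretrained("gpt2")

@torch.no_grad()
def code_length(context: str, query: str) -> float:
    """Approximate bits needed to encode `query` given `context`, via the LM's NLL."""
    ctx_len = tok(context, return_tensors="pt").input_ids.shape[1]
    ids = tok(context + query, return_tensors="pt").input_ids
    logits = lm(ids).logits[0, :-1]                    # position t predicts token t+1
    targets = ids[0, 1:]
    logp = torch.log_softmax(logits, dim=-1)[torch.arange(len(targets)), targets]
    return float(-logp[ctx_len - 1:].sum() / math.log(2))   # count only query tokens

def classify(query: str, support: dict) -> str:
    """support: {label: [example strings]}; pick the label that compresses the query best."""
    return min(support, key=lambda lab: code_length("\n".join(support[lab]) + "\n", query))
```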
- Revisiting Sample Size Determination in Natural Language Understanding [18.637079595450366]
Knowing exactly how many data points need to be labeled to achieve a certain model performance is a useful step towards reducing the overall annotation budget.
We derive a simple yet effective approach to predict the maximum achievable model performance from a small number of training samples.
arXiv Detail & Related papers (2023-07-01T16:08:52Z)
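A small sketch of learning-curve extrapolation in the spirit of the entry above: fit an inverse power law to accuracies measured at a few training-set sizes and read off the estimated performance ceiling. The functional form and the numbers are illustrative assumptions, not taken from the paper.
```python
import numpy as np
from scipy.optimize import curve_fit

def inv_power_law(n, a, b, c):
    return a - b * np.power(n, -c)     # a is the asymptotic (maximum) performance

# Accuracies measured at a few small training-set sizes (illustrative numbers).
sizes = np.array([50, 100, 200, 400, 800])
accs = np.array([0.61, 0.68, 0.74, 0.78, 0.81])

params, _ = curve_fit(inv_power_law, sizes, accs,
                      p0=[0.9, 1.0, 0.5], bounds=([0, 0, 0], [1, 10, 2]))
a, b, c = params
print(f"estimated performance ceiling: {a:.3f}")
print(f"predicted accuracy with 5000 labels: {inv_power_law(5000, a, b, c):.3f}")
```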
- Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training [20.98770732015944]
Few-shot intent detection involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data.
We show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected.
To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance.
arXiv Detail & Related papers (2023-06-08T15:26:52Z)
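A minimal sketch of sequential self-distillation as mentioned above: each generation is trained on the same labelled data while also matching the softened predictions of the previous generation. The model factory, dataloader, temperature, and mixing weight are assumptions.
```python
import copy
import torch
import torch.nn.functional as F

def self_distill(make_model, loader, generations=3, epochs=3, T=2.0, alpha=0.5, lr=2e-5):
    """make_model() returns a fresh classifier; loader yields (inputs, labels)."""
    teacher = None
    for _ in range(generations):
        student = make_model()
        opt = torch.optim.AdamW(student.parameters(), lr=lr)
        for _ in range(epochs):
            for x, y in loader:
                logits = student(x)
                loss = F.cross_entropy(logits, y)
                if teacher is not None:
                    with torch.no_grad():
                        soft = F.softmax(teacher(x) / T, dim=-1)
                    kl = F.kl_div(F.log_softmax(logits / T, dim=-1), soft,
                                  reduction="batchmean") * T * T
                    loss = alpha * loss + (1 - alpha) * kl
                opt.zero_grad()
                loss.backward()
                opt.step()
        teacher = copy.deepcopy(student).eval()   # next generation distils from this one
    return teacher
```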
- Data Augmentation for Intent Classification with Off-the-shelf Large Language Models [13.895236210726202]
We propose a prompting-based approach to generate labelled training data for intent classification with off-the-shelf language models.
We evaluate the proposed method in a few-shot setting on four diverse intent classification tasks.
arXiv Detail & Related papers (2022-04-05T03:29:26Z)
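A small sketch of prompting an off-the-shelf language model to produce new labelled utterances for a given intent; the prompt template, model choice, and sampling settings are assumptions rather than the paper's configuration.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")            # placeholder generator LM
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def generate_utterances(intent: str, seed_examples: list, n: int = 10) -> list:
    # Assumed prompt template: list the seed utterances, then let the LM continue.
    prompt = f"The following sentences all express the intent '{intent}':\n"
    prompt += "".join(f"- {u}\n" for u in seed_examples) + "- "
    ids = tok(prompt, return_tensors="pt").input_ids
    out = lm.generate(ids, do_sample=True, top_p=0.9, max_new_tokens=30,
                      num_return_sequences=n, pad_token_id=tok.eos_token_id)
    # Keep only the generated continuation and cut it at the first newline.
    return [tok.decode(seq[ids.shape[1]:], skip_special_tokens=True).split("\n")[0].strip()
            for seq in out]

print(generate_utterances("transfer_money",
                          ["send $20 to my savings account",
                           "move money between my accounts"]))
```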
- Dense Contrastive Learning for Self-Supervised Visual Pre-Training [102.15325936477362]
We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images.
Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only 1% slower).
arXiv Detail & Related papers (2020-11-18T08:42:32Z)
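A compact sketch of a pixel-level contrastive loss in the spirit of the entry above, assuming dense feature maps from two views and a precomputed pixel correspondence; this illustrates the general idea rather than the exact DenseCL formulation.
```python
import torch
import torch.nn.functional as F

def dense_contrastive_loss(f1, f2, match, tau=0.2):
    """f1, f2: (B, C, H, W) dense features from two views of the same images.
    match: (B, H*W) long tensor giving, for each location in view 1, the index
    of its matching location in view 2 (the positive); all others are negatives."""
    B, C, H, W = f1.shape
    q = F.normalize(f1.flatten(2).transpose(1, 2), dim=-1)   # (B, HW, C) queries
    k = F.normalize(f2.flatten(2).transpose(1, 2), dim=-1)   # (B, HW, C) keys
    logits = torch.bmm(q, k.transpose(1, 2)) / tau            # (B, HW, HW) similarities
    return F.cross_entropy(logits.reshape(B * H * W, H * W), match.reshape(-1))
```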
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
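For the dataset distillation step mentioned above, a simple feature-matching sketch is given below: a few synthetic images per class are optimised so that their embeddings, under a randomly initialised encoder, match the mean embedding of the real images of that class. This is one basic flavour of dataset distillation, not necessarily the strategy used in the paper.
```python
import torch
import torch.nn as nn

def distill(real_by_class, img_per_class=5, steps=500, lr=0.1, shape=(3, 32, 32)):
    """real_by_class: {label: tensor of real images with shape (N, 3, 32, 32)}."""
    # Randomly initialised encoder used only to define the matching objective.
    enc = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
                        nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
    syn = {c: torch.randn(img_per_class, *shape, requires_grad=True)
           for c in real_by_class}
    opt = torch.optim.SGD(list(syn.values()), lr=lr)
    for _ in range(steps):
        loss = 0.0
        for c, real in real_by_class.items():
            with torch.no_grad():
                target = enc(real).mean(0)               # mean real embedding per class
            loss = loss + ((enc(syn[c]).mean(0) - target) ** 2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return {c: s.detach() for c, s in syn.items()}       # informative class-wise images
```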