Generate then Refine: Data Augmentation for Zero-shot Intent Detection
- URL: http://arxiv.org/abs/2410.01953v2
- Date: Tue, 15 Oct 2024 07:02:05 GMT
- Title: Generate then Refine: Data Augmentation for Zero-shot Intent Detection
- Authors: I-Fan Lin, Faegheh Hasibi, Suzan Verberne
- Abstract summary: We propose a data augmentation method for intent detection in zero-resource domains.
First, we generate utterances for intent labels using an open-source large language model in a zero-shot setting.
Second, we develop a smaller sequence-to-sequence model to improve the generated utterances.
- Score: 5.257115841810258
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this short paper we propose a data augmentation method for intent detection in zero-resource domains. Existing data augmentation methods rely on few labelled examples for each intent category, which can be expensive in settings with many possible intents. We use a two-stage approach: First, we generate utterances for intent labels using an open-source large language model in a zero-shot setting. Second, we develop a smaller sequence-to-sequence model (the Refiner), to improve the generated utterances. The Refiner is fine-tuned on seen domains and then applied to unseen domains. We evaluate our method by training an intent classifier on the generated data, and evaluating it on real (human) data. We find that the Refiner significantly improves the data utility and diversity over the zero-shot LLM baseline for unseen domains and over common baseline approaches. Our results indicate that a two-step approach of a generative LLM in zero-shot setting and a smaller sequence-to-sequence model can provide high-quality data for intent detection.
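The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration only: the `generate_utterances` and `refine` functions here are simple placeholders standing in for the paper's zero-shot LLM generator and fine-tuned sequence-to-sequence Refiner, and the intent labels are hypothetical.

```python
# Sketch of the two-stage "generate then refine" data augmentation pipeline.
# Placeholders stand in for the open-source LLM (Stage 1, zero-shot) and the
# fine-tuned sequence-to-sequence Refiner (Stage 2) used in the paper.

def generate_utterances(intent_label, n=3):
    """Stage 1 (placeholder): zero-shot generation of utterances for a label."""
    return [f"utterance {i} for intent '{intent_label}'" for i in range(n)]

def refine(utterance):
    """Stage 2 (placeholder): the Refiner rewrites a generated utterance."""
    return utterance.strip().capitalize()

def augment(intent_labels):
    """Produce refined synthetic (utterance, label) pairs for each intent."""
    data = []
    for label in intent_labels:
        for utt in generate_utterances(label):
            data.append((refine(utt), label))
    return data

synthetic = augment(["book_flight", "check_balance"])
```

The resulting synthetic pairs would then be used to train an intent classifier, which the paper evaluates on real human utterances from unseen domains.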
Related papers
- How DDAIR you? Disambiguated Data Augmentation for Intent Recognition [0.3997220396722048]
Large Language Models (LLMs) are effective for data augmentation in classification tasks like intent detection.
LLMs inadvertently produce examples that are ambiguous with regard to untargeted classes.
We present DDAIR (Disambiguated Data Augmentation for Intent Recognition) to mitigate this problem.
arXiv Detail & Related papers (2026-01-16T12:26:55Z) - Learning What NOT to Count [17.581015609730017]
Few/zero-shot object counting methods often struggle to distinguish between fine-grained categories.
We propose an annotation-free approach that enables the seamless integration of new fine-grained categories into existing few/zero-shot counting models.
Our approach introduces an attention prediction network that identifies fine-grained category boundaries trained using only synthetic pseudo-annotated data.
arXiv Detail & Related papers (2025-04-16T02:05:47Z) - SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models [5.257115841810258]
Selection and Pooling with Large Language Models (SPILL) is an intuitive and domain-adaptive method for intent clustering without fine-tuning.
Our goal is to make existing embedders more generalizable to new domain datasets without further fine-tuning.
Our method achieves comparable results to other state-of-the-art studies, even those that use much larger models and require fine-tuning.
arXiv Detail & Related papers (2025-03-19T15:48:57Z) - ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs [60.81649785463651]
We introduce ExaRanker-Open, where we adapt and explore the use of open-source language models to generate explanations.
Our findings reveal that incorporating explanations consistently enhances neural rankers, with benefits escalating as the LLM size increases.
arXiv Detail & Related papers (2024-02-09T11:23:14Z) - Going beyond research datasets: Novel intent discovery in the industry setting [60.90117614762879]
This paper proposes methods to improve the intent discovery pipeline deployed in a large e-commerce platform.
We show the benefit of pre-training language models on in-domain data: both self-supervised and with weak supervision.
We also devise the best method to utilize the conversational structure (i.e., question and answer) of real-life datasets during fine-tuning for clustering tasks, which we call Conv.
arXiv Detail & Related papers (2023-05-09T14:21:29Z) - Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models [7.452422412106768]
We propose a novel method named Text2Seg for remote sensing semantic segmentation.
It overcomes the dependency on extensive annotations by employing an automatic prompt generation process.
We show that Text2Seg significantly improves zero-shot prediction performance compared to the vanilla SAM model.
arXiv Detail & Related papers (2023-04-20T18:39:41Z) - Knowledge Combination to Learn Rotated Detection Without Rotated Annotation [53.439096583978504]
Rotated bounding boxes drastically reduce output ambiguity of elongated objects.
Despite their effectiveness, rotated detectors are not widely employed.
We propose a framework that allows the model to predict precise rotated boxes.
arXiv Detail & Related papers (2023-04-05T03:07:36Z) - Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information [100.03188187735624]
We introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model.
Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents.
Our method is thus able to leverage the expressive power of large language models to produce diverse training data.
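The pointwise V-information (PVI) metric mentioned above measures how much a model's confidence in the gold label improves once it sees the input, relative to seeing no input at all. The snippet below is a sketch of that definition (following the common formulation from the V-information literature), not code from the paper; the probabilities passed in are hypothetical model outputs.

```python
import math

def pvi(p_with_input, p_null):
    """Pointwise V-information of a datapoint: the extra bits of evidence the
    input x provides about the gold label y, relative to a model given an
    empty input.
    p_with_input: model probability of the gold label given the utterance.
    p_null:       model probability of the gold label given an empty input."""
    return -math.log2(p_null) + math.log2(p_with_input)
```

Datapoints whose label the model only predicts confidently after seeing the input receive high PVI (e.g. `pvi(0.9, 0.5) > 0`), while datapoints the model predicts equally well without the input contribute nothing (`pvi(0.5, 0.5) == 0`), which is the basis for selecting useful synthetic examples.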
arXiv Detail & Related papers (2023-02-10T07:37:49Z) - Data Augmentation for Intent Classification with Off-the-shelf Large Language Models [13.895236210726202]
We propose a prompting-based approach to generate labelled training data for intent classification with off-the-shelf language models.
We evaluate the proposed method in a few-shot setting on four diverse intent classification tasks.
arXiv Detail & Related papers (2022-04-05T03:29:26Z) - Low-confidence Samples Matter for Domain Adaptation [47.552605279925736]
Domain adaptation (DA) aims to transfer knowledge from a label-rich source domain to a related but label-scarce target domain.
We propose a novel contrastive learning method by processing low-confidence samples.
We evaluate the proposed method in both unsupervised and semi-supervised DA settings.
arXiv Detail & Related papers (2022-02-06T15:45:45Z) - Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z) - OVANet: One-vs-All Network for Universal Domain Adaptation [78.86047802107025]
Existing methods manually set a threshold to reject unknown samples based on validation or a pre-defined ratio of unknown samples.
We propose a method to learn the threshold using source samples and to adapt it to the target domain.
Our idea is that a minimum inter-class distance in the source domain should be a good threshold to decide between known or unknown in the target.
arXiv Detail & Related papers (2021-04-07T18:36:31Z)
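The OVANet idea above — that the minimum inter-class distance among source classes can serve as a known-vs-unknown threshold in the target domain — can be illustrated with a small sketch. This is only an illustration of the distance computation under assumed class-mean embeddings; OVANet itself learns the boundary with one-vs-all classifiers rather than computing it directly.

```python
import math

def min_interclass_distance(class_means):
    """Smallest pairwise Euclidean distance between source-domain class mean
    embeddings; target samples farther than this from every class mean could
    be treated as unknown."""
    dists = []
    for i in range(len(class_means)):
        for j in range(i + 1, len(class_means)):
            dists.append(math.dist(class_means[i], class_means[j]))
    return min(dists)
```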
This list is automatically generated from the titles and abstracts of the papers in this site.