Generative Conversational Networks
- URL: http://arxiv.org/abs/2106.08484v1
- Date: Tue, 15 Jun 2021 23:19:37 GMT
- Title: Generative Conversational Networks
- Authors: Alexandros Papangelis and Karthik Gopalakrishnan and Aishwarya
Padmakumar and Seokhwan Kim and Gokhan Tur and Dilek Hakkani-Tur
- Abstract summary: We propose a framework called Generative Conversational Networks, in which conversational agents learn to generate their own labelled training data.
We show an average improvement of 35% in intent detection and 21% in slot tagging over a baseline model trained from the seed data.
- Score: 67.13144697969501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by recent work in meta-learning and generative teaching networks, we
propose a framework called Generative Conversational Networks, in which
conversational agents learn to generate their own labelled training data (given
some seed data) and then train themselves from that data to perform a given
task. We use reinforcement learning to optimize the data generation process
where the reward signal is the agent's performance on the task. The task can be
any language-related task, from intent detection to full task-oriented
conversations. In this work, we show that our approach is able to generalise
from seed data and performs well in limited data and limited computation
settings, with significant gains for intent detection and slot tagging across
multiple datasets: ATIS, TOD, SNIPS, and Restaurants8k. We show an average
improvement of 35% in intent detection and 21% in slot tagging over a baseline
model trained from the seed data. We also conduct an analysis of the novelty of
the generated data and provide generated examples for intent detection, slot
tagging, and non-goal oriented conversations.
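The abstract describes a concrete optimization loop: a generator proposes labelled examples, a learner is trained on them, and the learner's task performance becomes the reward for updating the generator. The sketch below illustrates one way such a loop can be wired up; the `generator` and `make_learner` interfaces are hypothetical, and a REINFORCE-style update stands in for whichever policy-gradient method the paper actually uses.

```python
def gcn_training_loop(generator, make_learner, seed_data, val_data,
                      meta_steps=100, batch_size=64, lr=1e-3):
    # Sketch of the loop the abstract describes, not the authors'
    # implementation; all component interfaces here are hypothetical.
    baseline = 0.0  # running reward baseline to reduce gradient variance
    for _ in range(meta_steps):
        # 1. The generator proposes labelled examples (conditioned on the
        #    seed data) together with the log-probs of generating them.
        samples, log_probs = generator.sample(seed_data, n=batch_size)
        # 2. A fresh learner is trained on seed + generated data.
        learner = make_learner()
        learner.fit(seed_data + samples)
        # 3. Reward = the learner's performance on the target task.
        reward = learner.evaluate(val_data)
        # 4. REINFORCE-style update of the generator (assumed; the paper
        #    only states that reinforcement learning is used).
        generator.policy_gradient_step(log_probs, reward - baseline, lr=lr)
        baseline = 0.9 * baseline + 0.1 * reward
    return generator
```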
Related papers
- Knowledge Combination to Learn Rotated Detection Without Rotated
Annotation [53.439096583978504]
Rotated bounding boxes drastically reduce output ambiguity of elongated objects.
Despite their effectiveness, rotated detectors are not widely employed, largely because rotated annotations are laborious to obtain.
We propose a framework that allows the model to predict precise rotated boxes while learning from horizontal annotations only.
arXiv Detail & Related papers (2023-04-05T03:07:36Z)
- Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information [100.03188187735624]
We introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that measures how useful a datapoint is for training a model.
Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints: utterances that correspond to given intents. PVI is then used to filter out synthetic datapoints that are unhelpful to the downstream intent classifier.
Our method is thus able to leverage the expressive power of large language models to produce diverse training data.
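Pointwise V-information scores how much usable information an input carries about its label: the difference between a model's log-probability of the label given the utterance and given an empty input. A minimal sketch of PVI-based filtering, where `g_with_input` and `g_null` are hypothetical classifiers fine-tuned with and without access to the input:

```python
import math

def pvi(x, y, g_with_input, g_null):
    # Pointwise V-information of input x about label y: extra bits of
    # information about y the model extracts once it can see x.
    # `prob` is a hypothetical method returning P(y | input).
    return math.log2(g_with_input.prob(y, x)) - math.log2(g_null.prob(y, ""))

def filter_synthetic(datapoints, g_with_input, g_null, threshold=0.0):
    # Keep only synthetic (utterance, intent) pairs whose input carries
    # usable information about the label; low-PVI points are discarded.
    return [(x, y) for x, y in datapoints
            if pvi(x, y, g_with_input, g_null) > threshold]
```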
arXiv Detail & Related papers (2023-02-10T07:37:49Z)
- FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue [70.65782786401257]
This work explores conversational task transfer by introducing FETA: a benchmark for few-sample task transfer in open-domain dialogue.
FETA contains two underlying sets of conversations upon which there are 10 and 7 tasks annotated, enabling the study of intra-dataset task transfer.
We utilize three popular language models and three learning algorithms to analyze the transferability between 132 source-target task pairs.
arXiv Detail & Related papers (2022-05-12T17:59:00Z)
- Data Augmentation with Paraphrase Generation and Entity Extraction for Multimodal Dialogue System [9.912419882236918]
We are working towards a multimodal dialogue system for younger kids learning basic math concepts.
This work explores the potential benefits of data augmentation with paraphrase generation for the Natural Language Understanding module of the Spoken Dialogue Systems pipeline.
We show that paraphrasing with model-in-the-loop (MITL) strategies using small seed data is a promising approach, yielding improved performance on the Intent Recognition task.
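A model-in-the-loop augmentation round typically alternates between generating paraphrases and consulting the current model to decide which ones to keep. The sketch below is one plausible instantiation, not the paper's exact pipeline; `paraphraser` and `nlu` are hypothetical components, and keeping misclassified paraphrases is an assumed selection criterion.

```python
def mitl_round(seed_data, paraphraser, nlu, n_paraphrases=5):
    # One hedged model-in-the-loop round: paraphrase each seed utterance,
    # keep the candidates the current NLU model gets wrong (these are the
    # most informative additions), then retrain on the union.
    augmented = list(seed_data)
    for utterance, intent in seed_data:
        for candidate in paraphraser.generate(utterance, n=n_paraphrases):
            if nlu.predict_intent(candidate) != intent:
                augmented.append((candidate, intent))
    nlu.fit(augmented)  # retrain on seed + selected paraphrases
    return nlu, augmented
```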
arXiv Detail & Related papers (2022-05-09T02:21:20Z)
- A Semi-Supervised Deep Clustering Pipeline for Mining Intentions From Texts [6.599344783327053]
Verint Intent Manager (VIM) is an analysis platform that combines unsupervised and semi-supervised approaches to help analysts quickly surface and organize relevant user intentions from conversational texts.
For the initial exploration of data, we use a novel unsupervised and semi-supervised pipeline that integrates the fine-tuning of high-performing language models.
Fine-tuned BERT produces better task-aware representations using a labeled subset as small as 0.5% of the task data.
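The core of such a pipeline can be pictured as: embed utterances with a fine-tuned language model, then cluster the embeddings so analysts can inspect candidate intents. A rough sketch with a hypothetical `encoder`; scikit-learn's KMeans is an assumed stand-in for whatever clustering method the platform actually uses.

```python
import numpy as np
from sklearn.cluster import KMeans

def surface_intents(utterances, encoder, n_clusters=20):
    # `encoder` is a hypothetical sentence encoder (e.g. a BERT model
    # fine-tuned on a small labeled subset) mapping text -> vector.
    embeddings = np.stack([encoder.encode(u) for u in utterances])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    # Group utterances by cluster so an analyst can inspect each
    # candidate intent and assign it a name.
    clusters = {}
    for utt, lab in zip(utterances, labels):
        clusters.setdefault(int(lab), []).append(utt)
    return clusters
```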
arXiv Detail & Related papers (2022-02-01T23:01:05Z)
- On the Use of External Data for Spoken Named Entity Recognition [40.93448412171246]
Recent advances in self-supervised speech representations have made it feasible to consider learning models with limited labeled data.
We draw on a variety of approaches, including self-training, knowledge distillation, and transfer learning, and consider their applicability to both end-to-end models and pipeline approaches.
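Of the approaches named above, knowledge distillation has a particularly compact generic form: soften teacher and student output distributions with a temperature and match them with a KL term alongside the usual cross-entropy. The PyTorch sketch below shows the standard Hinton-style loss, not this paper's specific configuration.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    # Generic distillation objective: temperature-softened KL between
    # student and teacher distributions, mixed with cross-entropy
    # against the gold labels.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce
```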
arXiv Detail & Related papers (2021-12-14T18:49:26Z)
- ProtoDA: Efficient Transfer Learning for Few-Shot Intent Classification [21.933876113300897]
We adopt an alternative approach by transfer learning on an ensemble of related tasks using prototypical networks under the meta-learning paradigm.
Using intent classification as a case study, we demonstrate that increasing variability in training tasks can significantly improve classification performance.
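For reference, a prototypical-network episode reduces to a few lines: average each class's support embeddings into a prototype, then score queries by negative squared distance. A minimal PyTorch sketch (the embedding network itself is omitted):

```python
import torch

def proto_episode_logits(support_emb, support_labels, query_emb, n_classes):
    # Prototypical-network scoring (Snell et al.): each class prototype
    # is the mean embedding of its support examples, and queries are
    # scored by negative squared Euclidean distance to each prototype.
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0)
        for c in range(n_classes)
    ])                                                # (n_classes, dim)
    dists = torch.cdist(query_emb, prototypes) ** 2   # (n_query, n_classes)
    return -dists  # feed to cross-entropy as classification logits
```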
arXiv Detail & Related papers (2021-01-28T00:19:13Z)
- Intent Detection with WikiHow [28.28719498563396]
Our models are able to predict a broad range of intended goals from many actions because they are trained on wikiHow.
Our models achieve state-of-the-art results on the Snips dataset, the Schema-Guided Dialogue dataset, and all 3 languages of the Facebook multilingual dialog datasets.
arXiv Detail & Related papers (2020-09-12T12:53:47Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
Because training on the enlarged dataset is computationally costly, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
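Dataset distillation in its simplest one-step form treats the synthetic class-wise images as learnable parameters: a model trained on them for one gradient step should perform well on real data, and that outer loss is backpropagated into the images. A simplified sketch with a linear classifier standing in for the FER network (not the paper's exact procedure); `syn_x` and `theta0` are assumed to be created with `requires_grad=True`.

```python
import torch
import torch.nn.functional as F

def distill_step(syn_x, syn_y, real_x, real_y, theta0,
                 inner_lr=0.1, outer_lr=0.01):
    # syn_x: learnable flattened synthetic images, requires_grad=True.
    # theta0: (dim, n_classes) linear classifier, requires_grad=True.
    inner_loss = F.cross_entropy(syn_x @ theta0, syn_y)
    (grad,) = torch.autograd.grad(inner_loss, theta0, create_graph=True)
    theta1 = theta0 - inner_lr * grad          # one inner training step
    # Outer objective: a model trained on the synthetic data should do
    # well on real data; its gradient flows back into the images.
    outer_loss = F.cross_entropy(real_x @ theta1, real_y)
    (img_grad,) = torch.autograd.grad(outer_loss, syn_x)
    with torch.no_grad():
        syn_x -= outer_lr * img_grad           # update distilled images
    return outer_loss.item()
```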
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Improving Multi-Turn Response Selection Models with Complementary Last-Utterance Selection by Instance Weighting [84.9716460244444]
We consider utilizing the underlying correlation in the data resource itself to derive different kinds of supervision signals, and use them to train the response selection model with an instance-weighting scheme.
We conduct extensive experiments on two public datasets and obtain significant improvements on both.
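Instance weighting of this kind usually amounts to scaling each example's loss by a confidence derived from the auxiliary signal. A generic PyTorch sketch; sourcing the weights from a complementary last-utterance-selection model is an assumption here, not a detail taken from the paper.

```python
import torch.nn.functional as F

def weighted_selection_loss(logits, targets, weights):
    # Each response-selection example contributes to the loss in
    # proportion to a weight derived from an auxiliary signal
    # (e.g. a complementary model's confidence on that instance).
    per_example = F.cross_entropy(logits, targets, reduction="none")
    return (weights * per_example).sum() / weights.sum()
```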
arXiv Detail & Related papers (2020-02-18T06:29:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.