Data-Efficient Methods for Dialogue Systems
- URL: http://arxiv.org/abs/2012.02929v1
- Date: Sat, 5 Dec 2020 02:51:09 GMT
- Title: Data-Efficient Methods for Dialogue Systems
- Authors: Igor Shalyminov
- Abstract summary: Conversational User Interface (CUI) has become ubiquitous in everyday life, in consumer-focused products like Siri and Alexa.
Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts.
In this thesis, we introduce a series of methods for training robust dialogue systems from minimal data.
- Score: 4.061135251278187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational User Interface (CUI) has become ubiquitous in everyday life,
in consumer-focused products like Siri and Alexa or business-oriented
solutions. Deep learning underlies many recent breakthroughs in dialogue
systems but requires very large amounts of training data, often annotated by
experts. When trained on smaller datasets, these methods end up severely
lacking robustness (e.g. to disfluencies and out-of-domain input) and often
have too little generalisation power. In this thesis, we address the above issues by
introducing a series of methods for training robust dialogue systems from
minimal data. Firstly, we study two orthogonal approaches to dialogue,
linguistically informed and machine learning-based, from the data-efficiency
perspective. We outline the steps to obtain data-efficient solutions with
either approach. We then introduce two data-efficient models for dialogue
response generation: the Dialogue Knowledge Transfer Network based on latent
variable dialogue representations, and the hybrid Generative-Retrieval
Transformer model (ranked first at the DSTC 8 Fast Domain Adaptation task).
Next, we address the problem of robustness given minimal data: we propose a
multitask LSTM-based model for domain-general disfluency detection. For the
problem of out-of-domain input, we present Turn Dropout, a data augmentation
technique for anomaly detection using only in-domain data, and introduce
autoencoder-augmented models for efficient training with Turn Dropout. Finally,
we focus on social dialogue and introduce a neural model for response ranking
in social conversation used in Alana, the 3rd place winner in the Amazon Alexa
Prize 2017 and 2018. We employ a novel technique of predicting the dialogue
length as the main ranking objective and show that this approach improves upon
the ratings-based counterpart in terms of data efficiency while matching it in
performance.
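The Turn Dropout idea mentioned above can be illustrated with a minimal sketch: synthesize pseudo-anomalous training examples from in-domain dialogues alone by randomly swapping turns between dialogues, then train a detector on the resulting labels. All function and variable names here are illustrative; the thesis's actual formulation may differ.

```python
import random

def turn_dropout(dialogues, drop_prob=0.3, seed=0):
    """Create pseudo-out-of-domain training examples from in-domain
    dialogues by randomly replacing a turn with one sampled from another
    dialogue. Returns (dialogue, labels) pairs where label 1 marks a
    corrupted (anomalous) turn position."""
    rng = random.Random(seed)
    pool = [turn for d in dialogues for turn in d]   # all in-domain turns
    examples = []
    for dialogue in dialogues:
        corrupted, labels = [], []
        for turn in dialogue:
            if rng.random() < drop_prob:
                corrupted.append(rng.choice(pool))   # swap in a random turn
                labels.append(1)                     # anomalous position
            else:
                corrupted.append(turn)
                labels.append(0)                     # original in-domain turn
        examples.append((corrupted, labels))
    return examples

dialogues = [
    ["hi, I need a taxi", "where to?", "the airport"],
    ["book a table", "for how many?", "two people"],
]
augmented = turn_dropout(dialogues)
```

A detector trained on such labels never needs real out-of-domain data, which is the data-efficiency point the abstract makes.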
Related papers
- Towards a Zero-Data, Controllable, Adaptive Dialog System [27.75972750138208]
We explore approaches to generate data directly from dialog trees.
We show that agents trained on synthetic data can achieve comparable dialog success to models trained on human data.
arXiv Detail & Related papers (2024-03-26T10:45:11Z)
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization [127.714919036388]
DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
arXiv Detail & Related papers (2022-12-20T06:21:21Z)
- Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding [103.94325597273316]
We present a novel approach that iterates on augmentation quality by applying weakly-supervised filters.
We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue.
For DailyDialog specifically, using 10% of the ground truth data we outperform the current state-of-the-art model which uses 100% of the data.
arXiv Detail & Related papers (2022-10-25T17:01:30Z)
- A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation [107.82729587882397]
It is expensive to scale up current persona-based dialogue datasets.
Each data sample in this task is more complex to learn with than conventional dialogue data.
We propose a data manipulation method, which is model-agnostic and can be paired with any persona-based dialogue generation model.
arXiv Detail & Related papers (2022-04-21T03:49:54Z)
- Quick Starting Dialog Systems with Paraphrase Generation [0.0]
We propose a method to reduce the cost and effort of creating new conversational agents by artificially generating more data from existing examples.
Our proposed approach can kick-start a dialog system with little human effort and brings its performance to a level satisfactory for actual interactions with real end-users.
arXiv Detail & Related papers (2022-04-06T02:35:59Z)
- Smoothing Dialogue States for Open Conversational Machine Reading [70.83783364292438]
We propose an effective gating strategy that smooths the two dialogue states in a single decoder and bridges decision making and question generation.
Experiments on the OR-ShARC dataset show the effectiveness of our method, which achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-08-28T08:04:28Z)
- Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data [61.71319905364992]
We propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data.
A data-level distillation process is first proposed to construct augmented dialogues where both post and response are retrieved from the unpaired data.
A ranking module is employed to filter out low-quality dialogues.
A model-level distillation process is employed to distill a teacher model trained on high-quality paired data to augmented dialogue pairs.
arXiv Detail & Related papers (2020-09-20T13:06:38Z)
- Hybrid Generative-Retrieval Transformers for Dialogue Domain Adaptation [77.62366712130196]
We present the winning entry at the fast domain adaptation task of DSTC8: a hybrid generative-retrieval model based on GPT-2 fine-tuned on the multi-domain MetaLWOz dataset.
Our model uses retrieval logic as a fallback, achieving state-of-the-art results on MetaLWOz in human evaluation (>4% improvement over the 2nd-place system) and attaining competitive generalisation performance in adaptation to the unseen MultiWOZ dataset.
arXiv Detail & Related papers (2020-03-03T18:07:42Z)
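The "retrieval logic as a fallback" pattern in the hybrid generative-retrieval entry above can be sketched as follows. This is a minimal illustration of the control flow only, not the actual DSTC8 system: the generator, confidence score, retriever, and threshold are all caller-supplied stand-ins.

```python
def respond(context, generate, score, retrieve, threshold=0.5):
    """Hybrid generative-retrieval response selection: try the generative
    model first; if its confidence score falls below the threshold, fall
    back to retrieving the closest in-domain response.
    `generate`, `score`, and `retrieve` are caller-supplied callables."""
    candidate = generate(context)
    if score(context, candidate) >= threshold:
        return candidate, "generated"
    return retrieve(context), "retrieved"

# Toy components standing in for the fine-tuned generator and the retriever.
generate = lambda ctx: "Sure, booking that now."
score = lambda ctx, resp: 0.2 if "unknown" in ctx else 0.9
retrieve = lambda ctx: "Sorry, could you rephrase that?"

print(respond("book a taxi", generate, score, retrieve))
# → ('Sure, booking that now.', 'generated')
print(respond("unknown domain query", generate, score, retrieve))
# → ('Sorry, could you rephrase that?', 'retrieved')
```

The design point is that generation handles seen domains while retrieval guarantees a sensible in-domain response when the generator is unconfident on unseen input.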
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.