Few Shot Dialogue State Tracking using Meta-learning
- URL: http://arxiv.org/abs/2101.06779v2
- Date: Sat, 23 Jan 2021 21:08:04 GMT
- Title: Few Shot Dialogue State Tracking using Meta-learning
- Authors: Saket Dingliwal, Bill Gao, Sanchit Agarwal, Chien-Wei Lin, Tagyoung
Chung, Dilek Hakkani-Tur
- Abstract summary: Dialogue State Tracking (DST) forms a core component of automated systems designed for specific goals like hotel or taxi reservation, tourist information, etc.
With the increasing need to deploy such systems in new domains, solving the problem of zero/few-shot DST has become necessary.
Our proposed meta-learner is agnostic of the underlying model and hence any existing state-of-the-art DST system can improve its performance on unknown domains using our training strategy.
- Score: 3.6292310166028403
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Dialogue State Tracking (DST) forms a core component of automated chatbot-based
systems designed for specific goals like hotel or taxi reservation, tourist
information, etc. With the increasing need to deploy such systems in new
domains, solving the problem of zero/few-shot DST has become necessary. There
has been a rising trend for learning to transfer knowledge from resource-rich
domains to unknown domains with minimal need for additional data. In this work,
we explore the merits of meta-learning algorithms for this transfer and hence,
propose a meta-learner D-REPTILE specific to the DST problem. With extensive
experimentation, we provide clear evidence of benefits over conventional
approaches across different domains, methods, base models, and datasets with
significant (5-25%) improvement over the baseline in a low-data setting. Our
proposed meta-learner is agnostic of the underlying model and hence any
existing state-of-the-art DST system can improve its performance on unknown
domains using our training strategy.
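The listing contains no code, but since D-REPTILE builds on the Reptile meta-learning algorithm, a minimal sketch of a Reptile-style outer loop over resource-rich training domains may help; `dst_model`, `domain_batches`, and the loss interface are hypothetical placeholders, not the authors' implementation:
```python
# Minimal sketch of a Reptile-style outer loop for DST meta-training.
# All names and interfaces here are illustrative assumptions.
import copy
import random
import torch

def reptile_meta_train(dst_model, domain_batches, meta_steps=1000,
                       inner_steps=5, inner_lr=1e-4, meta_lr=0.1):
    """Nudge the initialization toward weights adapted to each training domain."""
    for _ in range(meta_steps):
        domain = random.choice(list(domain_batches))   # sample a resource-rich domain
        adapted = copy.deepcopy(dst_model)             # clone the current initialization
        opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        for batch in domain_batches[domain][:inner_steps]:  # k inner SGD steps
            opt.zero_grad()
            loss = adapted(**batch)                    # assumed to return the DST loss
            loss.backward()
            opt.step()
        # Reptile update: theta <- theta + meta_lr * (theta_adapted - theta)
        with torch.no_grad():
            for p, q in zip(dst_model.parameters(), adapted.parameters()):
                p.add_(meta_lr * (q - p))
    return dst_model
```
The meta-trained initialization is then fine-tuned on the few labelled examples available in the unknown target domain.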
Related papers
- A Zero-Shot Open-Vocabulary Pipeline for Dialogue Understanding [0.0]
We propose a zero-shot, open-vocabulary system that integrates domain classification and Dialogue State Tracking (DST) in a single pipeline.
Our approach includes reformulating DST as a question-answering task for less capable models and employing self-refining prompts for more adaptable ones.
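As a hedged illustration of the DST-as-question-answering reformulation, here is a sketch that asks one question per (domain, slot) pair; the question template and the `qa_model` interface are invented for illustration, not the paper's actual prompts:
```python
# Hypothetical sketch: recast DST as QA by asking one question per slot.
def slot_to_question(domain: str, slot: str) -> str:
    return f"In the {domain} domain, what {slot} does the user want?"

def track_state(qa_model, dialogue_history: str, schema: dict) -> dict:
    """schema maps each domain name to a list of slot names."""
    state = {}
    for domain, slots in schema.items():
        for slot in slots:
            answer = qa_model(question=slot_to_question(domain, slot),
                              context=dialogue_history)  # assumed QA interface
            if answer:                                   # keep only answered slots
                state[(domain, slot)] = answer
    return state
```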
arXiv Detail & Related papers (2024-09-24T08:33:41Z)
- UNO-DST: Leveraging Unlabelled Data in Zero-Shot Dialogue State Tracking [54.51316566989655]
Previous zero-shot dialogue state tracking (DST) methods only apply transfer learning, ignoring unlabelled data in the target domain.
We transform zero-shot DST into few-shot DST by utilising such unlabelled data via joint and self-training methods.
We demonstrate this method's effectiveness on general language models in zero-shot scenarios, improving average joint goal accuracy by 8% across all domains in MultiWOZ.
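A minimal sketch of the generic self-training idea (pseudo-labelling unlabelled target-domain dialogues and retraining on confident predictions) follows; the `predict_with_confidence` API and the fixed threshold are assumptions, not UNO-DST's actual procedure:
```python
# Hypothetical self-training loop over unlabelled target-domain dialogues.
def self_train(model, unlabelled_dialogues, train_fn,
               confidence_threshold=0.9, rounds=3):
    for _ in range(rounds):
        pseudo_labelled = []
        for dialogue in unlabelled_dialogues:
            state, conf = model.predict_with_confidence(dialogue)  # assumed API
            if conf >= confidence_threshold:
                pseudo_labelled.append((dialogue, state))          # keep confident labels
        if not pseudo_labelled:
            break
        model = train_fn(model, pseudo_labelled)  # fine-tune on pseudo-labels
    return model
```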
arXiv Detail & Related papers (2023-10-16T15:16:16Z)
- Choice Fusion as Knowledge for Zero-Shot Dialogue State Tracking [5.691339955497443]
Zero-shot dialogue state tracking (DST) tracks a user's requirements in task-oriented dialogues without training on the desired domains.
We propose CoFunDST, which is trained on domain-agnostic QA datasets and directly uses candidate choices of slot-values as knowledge for zero-shot dialogue-state generation.
Our proposed model achieves higher joint goal accuracy than existing zero-shot DST approaches in most domains on MultiWOZ 2.1.
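As a loose illustration of feeding candidate slot-value choices to a QA-style reader, a hypothetical input-construction helper (the template and names are invented, not CoFunDST's actual format):
```python
# Hypothetical sketch: present candidate slot values as multiple choices.
def build_choice_input(dialogue_history: str, slot: str, candidates: list) -> str:
    choices = " ".join(f"({i}) {v}" for i, v in enumerate(candidates))
    return (f"dialogue: {dialogue_history} "
            f"question: what is the value of {slot}? "
            f"choices: {choices}")

# Example:
# build_choice_input("I need a cheap hotel.", "hotel-pricerange",
#                    ["cheap", "moderate", "expensive"])
```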
arXiv Detail & Related papers (2023-02-25T07:32:04Z)
- Schema Encoding for Transferable Dialogue State Tracking [2.5838973036257458]
Dialogue state tracking (DST) is an essential sub-task for task-oriented dialogue systems.
Recent work has focused on deep neural models for DST.
Applying these models to a new domain, however, requires a new dataset.
arXiv Detail & Related papers (2022-10-05T15:53:06Z)
- Prompt Learning for Few-Shot Dialogue State Tracking [75.50701890035154]
This paper focuses on how to learn a dialogue state tracking (DST) model efficiently with limited labeled data.
We design a prompt learning framework for few-shot DST, which consists of two main components: value-based prompt and inverse prompt mechanism.
Experiments show that our model can generate unseen slots and outperforms existing state-of-the-art few-shot methods.
arXiv Detail & Related papers (2022-01-15T07:37:33Z)
- Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect.
There always exists a severe domain shift between the pretraining and downstream ABSA datasets.
We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z)
- Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification [59.326456778057384]
We propose the Memory-based Multi-Source Meta-Learning framework to train a generalizable model for unseen domains.
We also present a meta batch normalization layer (MetaBN) to diversify meta-test features.
Experiments demonstrate that our M$^3$L can effectively enhance the generalization ability of the model for unseen domains.
arXiv Detail & Related papers (2020-12-01T11:38:16Z)
- Hybrid Generative-Retrieval Transformers for Dialogue Domain Adaptation [77.62366712130196]
We present the winning entry at the fast domain adaptation task of DSTC8, a hybrid generative-retrieval model based on GPT-2 fine-tuned to the multi-domain MetaLWOz dataset.
Our model uses retrieval logic as a fallback, being SoTA on MetaLWOz in human evaluation (>4% improvement over the 2nd place system) and attaining competitive generalization performance in adaptation to the unseen MultiWOZ dataset.
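A minimal sketch of the generate-then-fall-back-to-retrieval pattern described above; the confidence score and the `retriever.most_similar` interface are assumptions, not the winning system's code:
```python
# Hypothetical retrieval-as-fallback logic for response generation.
def respond(generator, retriever, context, min_confidence=0.5):
    response, confidence = generator(context)       # assumed to return a score too
    if confidence < min_confidence:                 # low-confidence generation
        response = retriever.most_similar(context)  # fall back to retrieval
    return response
```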
arXiv Detail & Related papers (2020-03-03T18:07:42Z)
- Data Techniques For Online End-to-end Speech Recognition [17.621967685914587]
Practitioners often need to build ASR systems for new use cases in a short amount of time, given limited in-domain data.
While recently developed end-to-end methods largely simplify the modeling pipelines, they still suffer from the data sparsity issue.
We explore a few simple-to-implement techniques for building online ASR systems in an end-to-end fashion, with a small amount of transcribed data in the target domain.
arXiv Detail & Related papers (2020-01-24T22:59:46Z)
- Domain Adaption for Knowledge Tracing [65.86619804954283]
We propose a novel adaptable framework, namely adaptable knowledge tracing (AKT), to address the domain adaptation for knowledge tracing (DAKT) problem.
For the first aspect, we incorporate educational characteristics (e.g., slip, guess, question texts) into deep knowledge tracing (DKT) to obtain a well-performing knowledge tracing model.
For the second aspect, we propose and adopt three domain adaptation processes. First, we pre-train an auto-encoder to select useful source instances for target model training.
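One plausible reading of that first process, sketched with assumed interfaces (not the paper's implementation): train an auto-encoder on target-domain data, then keep the source instances it reconstructs well, since low reconstruction error suggests similarity to the target domain:
```python
# Hypothetical auto-encoder-based source-instance selection.
import torch

def select_source_instances(autoencoder, source_features, keep_ratio=0.5):
    """Keep the source rows the target-trained auto-encoder reconstructs best."""
    with torch.no_grad():
        errors = ((autoencoder(source_features) - source_features) ** 2).mean(dim=1)
    k = int(keep_ratio * len(source_features))
    keep_idx = torch.argsort(errors)[:k]  # lowest reconstruction error first
    return source_features[keep_idx]
```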
arXiv Detail & Related papers (2020-01-14T15:04:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.