A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief
States towards Semi-Supervised Learning
- URL: http://arxiv.org/abs/2009.08115v3
- Date: Tue, 13 Oct 2020 14:18:09 GMT
- Title: A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief
States towards Semi-Supervised Learning
- Authors: Yichi Zhang, Zhijian Ou, Huixin Wang, Junlan Feng
- Abstract summary: Training belief trackers often requires expensive turn-level annotations of every user utterance.
We propose a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables.
We introduce LABES-S2S, which is a copy-augmented Seq2Seq model instantiation of LABES.
- Score: 22.757971831442426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structured belief states are crucial for user goal tracking and database
query in task-oriented dialog systems. However, training belief trackers often
requires expensive turn-level annotations of every user utterance. In this
paper we aim at alleviating the reliance on belief state labels in building
end-to-end dialog systems, by leveraging unlabeled dialog data towards
semi-supervised learning. We propose a probabilistic dialog model, called the
LAtent BElief State (LABES) model, where belief states are represented as
discrete latent variables and jointly modeled with system responses given user
inputs. Such latent variable modeling enables us to develop semi-supervised
learning under the principled variational learning framework. Furthermore, we
introduce LABES-S2S, which is a copy-augmented Seq2Seq model instantiation of
LABES. In supervised experiments, LABES-S2S obtains strong results on three
benchmark datasets of different scales. In utilizing unlabeled dialog data,
semi-supervised LABES-S2S significantly outperforms both supervised-only and
semi-supervised baselines. Remarkably, we can reduce the annotation demands to
50% without performance loss on MultiWOZ.
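To make the latent-variable formulation concrete, the following is a minimal PyTorch sketch (an illustration, not the authors' implementation) of the training signals described above: labeled turns supervise the belief predictor and the response model directly, while unlabeled turns optimize an evidence lower bound (ELBO) with the belief state sampled from an inference network q(b | u, r). LABES-S2S itself decodes belief states as token sequences with copy-augmented Seq2Seq decoders; this sketch collapses the belief state to a single categorical variable, and every class, dimension, and variable name here is a hypothetical placeholder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN, NUM_VALUES = 1000, 128, 50  # toy sizes (hypothetical)

class ToyLatentBeliefModel(nn.Module):
    """Belief state b_t is one categorical latent here; LABES-S2S instead
    decodes belief-state token sequences with copy-augmented Seq2Seq decoders."""
    def __init__(self):
        super().__init__()
        self.utter_enc = nn.EmbeddingBag(VOCAB, HIDDEN)         # encodes user input u_t
        self.resp_enc = nn.EmbeddingBag(VOCAB, HIDDEN)          # encodes response r_t
        self.prior = nn.Linear(HIDDEN, NUM_VALUES)              # p(b_t | u_t)
        self.posterior = nn.Linear(2 * HIDDEN, NUM_VALUES)      # q(b_t | u_t, r_t)
        self.resp_head = nn.Linear(HIDDEN + NUM_VALUES, VOCAB)  # bag-of-words p(r_t | u_t, b_t)

    def response_logp(self, h_u, b_onehot, resp):
        # mean log-likelihood of response tokens under a unigram response model
        logits = self.resp_head(torch.cat([h_u, b_onehot], dim=-1))  # (B, VOCAB)
        logp = F.log_softmax(logits, dim=-1)
        return logp.gather(-1, resp).mean()

    def supervised_loss(self, user, resp, belief_label):
        # labeled turn: cross-entropy on the belief label plus response NLL
        h_u = self.utter_enc(user)
        belief_nll = F.cross_entropy(self.prior(h_u), belief_label)
        b_onehot = F.one_hot(belief_label, NUM_VALUES).float()
        return belief_nll - self.response_logp(h_u, b_onehot, resp)

    def unsupervised_loss(self, user, resp):
        # unlabeled turn: negative ELBO = -E_q[log p(r|u,b)] + KL(q(b|u,r) || p(b|u))
        h_u, h_r = self.utter_enc(user), self.resp_enc(resp)
        prior_logits = self.prior(h_u)
        post_logits = self.posterior(torch.cat([h_u, h_r], dim=-1))
        b_sample = F.gumbel_softmax(post_logits, tau=1.0, hard=True)  # discrete sample, straight-through
        q = F.softmax(post_logits, dim=-1)
        kl = (q * (F.log_softmax(post_logits, -1) - F.log_softmax(prior_logits, -1))).sum(-1).mean()
        return -self.response_logp(h_u, b_sample, resp) + kl
```

A training loop would mix the two losses, calling supervised_loss on annotated turns and unsupervised_loss on unannotated ones, which mirrors the semi-supervised regime evaluated at reduced annotation budgets in the abstract.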
Related papers
- Unsupervised End-to-End Task-Oriented Dialogue with LLMs: The Power of the Noisy Channel [9.082443585886127]
Training task-oriented dialogue systems typically requires turn-level annotations for interacting with their APIs.
Unlabeled data and a schema definition are sufficient for building a working task-oriented dialogue system, completely unsupervised.
We propose an innovative approach using expectation-maximization (EM) that infers turn-level annotations as latent variables.
arXiv Detail & Related papers (2024-04-23T16:51:26Z)
- Automated Evaluation of Classroom Instructional Support with LLMs and BoWs: Connecting Global Predictions to Specific Feedback [9.51494089949975]
Large Language Models (LLMs) can be used to estimate "Instructional Support" domain scores of the Classroom Assessment Scoring System (CLASS).
We design a machine learning architecture that uses zero-shot prompting of Meta's Llama2 and/or a classic Bag of Words (BoW) model to classify individual utterances of teachers' speech.
These utterance-level judgments are aggregated over a 15-min observation session to estimate a global CLASS score.
arXiv Detail & Related papers (2023-10-02T12:11:17Z)
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- Variational Latent-State GPT for Semi-supervised Task-Oriented Dialog Systems [24.667353107453824]
The Variational Latent-State GPT model (VLS-GPT) is the first to combine the strengths of the two approaches: fine-tuning pre-trained GPT and latent-variable modeling for semi-supervised learning.
We develop the strategy of sampling-then-forward-computation, which successfully overcomes the memory explosion issue of using GPT in variational learning (a rough sketch of this idea appears after the list below).
VLS-GPT is shown to significantly outperform both supervised-only and semi-supervised baselines.
arXiv Detail & Related papers (2021-09-09T14:42:29Z)
- RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z)
- MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems [75.43457658815943]
We propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems.
MinTL is a simple yet effective transfer learning framework, which allows us to plug-and-play pre-trained seq2seq models.
We instantiate our learning framework with two pre-trained backbones: T5 and BART, and evaluate them on MultiWOZ.
arXiv Detail & Related papers (2020-09-25T02:19:13Z)
- A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided Dialogue Dataset [8.990035371365408]
We introduce FastSGT, a fast and robust BERT-based model for state tracking in goal-oriented dialogue systems.
The proposed model is designed for the Schema-Guided Dialogue dataset, which contains natural language descriptions.
Our model keeps the efficiency in terms of computational and memory consumption while improving the accuracy significantly.
arXiv Detail & Related papers (2020-08-27T18:51:18Z)
- Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System [49.39150449455407]
HDNO is an option framework for designing latent dialogue acts, avoiding the need to hand-craft specific dialogue act representations.
We test HDNO on MultiWOZ 2.0 and MultiWOZ 2.1, multi-domain dialogue datasets, comparing it with a word-level E2E model trained with RL, LaRL, and HDSA.
arXiv Detail & Related papers (2020-06-11T20:55:28Z)
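The VLS-GPT entry above refers to a sampling-then-forward-computation strategy for variational learning with GPT. Below is a rough, hypothetical sketch of that general idea (not the authors' code): sample the latent token sequence without keeping the autoregressive sampling graph, then score the fixed sequence with a single differentiable forward pass, so memory scales with one forward pass instead of with every sampling step. The function and argument names are assumptions; lm stands for any causal language model returning per-position vocabulary logits.

```python
import torch
import torch.nn.functional as F

def sample_then_forward(lm, prefix, max_len, eos_id):
    """lm(input_ids) -> logits of shape (B, T, V); any causal LM works."""
    tokens = prefix
    with torch.no_grad():                          # step 1: graph-free sampling
        for _ in range(max_len):
            logits = lm(tokens)[:, -1, :]
            nxt = torch.multinomial(F.softmax(logits, dim=-1), 1)
            tokens = torch.cat([tokens, nxt], dim=1)
            if (nxt == eos_id).all():
                break
    # step 2: one differentiable forward pass over the fixed sampled sequence
    logits = lm(tokens[:, :-1])
    logp = F.log_softmax(logits, dim=-1)
    tok_logp = logp.gather(-1, tokens[:, 1:].unsqueeze(-1)).squeeze(-1)
    seq_logp = tok_logp[:, prefix.size(1) - 1:].sum(dim=1)  # log-prob of the sampled latent
    return tokens, seq_logp  # seq_logp feeds the variational objective
```

Only the second pass participates in backpropagation, which is what keeps memory bounded; how the returned log-probability enters the variational objective (e.g. via a straight-through or score-function estimator) is a separate design choice.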
This list is automatically generated from the titles and abstracts of the papers in this site.