Related papers: Unsupervised End-to-End Task-Oriented Dialogue with LLMs: The Power of the Noisy Channel

URL: http://arxiv.org/abs/2404.15219v2
Date: Wed, 16 Oct 2024 16:01:59 GMT
Title: Unsupervised End-to-End Task-Oriented Dialogue with LLMs: The Power of the Noisy Channel
Authors: Brendan King, Jeffrey Flanigan
Abstract summary: Training task-oriented dialogue systems typically requires turn-level annotations for interacting with their APIs. Unlabeled data and a schema definition are sufficient for building a working task-oriented dialogue system, completely unsupervised. We propose an innovative approach using expectation-maximization (EM) that infers turn-level annotations as latent variables.
Score: 9.082443585886127
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Training task-oriented dialogue systems typically requires turn-level annotations for interacting with their APIs: e.g. a dialogue state and the system actions taken at each step. These annotations can be costly to produce, error-prone, and require both domain and annotation expertise. With advances in LLMs, we hypothesize that unlabeled data and a schema definition are sufficient for building a working task-oriented dialogue system, completely unsupervised. We consider a novel unsupervised setting of only (1) a well-defined API schema (2) a set of unlabeled dialogues between a user and agent. We propose an innovative approach using expectation-maximization (EM) that infers turn-level annotations as latent variables using a noisy channel model to build an end-to-end dialogue agent. Evaluating our approach on the MultiWOZ benchmark, our method more than doubles the dialogue success rate of a strong GPT-3.5 baseline.
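Illustration: the abstract refers to a noisy channel model for inferring latent turn-level annotations. The sketch below is not the authors' implementation. It is a generic, minimal example of noisy-channel scoring, in which a candidate annotation z for an observed utterance x is ranked by log p(z) + log p(x | z). The candidate strings, the probabilities, and the function names `noisy_channel_score` and `rerank` are all hypothetical and chosen only for illustration.

```python
import math

def noisy_channel_score(log_prior, log_likelihood):
    # Noisy-channel objective: log p(z) + log p(x | z),
    # where z is a candidate annotation and x is the observed utterance.
    return log_prior + log_likelihood

def rerank(candidates):
    # candidates: list of (annotation, log p(z), log p(x | z)) tuples.
    # Returns the annotation with the highest noisy-channel score.
    best = max(candidates, key=lambda c: noisy_channel_score(c[1], c[2]))
    return best[0]

# Hypothetical candidates with made-up probabilities (illustration only).
candidates = [
    ("hotel(area=north)", math.log(0.6), math.log(0.1)),
    ("hotel(area=south)", math.log(0.4), math.log(0.5)),
]

best = rerank(candidates)
print(best)
```

In this toy case the second candidate is selected even though its prior is lower, because it explains the observed utterance better. In an EM-style procedure, annotations chosen this way could serve as pseudo-labels for the next round of training, though how the paper does this is not specified in the abstract.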

Related papers

Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities [93.09944267871163]
Full-Duplex-Bench is a benchmark that systematically evaluates key conversational behaviors. We aim to advance spoken dialogue modeling and encourage the development of more interactive and natural dialogue systems.
arXiv Detail & Related papers (2025-03-06T18:59:16Z)
Evaluating and Enhancing Out-of-Domain Generalization of Task-Oriented Dialog Systems for Task Completion without Turn-level Dialog Annotations [2.453775887722866]
This work explores whether large language models (LLMs) can be fine-tuned solely on natural language dialogs to perform ToD tasks, without requiring such annotations. We find that models fine-tuned without turn-level annotations generate coherent and contextually appropriate responses. We propose ZeroToD, a framework that incorporates a schema augmentation mechanism to enhance API call accuracy and overall task completion rates.
arXiv Detail & Related papers (2025-02-18T22:10:51Z)
Planning with Large Language Models for Conversational Agents [51.12859325330882]
Controllability and proactivity are crucial properties of autonomous conversational agents (CAs). We propose a new framework for planning-based conversational agents powered by large language models (LLMs). Experimental results show that LLMs fine-tuned on PCA-D can significantly improve performance and generalize to unseen domains.
arXiv Detail & Related papers (2024-07-04T12:23:02Z)
DiactTOD: Learning Generalizable Latent Dialogue Acts for Controllable Task-Oriented Dialogue Systems [15.087619144902776]
We present a novel end-to-end latent dialogue act model (DiactTOD) that represents dialogue acts in a latent space. When pre-trained on a large corpus, DiactTOD is able to predict and control dialogue acts to generate controllable responses.
arXiv Detail & Related papers (2023-08-01T23:29:16Z)
Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response. We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English. Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
Zero-Shot Generalizable End-to-End Task-Oriented Dialog System using Context Summarization and Domain Schema [2.7178968279054936]
State-of-the-art approaches in task-oriented dialog systems formulate the problem as a conditional sequence generation task. This requires labeled training data for each new domain or task. We introduce a novel Zero-Shot generalizable end-to-end Task-oriented Dialog system, ZS-ToD.
arXiv Detail & Related papers (2023-03-28T18:56:31Z)
SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora. Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
Manual-Guided Dialogue for Flexible Conversational Agents [84.46598430403886]
How to build and use dialogue data efficiently, and how to deploy models in different domains at scale can be critical issues in building a task-oriented dialogue system. We propose a novel manual-guided dialogue scheme, where the agent learns the tasks from both dialogue and manuals. Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontology, and makes them more flexible to adapt to various domains.
arXiv Detail & Related papers (2022-08-16T08:21:12Z)
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows [63.116280145770006]
We propose segment act, an extension of dialog act from utterance level to segment level, and crowdsource a large-scale dataset for it. To utilize segment act flows, sequences of segment acts, for evaluation, we develop the first consensus-based dialogue evaluation framework, FlowEval.
arXiv Detail & Related papers (2022-02-14T11:37:20Z)
Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems [21.98135285833616]
Traditional goal-oriented dialogue systems rely on various components such as natural language understanding, dialogue state tracking, policy learning and response generation. We present a new approach for building goal-oriented dialogue systems that is scalable, as well as data efficient.
arXiv Detail & Related papers (2021-04-19T07:09:27Z)
Attention Guided Dialogue State Tracking with Sparse Supervision [5.758073912084366]
In call centers, for tasks like managing bookings or subscriptions, the user goal can be associated with actions issued by customer service agents. These action logs are available in large volumes and can be utilized for learning dialogue states. We extend a state-of-the-art encoder-decoder model to efficiently learn Dialogue State Tracking (DST) with sparse labels.
arXiv Detail & Related papers (2021-01-28T12:18:39Z)
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning [22.757971831442426]
Training belief trackers often requires expensive turn-level annotations of every user utterance. We propose a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables. We introduce LABES-S2S, which is a copy-augmented Seq2Seq model instantiation of LABES.
arXiv Detail & Related papers (2020-09-17T07:26:37Z)
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System [49.39150449455407]
HDNO is an option framework for designing latent dialogue acts to avoid designing specific dialogue act representations. We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, datasets of multi-domain dialogues, in comparison with a word-level E2E model trained with RL, LaRL and HDSA.
arXiv Detail & Related papers (2020-06-11T20:55:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.