"Think Before You Speak": Improving Multi-Action Dialog Policy by
Planning Single-Action Dialogs
- URL: http://arxiv.org/abs/2204.11481v1
- Date: Mon, 25 Apr 2022 07:55:53 GMT
- Title: "Think Before You Speak": Improving Multi-Action Dialog Policy by
Planning Single-Action Dialogs
- Authors: Shuo Zhang, Junzhou Zhao, Pinghui Wang, Yu Li, Yi Huang, Junlan Feng
- Abstract summary: Multi-action dialog policy (MADP) generates multiple atomic dialog actions per turn.
We propose Planning Enhanced Dialog Policy (PEDP), a novel multi-task learning framework that learns single-action dialog dynamics.
Our fully supervised learning-based method achieves a solid task success rate of 90.6%, a 3% improvement over state-of-the-art methods.
- Score: 33.78889030078026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-action dialog policy (MADP), which generates multiple atomic dialog
actions per turn, has been widely applied in task-oriented dialog systems to
provide expressive and efficient system responses. Existing MADP models usually
imitate action combinations from the labeled multi-action dialog samples. Due
to data limitations, they generalize poorly toward unseen dialog flows. While
interactive learning and reinforcement learning algorithms can be applied to
incorporate external data sources of real users and user simulators, they take
significant manual effort to build and suffer from instability. To address
these issues, we propose Planning Enhanced Dialog Policy (PEDP), a novel
multi-task learning framework that learns single-action dialog dynamics to
enhance multi-action prediction. Our PEDP method employs model-based planning
for conceiving what to express before deciding the current response through
simulating single-action dialogs. Experimental results on the MultiWOZ dataset
demonstrate that our fully supervised learning-based method achieves a solid
task success rate of 90.6%, a 3% improvement over state-of-the-art methods.
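The planning idea in the abstract, first simulate single-action dialogs ("think"), then commit to a multi-action response ("speak"), can be sketched in a few lines. Everything below is illustrative: the action inventory, the toy single-action policy, the toy transition function, and the voting-based aggregation are stand-ins for PEDP's learned components, not the paper's actual model.

```python
import random

# Toy atomic dialog actions (illustrative; real MADP actions are
# domain-intent-slot triples from an ontology such as MultiWOZ's).
ACTIONS = ["inform_price", "request_area", "offer_hotel", "confirm_booking"]


def single_action_policy(state, rng):
    """Toy stand-in for a learned single-action dialog policy."""
    return rng.choice(ACTIONS)


def transition(state, action):
    """Toy stand-in for a learned world model: extend the simulated history."""
    return state + (action,)


def plan_multi_action(state, num_rollouts=8, horizon=3, seed=0):
    """PEDP-style sketch: simulate several single-action dialogs, then
    aggregate the atomic actions they propose into one multi-action response."""
    rng = random.Random(seed)
    votes = {}
    for _ in range(num_rollouts):
        s = state
        for _ in range(horizon):
            a = single_action_policy(s, rng)
            votes[a] = votes.get(a, 0) + 1
            s = transition(s, a)
    # Keep actions proposed in at least a quarter of the simulated steps.
    threshold = num_rollouts * horizon / 4
    return sorted(a for a, v in votes.items() if v >= threshold)


actions = plan_multi_action(("greet",))
```

The aggregation step (majority voting over rollouts) is one plausible way to turn single-action simulations into a multi-action prediction; the paper's multi-task learning framework learns this combination instead.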
Related papers
- Unsupervised Extraction of Dialogue Policies from Conversations [3.102576158218633]
We show how Large Language Models can be instrumental in extracting dialogue policies from datasets.
We then propose a novel method for generating dialogue policies utilizing a controllable and interpretable graph-based methodology.
arXiv Detail & Related papers (2024-06-21T14:57:25Z)
- DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever [83.33209603041013]
We propose a parameter-efficient prompt-tuning method named DialCLIP for multi-modal dialog retrieval.
Our approach introduces a multi-modal context generator to learn context features which are distilled into prompts within the pre-trained vision-language model CLIP.
To facilitate various types of retrieval, we also design multiple experts to learn mappings from CLIP outputs to multi-modal representation space.
arXiv Detail & Related papers (2024-01-02T07:40:12Z)
- Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents [121.46051697742608]
We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
arXiv Detail & Related papers (2023-11-01T03:20:16Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
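The strategy this entry summarizes, having the model explain each utterance before performing the task, can be mocked up as a simple prompt builder. The helper name and template wording below are illustrative assumptions, not taken from the paper.

```python
def build_self_explanation_prompt(dialogue, task_instruction):
    """Assemble a self-explanation-style prompt: ask the model to first
    explain each utterance, then perform the downstream task.
    The template wording is illustrative, not the paper's."""
    lines = ["Dialogue:"]
    for i, (speaker, text) in enumerate(dialogue, 1):
        lines.append(f"{i}. {speaker}: {text}")
    lines.append("")
    lines.append("First, briefly explain the intent of each utterance above.")
    lines.append(f"Then, {task_instruction}")
    return "\n".join(lines)


prompt = build_self_explanation_prompt(
    [("User", "I need a cheap hotel in the centre."),
     ("System", "Alexander B&B is a budget option in the centre.")],
    "state which hotel the system offered.",
)
```

Because the approach is task-agnostic, only the final `task_instruction` changes across dialogue-centric tasks; the self-explanation step stays fixed.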
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
- JoTR: A Joint Transformer and Reinforcement Learning Framework for Dialog Policy Learning [53.83063435640911]
Dialogue policy learning (DPL) is a crucial component of dialogue modelling.
We introduce a novel framework, JoTR, to generate flexible dialogue actions.
Unlike traditional methods, JoTR formulates a word-level policy that allows for a more dynamic and adaptable dialogue action generation.
arXiv Detail & Related papers (2023-09-01T03:19:53Z)
- Leveraging Explicit Procedural Instructions for Data-Efficient Action Prediction [5.448684866061922]
Task-oriented dialogues often require agents to enact complex, multi-step procedures in order to meet user requests.
Large language models have found success automating these dialogues in constrained environments, but their widespread deployment is limited by the substantial quantities of task-specific data required for training.
This paper presents a data-efficient solution to constructing dialogue systems, leveraging explicit instructions derived from agent guidelines.
arXiv Detail & Related papers (2023-06-06T18:42:08Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- EM Pre-training for Multi-party Dialogue Response Generation [86.25289241604199]
In multi-party dialogues, the addressee of a response utterance should be specified before it is generated.
We propose an Expectation-Maximization (EM) approach that iteratively performs the expectation steps to generate addressee labels.
arXiv Detail & Related papers (2023-05-21T09:22:41Z)
- GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection [36.77204909711832]
We propose a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora.
Specifically, we introduce a dialog act prediction task for policy optimization during pre-training and employ a consistency regularization term to refine the learned representation.
Empirical results show that GALAXY substantially improves the performance of task-oriented dialog systems.
arXiv Detail & Related papers (2021-11-29T15:24:36Z)
- Retrieve & Memorize: Dialog Policy Learning with Multi-Action Memory [13.469140432108151]
We propose a retrieve-and-memorize framework to enhance the learning of system actions.
We use a memory-augmented multi-decoder network to generate the system actions conditioned on the candidate actions.
Our method achieves competitive performance among several state-of-the-art models in the context-to-response generation task.
arXiv Detail & Related papers (2021-06-04T07:53:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.