Distributed Structured Actor-Critic Reinforcement Learning for Universal
Dialogue Management
- URL: http://arxiv.org/abs/2009.10326v1
- Date: Tue, 22 Sep 2020 05:39:31 GMT
- Title: Distributed Structured Actor-Critic Reinforcement Learning for Universal
Dialogue Management
- Authors: Zhi Chen, Lu Chen, Xiaoyuan Liu, and Kai Yu
- Abstract summary: We focus on devising a policy that chooses which dialogue action to take in response to the user.
The sequential system decision-making process can be abstracted into a partially observable Markov decision process.
In the past few years, many deep reinforcement learning (DRL) algorithms, which use neural networks (NN) as function approximators, have been investigated for dialogue policy.
- Score: 29.57382819573169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task-oriented spoken dialogue system (SDS) aims to assist a human user in
accomplishing a specific task (e.g., hotel booking). Dialogue management is
a core part of an SDS. It has two main tasks:
dialogue belief state tracking (summarising the conversation history) and dialogue
decision-making (deciding how to reply to the user). In this work, we focus only
on devising a policy that chooses which dialogue action to take in response to the
user. The sequential system decision-making process can be abstracted into a
partially observable Markov decision process (POMDP). Under this framework,
reinforcement learning approaches can be used for automated policy
optimization. In the past few years, many deep reinforcement learning
(DRL) algorithms, which use neural networks (NN) as function approximators,
have been investigated for dialogue policy.
Related papers
- OmniDialog: An Omnipotent Pre-training Model for Task-Oriented Dialogue
System [43.92593448255296]
We propose an Omnipotent Dialogue pre-training model (OmniDialog).
It unifies three dialogue tasks into a monolithic framework by multi-task learning, fostering inter-task communication.
We evaluate its performance across four tasks: dialogue summarization, end-to-end dialogue modeling, dialogue state tracking, and intent classification.
arXiv Detail & Related papers (2023-12-28T07:20:49Z)
- Plug-and-Play Policy Planner for Large Language Model Powered Dialogue
Agents [121.46051697742608]
We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
arXiv Detail & Related papers (2023-11-01T03:20:16Z)
- Why Guided Dialog Policy Learning performs well? Understanding the role
of adversarial learning and its alternative [0.44267358790081573]
In recent years, reinforcement learning has emerged as a promising option for dialog policy learning (DPL).
One way to estimate rewards from collected data is to train the reward estimator and dialog policy simultaneously using adversarial learning (AL).
This paper identifies the role of AL in DPL through detailed analyses of the objective functions of dialog policy and reward estimator.
We propose a method that eliminates AL from reward estimation and DPL while retaining its advantages.
arXiv Detail & Related papers (2023-07-13T12:29:29Z)
- Dialog-to-Actions: Building Task-Oriented Dialogue System via
Action-Level Generation [7.110201160927713]
We propose a task-oriented dialogue system via action-level generation.
Specifically, we first construct dialogue actions from large-scale dialogues and represent each natural language (NL) response as a sequence of dialogue actions.
We train a Sequence-to-Sequence model which takes the dialogue history as input and outputs a sequence of dialogue actions.
arXiv Detail & Related papers (2023-04-03T11:09:20Z)
- User Satisfaction Estimation with Sequential Dialogue Act Modeling in
Goal-oriented Conversational Systems [65.88679683468143]
We propose a novel framework, namely USDA, to incorporate the sequential dynamics of dialogue acts for predicting user satisfaction.
USDA incorporates the sequential transitions of both content and act features in the dialogue to predict the user satisfaction.
Experimental results on four benchmark goal-oriented dialogue datasets show that the proposed method substantially and consistently outperforms existing methods on user satisfaction estimation (USE).
arXiv Detail & Related papers (2022-02-07T02:50:07Z)
- UniDS: A Unified Dialogue System for Chit-Chat and Task-oriented
Dialogues [59.499965460525694]
We propose a unified dialogue system (UniDS) with the two aforementioned skills.
We design a unified dialogue data schema, compatible with both chit-chat and task-oriented dialogues.
We train UniDS with mixed dialogue data from a pretrained chit-chat dialogue model.
arXiv Detail & Related papers (2021-10-15T11:56:47Z)
- Integrating Pre-trained Model into Rule-based Dialogue Management [32.90885176553305]
Rule-based dialogue management is still the most popular solution for industrial task-oriented dialogue systems.
Data-driven dialogue systems, usually with end-to-end structures, are popular in academic research.
We propose a method to leverage the strength of both rule-based and data-driven dialogue managers.
arXiv Detail & Related papers (2021-02-17T03:44:22Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually with reasoning over dialogue turns with the help of the back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)
- A Survey on Dialog Management: Recent Advances and Challenges [72.52920723074638]
Dialog management (DM) is a crucial component in a task-oriented dialog system.
This paper surveys recent advances and challenges within three critical topics for DM: (1) improving model scalability to facilitate dialog system modeling in new scenarios, (2) dealing with the data scarcity problem for dialog policy learning, and (3) enhancing the training efficiency to achieve better task-completion performance.
arXiv Detail & Related papers (2020-05-05T14:31:24Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented
Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.