A Survey on Recent Advances and Challenges in Reinforcement
Learning Methods for Task-Oriented Dialogue Policy Learning
- URL: http://arxiv.org/abs/2202.13675v1
- Date: Mon, 28 Feb 2022 10:50:22 GMT
- Title: A Survey on Recent Advances and Challenges in Reinforcement
Learning Methods for Task-Oriented Dialogue Policy Learning
- Authors: Wai-Chung Kwan, Hongru Wang, Huimin Wang, Kam-Fai Wong
- Abstract summary: Reinforcement Learning (RL) is commonly chosen to learn the dialogue policy, regarding the user as the environment and the system as the agent.
In this paper, we survey recent advances and challenges in dialogue policy from the perspective of RL.
We provide a comprehensive survey of applying RL to dialogue policy learning by categorizing recent methods into basic elements in RL.
- Score: 16.545577313042827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dialogue Policy Learning is a key component in a task-oriented dialogue
system (TDS) that decides the next action of the system given the dialogue
state at each turn. Reinforcement Learning (RL) is commonly chosen to learn the
dialogue policy, regarding the user as the environment and the system as the
agent. Many benchmark datasets and algorithms have been created to facilitate
the development and evaluation of dialogue policy based on RL. In this paper,
we survey recent advances and challenges in dialogue policy from the
perspective of RL. More specifically, we identify the major problems and
summarize corresponding solutions for RL-based dialogue policy learning.
In addition, we provide a comprehensive survey of applying RL to dialogue policy
learning by categorizing recent methods into basic elements in RL. We believe
this survey can shed light on future research in dialogue management.
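The abstract frames the user as the environment and the system as the agent, with the policy choosing the next system action from the dialogue state at each turn. The following is a minimal sketch of that framing, not the survey's method: the two-slot booking task, the `SimulatedUser` class, the reward values, and the choice of tabular Q-learning are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy task: fill two slots, then book.
SLOTS = ("cuisine", "area")
ACTIONS = ("request_cuisine", "request_area", "book")


class SimulatedUser:
    """Environment: a scripted user who always answers slot requests."""

    def reset(self):
        self.filled = set()
        return self._state()

    def _state(self):
        # Dialogue state: one boolean per slot, True once the slot is filled.
        return tuple(slot in self.filled for slot in SLOTS)

    def step(self, action):
        """Apply a system action; return (next_state, reward, done)."""
        if action == "book":
            success = len(self.filled) == len(SLOTS)
            # Assumed rewards: +10 for a successful booking, -10 otherwise.
            return self._state(), (10.0 if success else -10.0), True
        # A request action fills the named slot; each turn costs -1.
        self.filled.add(action.split("_", 1)[1])
        return self._state(), -1.0, False


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Learn Q(state, action) with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    env = SimulatedUser()
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = env.step(action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


def greedy_action(q, state):
    """Dialogue policy: pick the highest-valued system action for a state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q_table = train()
    for state in [(False, False), (True, False), (False, True), (True, True)]:
        print(state, "->", greedy_action(q_table, state))
```

A scripted user keeps the sketch self-contained. In practice the environment would be a user simulator or real users, and the state and action spaces would be far too large for a lookup table.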
Related papers
- A Survey on Recent Advances in LLM-Based Multi-turn Dialogue Systems [12.999001024463453]
This paper aims to give a summary of existing LLMs and approaches for adapting LLMs to downstream tasks.
It elaborates recent advances in multi-turn dialogue systems, covering both LLM-based open-domain dialogue (ODD) and task-oriented dialogue (TOD) systems.
arXiv Detail & Related papers (2024-02-28T03:16:44Z)
- Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents [121.46051697742608]
We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
arXiv Detail & Related papers (2023-11-01T03:20:16Z)
- Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative [0.44267358790081573]
In recent years, reinforcement learning has emerged as a promising option for dialog policy learning (DPL).
One way to estimate rewards from collected data is to train the reward estimator and dialog policy simultaneously using adversarial learning (AL).
This paper identifies the role of AL in DPL through detailed analyses of the objective functions of dialog policy and reward estimator.
We propose a method that eliminates AL from reward estimation and DPL while retaining its advantages.
arXiv Detail & Related papers (2023-07-13T12:29:29Z)
- Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration [72.04629217161656]
This work focuses on three aspects of proactive dialogue systems: clarification, target-guided, and non-collaborative dialogues.
To trigger the proactivity of LLMs, we propose the Proactive Chain-of-Thought prompting scheme.
arXiv Detail & Related papers (2023-05-23T02:49:35Z)
- A Survey on Proactive Dialogue Systems: Problems, Methods, and Prospects [100.75759050696355]
We provide a comprehensive overview of the prominent problems and advanced designs for conversational agent's proactivity in different types of dialogues.
We discuss challenges that meet the real-world application needs but require a greater research focus in the future.
arXiv Detail & Related papers (2023-05-04T11:38:49Z)
- Distributed Structured Actor-Critic Reinforcement Learning for Universal Dialogue Management [29.57382819573169]
We focus on devising a policy that chooses which dialogue action to take in response to the user.
The sequential system decision-making process can be abstracted into a partially observable Markov decision process.
In the past few years, many deep reinforcement learning (DRL) algorithms have emerged that use neural networks (NN) as function approximators.
arXiv Detail & Related papers (2020-09-22T05:39:31Z)
- A Survey on Dialog Management: Recent Advances and Challenges [72.52920723074638]
Dialog management (DM) is a crucial component in a task-oriented dialog system.
This paper surveys recent advances and challenges within three critical topics for DM: (1) improving model scalability to facilitate dialog system modeling in new scenarios, (2) dealing with the data scarcity problem for dialog policy learning, and (3) enhancing the training efficiency to achieve better task-completion performance.
arXiv Detail & Related papers (2020-05-05T14:31:24Z)
- Guided Dialog Policy Learning without Adversarial Learning in the Loop [103.20723982440788]
A number of adversarial learning methods have been proposed to learn the reward function together with the dialogue policy.
We propose to decompose the adversarial training into two steps.
First, we train the discriminator with an auxiliary dialogue generator and then incorporate a derived reward model into a common RL method to guide the dialogue policy learning.
arXiv Detail & Related papers (2020-04-07T11:03:17Z)
- Recent Advances and Challenges in Task-oriented Dialog System [63.82055978899631]
Task-oriented dialog systems are attracting more and more attention in academic and industrial communities.
We discuss three critical topics for task-oriented dialog systems: (1) improving data efficiency to facilitate dialog modeling in low-resource settings, (2) modeling multi-turn dynamics for dialog policy learning, and (3) integrating domain knowledge into the dialog model.
arXiv Detail & Related papers (2020-03-17T01:34:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.