Plug-and-Play Policy Planner for Large Language Model Powered Dialogue
Agents
- URL: http://arxiv.org/abs/2311.00262v2
- Date: Mon, 11 Mar 2024 08:30:31 GMT
- Title: Plug-and-Play Policy Planner for Large Language Model Powered Dialogue
Agents
- Authors: Yang Deng, Wenxuan Zhang, Wai Lam, See-Kiong Ng, Tat-Seng Chua
- Abstract summary: We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
- Score: 121.46051697742608
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proactive dialogues serve as a practical yet challenging dialogue problem in
the era of large language models (LLMs), where the dialogue policy planning is
the key to improving the proactivity of LLMs. Most existing studies enable the
dialogue policy planning of LLMs using various prompting schemes or iteratively
enhance this capability in handling the given case with verbal AI feedback.
However, these approaches are either bounded by the policy planning capability
of the frozen LLMs or hard to be transferred to new cases. In this work, we
introduce a new dialogue policy planning paradigm to strategize LLMs for
proactive dialogue problems with a tunable language model plug-in as a
plug-and-play dialogue policy planner, named PPDPP. Specifically, we develop a
novel training framework to facilitate supervised fine-tuning over available
human-annotated data as well as reinforcement learning from goal-oriented AI
feedback with dynamic interaction data collected by the LLM-based self-play
simulation. In this manner, the LLM-powered dialogue agent can not only be
generalized to different cases after the training, but also be applicable to
different applications by just substituting the learned plug-in. In addition,
we propose to evaluate the policy planning capability of dialogue systems under
the interactive setting. Experimental results demonstrate that PPDPP
consistently and substantially outperforms existing approaches on three
different proactive dialogue applications, including negotiation, emotional
support, and tutoring dialogues.
Related papers
- Planning with Large Language Models for Conversational Agents [51.12859325330882]
Controllability and proactivity are crucial properties of autonomous conversational agents (CAs)
We propose a new framework for planning-based conversational agents powered by large language models (LLMs)
Experiment results show that LLMs finetuned on PCA-D can significantly improve the performance and generalize to unseen domains.
arXiv Detail & Related papers (2024-07-04T12:23:02Z) - Unsupervised Extraction of Dialogue Policies from Conversations [3.102576158218633]
We show how Large Language Models can be instrumental in extracting dialogue policies from datasets.
We then propose a novel method for generating dialogue policies utilizing a controllable and interpretable graph-based methodology.
arXiv Detail & Related papers (2024-06-21T14:57:25Z) - Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z) - Self-Explanation Prompting Improves Dialogue Understanding in Large
Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs)
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z) - Prompting and Evaluating Large Language Models for Proactive Dialogues:
Clarification, Target-guided, and Non-collaboration [72.04629217161656]
This work focuses on three aspects of proactive dialogue systems: clarification, target-guided, and non-collaborative dialogues.
To trigger the proactivity of LLMs, we propose the Proactive Chain-of-Thought prompting scheme.
arXiv Detail & Related papers (2023-05-23T02:49:35Z) - "Think Before You Speak": Improving Multi-Action Dialog Policy by
Planning Single-Action Dialogs [33.78889030078026]
Multi-action dialog policy (MADP) generates multiple atomic dialog actions per turn.
We propose Planning Enhanced Dialog Policy (PEDP), a novel multi-task learning framework that learns single-action dialog dynamics.
Our fully supervised learning-based method achieves a solid task success rate of 90.6%, improving 3% compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-04-25T07:55:53Z) - Dialogue-oriented Pre-training [70.03028879331339]
We propose three strategies to simulate the conversation features on general plain text.
Dialog-PrLM is fine-tuned on three public multi-turn dialogue datasets.
arXiv Detail & Related papers (2021-06-01T12:02:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.