Planning Like Human: A Dual-process Framework for Dialogue Planning
- URL: http://arxiv.org/abs/2406.05374v1
- Date: Sat, 8 Jun 2024 06:52:47 GMT
- Title: Planning Like Human: A Dual-process Framework for Dialogue Planning
- Authors: Tao He, Lizi Liao, Yixin Cao, Yuanxing Liu, Ming Liu, Zerui Chen, Bing Qin,
- Abstract summary: We propose the Dual-Process Dialogue Planning framework to enhance dialogue planning in Large Language Models (LLMs)
Inspired by the dualprocess theory in psychology, we propose the framework, which embodies two modes of thinking: intuitive (fast) and analytical (slow)
Our empirical evaluations affirm DPDP's superiority in achieving both high-quality dialogues and operational efficiency, outpacing existing methods.
- Score: 31.995557540062553
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In proactive dialogue, the challenge lies not just in generating responses but in steering conversations toward predetermined goals, a task where Large Language Models (LLMs) typically struggle due to their reactive nature. Traditional approaches to enhance dialogue planning in LLMs, ranging from elaborate prompt engineering to the integration of policy networks, either face efficiency issues or deliver suboptimal performance. Inspired by the dualprocess theory in psychology, which identifies two distinct modes of thinking - intuitive (fast) and analytical (slow), we propose the Dual-Process Dialogue Planning (DPDP) framework. DPDP embodies this theory through two complementary planning systems: an instinctive policy model for familiar contexts and a deliberative Monte Carlo Tree Search (MCTS) mechanism for complex, novel scenarios. This dual strategy is further coupled with a novel two-stage training regimen: offline Reinforcement Learning for robust initial policy model formation followed by MCTS-enhanced on-the-fly learning, which ensures a dynamic balance between efficiency and strategic depth. Our empirical evaluations across diverse dialogue tasks affirm DPDP's superiority in achieving both high-quality dialogues and operational efficiency, outpacing existing methods.
Related papers
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Strength Lies in Differences! Improving Strategy Planning for Non-collaborative Dialogues via Diversified User Simulation [69.5677514160986]
We investigate non-collaborative dialogue agents, which are expected to engage in strategic conversations with diverse users.
This poses two main challenges for existing dialogue agents.
We propose Trip to enhance the capability in tailored strategic planning, incorporating a user-aware strategic planning module and a population-based training paradigm.
arXiv Detail & Related papers (2024-03-11T14:38:16Z) - Target-constrained Bidirectional Planning for Generation of
Target-oriented Proactive Dialogue [11.338393954848632]
We focus on effective dialogue planning for target-oriented dialogue generation.
Inspired by decision-making theories in cognitive science, we propose a novel target-constrained bidirectional planning approach.
Our algorithms significantly outperform various baseline models.
arXiv Detail & Related papers (2024-03-10T02:14:24Z) - Plug-and-Play Policy Planner for Large Language Model Powered Dialogue
Agents [121.46051697742608]
We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
arXiv Detail & Related papers (2023-11-01T03:20:16Z) - Self-Explanation Prompting Improves Dialogue Understanding in Large
Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs)
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z) - JoTR: A Joint Transformer and Reinforcement Learning Framework for
Dialog Policy Learning [53.83063435640911]
Dialogue policy learning (DPL) is a crucial component of dialogue modelling.
We introduce a novel framework, JoTR, to generate flexible dialogue actions.
Unlike traditional methods, JoTR formulates a word-level policy that allows for a more dynamic and adaptable dialogue action generation.
arXiv Detail & Related papers (2023-09-01T03:19:53Z) - Multi-Stage Coarse-to-Fine Contrastive Learning for Conversation Intent
Induction [34.25242109800481]
This paper presents our solution to Track 2 of Intent Induction from Conversations for Task-Oriented Dialogue at the Eleventh Dialogue System Technology Challenge (DSTC11)
The essence of intention clustering lies in distinguishing the representation of different dialogue utterances.
In the released DSTC11 evaluation results, our proposed system ranked first on both of the two subtasks of this Track.
arXiv Detail & Related papers (2023-03-09T04:51:27Z) - Modelling Hierarchical Structure between Dialogue Policy and Natural
Language Generator with Option Framework for Task-oriented Dialogue System [49.39150449455407]
HDNO is an option framework for designing latent dialogue acts to avoid designing specific dialogue act representations.
We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, the datasets on multi-domain dialogues, in comparison with word-level E2E model trained with RL, LaRL and HDSA.
arXiv Detail & Related papers (2020-06-11T20:55:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.