Related papers: Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning

URL: http://arxiv.org/abs/2305.13660v2
Date: Thu, 19 Oct 2023 22:31:18 GMT
Title: Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning
Authors: Xiao Yu, Maximillian Chen, Zhou Yu
Abstract summary: GDP-Zero is an approach using Open-Loop MCTS to perform goal-oriented dialogue policy planning without any model training. We evaluate GDP-Zero on the goal-oriented task PersuasionForGood, and find that its responses are preferred over ChatGPT up to 59.32% of the time.
Score: 22.753613264491918
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Planning for goal-oriented dialogue often requires simulating future dialogue interactions and estimating task progress. Many approaches thus consider training neural networks to perform look-ahead search algorithms such as A* search and Monte Carlo Tree Search (MCTS). However, this training often requires abundant annotated data, which creates challenges when faced with noisy annotations or low-resource settings. We introduce GDP-Zero, an approach using Open-Loop MCTS to perform goal-oriented dialogue policy planning without any model training. GDP-Zero prompts a large language model to act as a policy prior, value function, user simulator, and system model during the tree search. We evaluate GDP-Zero on the goal-oriented task PersuasionForGood, and find that its responses are preferred over ChatGPT up to 59.32% of the time, and are rated more persuasive than ChatGPT during interactive evaluations.
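The search loop described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the four LLM roles are replaced by hypothetical toy functions (`policy_prior`, `system_model`, `user_simulator`, `value_function`), the dialogue-act list `ACTIONS` is invented for illustration, and the AlphaZero-style PUCT selection rule is an assumption. What the sketch does show is the open-loop idea: tree nodes are indexed by the sequence of dialogue acts only, and the utterances are re-sampled on every pass through the tree instead of being stored as fixed states.

```python
import math
import random

# Hypothetical dialogue acts, chosen for illustration only.
ACTIONS = ["greeting", "logical_appeal", "emotional_appeal", "propose_donation"]

# Toy stand-ins for the four roles that the paper fills by prompting an LLM.
def policy_prior(history):
    """Prior over next dialogue acts (uniform here; assumption)."""
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}

def system_model(history, action, rng):
    """Realize a dialogue act as a system utterance (placeholder text)."""
    return f"sys[{action}]"

def user_simulator(history, rng):
    """Sample a stochastic user reply."""
    return rng.choice(["user[ok]", "user[not sure]", "user[no]"])

def value_function(history):
    """Toy estimate of task progress, clipped to [-1, 1]."""
    score = history.count("user[ok]") - history.count("user[no]")
    return max(-1.0, min(1.0, score / 3))

class Node:
    """A node identified only by the action sequence leading to it."""
    def __init__(self, prior):
        self.prior = prior
        self.visits = 0
        self.total_value = 0.0
        self.children = {}  # action -> Node

    def q(self):
        return self.total_value / self.visits if self.visits else 0.0

def select_action(node, c_puct):
    """PUCT-style selection (an assumed rule, not taken from the source)."""
    def score(item):
        _, child = item
        bonus = c_puct * child.prior * math.sqrt(node.visits + 1) / (1 + child.visits)
        return child.q() + bonus
    return max(node.children.items(), key=score)[0]

def simulate(root, history, rng, c_puct=1.0, max_depth=3):
    node, path, hist = root, [root], list(history)
    for _ in range(max_depth):
        if not node.children:  # leaf: expand with the policy prior and stop
            for action, p in policy_prior(hist).items():
                node.children[action] = Node(p)
            break
        action = select_action(node, c_puct)
        # Open-loop step: utterances are re-sampled on every visit.
        hist.append(system_model(hist, action, rng))
        hist.append(user_simulator(hist, rng))
        node = node.children[action]
        path.append(node)
    value = value_function(hist)
    for visited in path:  # back up the value estimate along the path
        visited.visits += 1
        visited.total_value += value

def plan(history, n_simulations=50, seed=0):
    """Run the search and return the most-visited first action and the root."""
    rng = random.Random(seed)
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        simulate(root, history, rng)
    best = max(root.children.items(), key=lambda kv: kv[1].visits)[0]
    return best, root

if __name__ == "__main__":
    best_action, _ = plan(["user[hello]"])
    print("chosen dialogue act:", best_action)
```

In a real setting each toy function would be replaced by a prompted LLM call, which makes every simulation considerably more expensive than in this sketch.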

Related papers

Towards Zero-Shot, Controllable Dialog Planning with LLMs [28.392036110582723]
Large Language Models (LLMs) have emerged as an alternative to training task-specific dialog agents. This paper introduces a novel zero-shot method for controllable Conversational Tree Search (CTS) agents.
arXiv Detail & Related papers (2024-10-08T08:51:44Z)
Planning with Large Language Models for Conversational Agents [51.12859325330882]
Controllability and proactivity are crucial properties of autonomous conversational agents (CAs). We propose a new framework for planning-based conversational agents powered by large language models (LLMs). Experimental results show that LLMs finetuned on PCA-D can significantly improve the performance and generalize to unseen domains.
arXiv Detail & Related papers (2024-07-04T12:23:02Z)
Response Enhanced Semi-supervised Dialogue Query Generation [40.17161986495854]
We propose a semi-supervised learning framework -- SemiDQG -- to improve model performance with unlabeled conversations. We first apply a similarity-based query selection strategy to select high-quality RA-generated pseudo queries. We adopt the REINFORCE algorithm to further enhance QP, with RA-provided rewards as fine-grained training signals.
arXiv Detail & Related papers (2023-12-20T02:19:54Z)
A Preliminary Evaluation of ChatGPT for Zero-shot Dialogue Understanding [55.37338324658501]
Zero-shot dialogue understanding aims to enable dialogue to track the user's needs without any training data. In this work, we investigate the understanding ability of ChatGPT for zero-shot dialogue understanding tasks.
arXiv Detail & Related papers (2023-04-09T15:28:36Z)
Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial. We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
KILDST: Effective Knowledge-Integrated Learning for Dialogue State Tracking using Gazetteer and Speaker Information [3.342637296393915]
Dialogue State Tracking (DST) is core research in dialogue systems and has received much attention. It is necessary to define a new problem that can deal with dialogue between users as a step toward conversational AI that extracts and recommends information from the dialogue between users. We introduce a new task: DST from dialogue between users about scheduling an event (DST-S). The DST-S task is much more challenging since it requires the model to understand and track the dialogue between users and to understand who suggested the schedule and who agreed to the proposed schedule.
arXiv Detail & Related papers (2023-01-18T07:11:56Z)
Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator [37.590563896382456]
We propose an interactive evaluation framework for Task-Oriented Dialogue (TOD) systems. We first build a goal-oriented user simulator based on pre-trained models and then use the user simulator to interact with the dialogue system to generate dialogues. Experimental results show that RL-based TOD systems trained by our proposed user simulator can achieve nearly 98% inform and success rates.
arXiv Detail & Related papers (2022-10-26T07:41:32Z)
GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog. We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups. A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI. The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL). We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z)
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System [49.39150449455407]
HDNO is an option framework for designing latent dialogue acts to avoid designing specific dialogue act representations. We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, two datasets of multi-domain dialogues, in comparison with a word-level E2E model trained with RL, LaRL and HDSA.
arXiv Detail & Related papers (2020-06-11T20:55:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.