GoChat: Goal-oriented Chatbots with Hierarchical Reinforcement Learning
- URL: http://arxiv.org/abs/2005.11729v2
- Date: Tue, 26 May 2020 04:03:33 GMT
- Title: GoChat: Goal-oriented Chatbots with Hierarchical Reinforcement Learning
- Authors: Jianfeng Liu, Feiyang Pan, Ling Luo
- Abstract summary: GoChat is a framework for end-to-end training to maximize the long-term return from offline multi-turn dialogue datasets.
Our framework utilizes hierarchical reinforcement learning (HRL), where the high-level policy guides the conversation towards the final goal.
In our experiments on a real-world dialogue dataset for anti-fraud in finance, our approach outperforms previous methods in both the quality of response generation and the success rate of accomplishing the goal.
- Score: 10.514163160735926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A chatbot that converses like a human should be goal-oriented (i.e., be
purposeful in conversation), which is beyond language generation. However,
existing dialogue systems often heavily rely on cumbersome hand-crafted rules
or costly labelled datasets to reach the goals. In this paper, we propose
Goal-oriented Chatbots (GoChat), a framework for end-to-end training chatbots
to maximize the long-term return from offline multi-turn dialogue datasets. Our
framework utilizes hierarchical reinforcement learning (HRL), where the
high-level policy guides the conversation towards the final goal by determining
some sub-goals, and the low-level policy fulfills the sub-goals by generating
the corresponding utterance for response. In our experiments on a real-world
dialogue dataset for anti-fraud in finance, our approach outperforms previous
methods in both the quality of response generation and the success rate of
accomplishing the goal.
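The hierarchy described in the abstract can be illustrated with a minimal sketch: a high-level policy selects a sub-goal from the dialogue state each turn, and a low-level policy produces an utterance conditioned on that sub-goal. The class names, sub-goal set, and template-based generation below are hypothetical illustrations, not the paper's actual implementation, which trains both policies with RL on offline dialogue data.

```python
import random

# Hypothetical sub-goal vocabulary for an anti-fraud setting (illustrative only).
SUB_GOALS = ["build_rapport", "probe_details", "warn_user", "close_conversation"]


class HighLevelPolicy:
    """Chooses a sub-goal for the current turn from the dialogue history.

    In the GoChat framework this policy would be trained with RL to
    maximize long-term return; here a fixed heuristic stands in for it.
    """

    def select_sub_goal(self, dialogue_history):
        if not dialogue_history:
            return "build_rapport"
        if len(dialogue_history) >= 6:
            return "close_conversation"
        return random.choice(["probe_details", "warn_user"])


class LowLevelPolicy:
    """Generates an utterance conditioned on the chosen sub-goal.

    A real system would use a learned sequence generator; templates
    stand in for it here.
    """

    TEMPLATES = {
        "build_rapport": "Hello! How can I help you today?",
        "probe_details": "Could you tell me more about that request?",
        "warn_user": "Please be careful: this looks like a common scam pattern.",
        "close_conversation": "Glad I could help. Stay safe!",
    }

    def generate(self, sub_goal, dialogue_history):
        return self.TEMPLATES[sub_goal]


def chatbot_turn(high, low, history):
    """One turn: high-level policy picks a sub-goal, low-level fulfills it."""
    sub_goal = high.select_sub_goal(history)
    utterance = low.generate(sub_goal, history)
    history.append(utterance)
    return sub_goal, utterance
```

The key design point this sketch captures is the decoupling: the high-level policy reasons over coarse conversational strategy, while the low-level policy handles surface realization, so credit for reaching the final goal can be assigned at the sub-goal level rather than per token.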
Related papers
- ChatWise: A Strategy-Guided Chatbot for Enhancing Cognitive Support in Older Adults [38.064067293831066]
We propose a strategy-guided AI chatbot named ChatWise that follows a dual-level conversation reasoning framework.
It integrates macro-level strategy planning and micro-level utterance generation to enable engaging, multi-turn dialogue tailored to older adults.
arXiv Detail & Related papers (2025-02-19T21:32:09Z)
- Goal Inference from Open-Ended Dialog [6.21910767424247]
We present an online method for embodied agents to learn and accomplish diverse user goals.
We extract natural language goal representations from conversations with Large Language Models.
As a result, our method can represent uncertainty over complex goals based on unrestricted dialog.
arXiv Detail & Related papers (2024-10-17T18:30:52Z)
- LLM Roleplay: Simulating Human-Chatbot Interaction [52.03241266241294]
We propose a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction.
Our method can simulate human-chatbot dialogues with a high indistinguishability rate.
arXiv Detail & Related papers (2024-07-04T14:49:46Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue [9.99763097964222]
Goal-directed dialogue systems aim to proactively reach a pre-determined target through multi-turn conversations.
Key to achieving this task lies in planning dialogue paths that smoothly and coherently direct conversations towards the target.
We propose a coherent dialogue planning approach that uses a Brownian bridge stochastic process to model the temporal dynamics of dialogue paths.
arXiv Detail & Related papers (2023-05-09T09:28:23Z)
- Improving a sequence-to-sequence nlp model using a reinforcement learning policy algorithm [0.0]
Current neural network models of dialogue generation show great promise for generating answers for conversational agents.
But they are short-sighted in that they predict utterances one at a time while disregarding their impact on future outcomes.
This work represents a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
arXiv Detail & Related papers (2022-12-28T22:46:57Z)
- Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation [32.764356638437214]
We introduce a new technique for target-guided response generation.
We also propose techniques to re-purpose existing dialogue datasets for target-guided generation.
Our work generally enables dialogue system designers to exercise more control over the conversations that their systems produce.
arXiv Detail & Related papers (2022-05-19T04:01:40Z)
- CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning [85.3987745097806]
Offline reinforcement learning can be used to train dialogue agents entirely from static datasets collected from human speakers.
Experiments show that recently developed offline RL methods can be combined with language models to yield realistic dialogue agents.
arXiv Detail & Related papers (2022-04-18T17:43:21Z)
- Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519]
This paper proposes an innovative framework to train chatbots to possess human-like intentions.
Our framework includes a guiding robot and an interlocutor model that plays the role of humans.
We examined our framework using three experimental setups and evaluated the guiding robot with four different metrics to demonstrate its flexibility and performance advantages.
arXiv Detail & Related papers (2021-03-30T15:24:37Z)
- I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents [69.68400056148336]
We train a goal-oriented model with reinforcement learning against an imitation-learned "chit-chat" model.
We show that both models outperform an inverse model baseline and can converse naturally with their dialogue partner in order to achieve goals.
arXiv Detail & Related papers (2020-02-07T16:22:36Z)
- Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation [79.7781436501706]
We propose a structured approach that controls the intended content of system responses by introducing coarse-grained keywords.
We also propose a novel dual discourse-level target-guided strategy to guide conversations to reach their goals smoothly with higher success rate.
arXiv Detail & Related papers (2020-02-04T09:49:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.