CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2204.08426v1
- Date: Mon, 18 Apr 2022 17:43:21 GMT
- Title: CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning
- Authors: Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine
- Abstract summary: Offline reinforcement learning can be used to train dialogue agents entirely using static datasets collected from human speakers.
Experiments show that recently developed offline RL methods can be combined with language models to yield realistic dialogue agents.
- Score: 85.3987745097806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventionally, generation of natural language for dialogue agents may be
viewed as a statistical learning problem: determine the patterns in
human-provided data and generate appropriate responses with similar statistical
properties. However, dialogue can also be regarded as a goal-directed process,
where speakers attempt to accomplish a specific task. Reinforcement learning
(RL) algorithms are designed specifically for solving such goal-directed
problems, but the most direct way to apply RL -- through trial-and-error
learning in human conversations -- is costly. In this paper, we study how
offline reinforcement learning can instead be used to train dialogue agents
entirely using static datasets collected from human speakers. Our experiments
show that recently developed offline RL methods can be combined with language
models to yield realistic dialogue agents that better accomplish task goals.
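As a minimal sketch of this recipe (not the paper's actual code): one way to combine an off-the-shelf language model with an offline-trained Q-function is to let the LM propose candidate responses and let the Q-function pick the candidate expected to best advance the task. GPT-2, the tiny `QNet` encoder, and all sizes below are illustrative assumptions:

```python
# Hedged sketch: offline-RL response selection on top of a language model.
# A pretrained LM proposes candidate utterances; a Q-function trained
# offline on static human dialogues re-ranks them.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
tok.pad_token = tok.eos_token


class QNet(nn.Module):
    """Scores (dialogue history, candidate response) pairs."""

    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim)  # simple bag-of-tokens encoder
        self.head = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, hist_ids, cand_ids):
        h, c = self.emb(hist_ids), self.emb(cand_ids)
        return self.head(torch.cat([h, c], dim=-1)).squeeze(-1)


q_net = QNet(len(tok))  # in practice: trained offline, e.g. with a conservative objective


def respond(history: str, k: int = 8) -> str:
    """Sample k candidate replies from the LM; return the highest-Q one."""
    ids = tok(history, return_tensors="pt").input_ids
    outs = lm.generate(ids, do_sample=True, top_p=0.9, max_new_tokens=40,
                       num_return_sequences=k, pad_token_id=tok.eos_token_id)
    cands = [tok.decode(o[ids.shape[1]:], skip_special_tokens=True) for o in outs]
    with torch.no_grad():
        scores = torch.stack(
            [q_net(ids, tok(c or " ", return_tensors="pt").input_ids) for c in cands]
        )
    return cands[int(scores.argmax())]
```

In a real system the Q-function would be trained on the static human dialogues with an offline RL objective rather than initialized randomly as here.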
Related papers
- Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion.
We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations.
Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
- CloneBot: Personalized Dialogue-Response Predictions [0.0]
The project task was to create a model that, given a speaker ID, chat history, and an utterance query, can predict the response utterance in a conversation.
The model is personalized for each speaker. Such a model can be a useful building block for speech bots that talk in a human-like manner in a live conversation.
arXiv Detail & Related papers (2021-03-31T01:15:37Z)
- Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning [8.744026064255337]
We propose a novel framework, Automatic Curriculum Learning-based Deep Q-Network (ACL-DQN), to realize the dialogue policy for automatic curriculum learning.
The teacher model arranges a meaningful ordered curriculum and automatically adjusts it by monitoring the learning progress of the dialogue agent.
Experiments show that ACL-DQN improves the effectiveness and stability of dialogue policy learning by a statistically significant margin.
arXiv Detail & Related papers (2020-12-28T02:44:49Z)
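A minimal sketch of the teacher idea above, assuming scalar task difficulties and a scalar progress signal; the difficulty-matching weight and the linear over-repetition penalty are illustrative assumptions, not the ACL-DQN formulation:

```python
# Hedged sketch of a curriculum teacher: serve training tasks whose
# difficulty matches the agent's current progress, and penalize tasks
# that have been repeated too often.
import math
import random
from collections import defaultdict


class CurriculumTeacher:
    def __init__(self, tasks, penalty=0.1):
        self.tasks = tasks              # task id -> difficulty in [0, 1]
        self.counts = defaultdict(int)  # how often each task was served
        self.penalty = penalty          # over-repetition penalty (assumption)

    def sample(self, progress: float) -> str:
        """Prefer tasks near the agent's competence; discourage repeats."""
        def weight(task):
            fit = math.exp(-5.0 * abs(self.tasks[task] - progress))
            return max(fit - self.penalty * self.counts[task], 1e-6)

        ids = list(self.tasks)
        task = random.choices(ids, weights=[weight(t) for t in ids])[0]
        self.counts[task] += 1
        return task


teacher = CurriculumTeacher({"greet": 0.1, "book_hotel": 0.5, "multi_domain": 0.9})
for step in range(3):
    progress = step / 3.0  # stand-in for the agent's measured learning progress
    print(teacher.sample(progress))
```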
- Human-centric Dialog Training via Offline Reinforcement Learning [16.525761580699257]
We develop a novel class of offline reinforcement learning algorithms.
We test the resulting dialog model with ratings from 80 users in an open-domain setting.
arXiv Detail & Related papers (2020-10-12T16:53:00Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
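A minimal sketch of this multi-task setup: one shared encoder, a main response-selection head, and one head per auxiliary self-supervised task, with losses summed. The toy bag-of-tokens encoder (standing in for the PLM) and the fixed auxiliary loss weight are assumptions:

```python
# Hedged sketch of multi-task training for response selection with
# auxiliary self-supervised heads; losses are summed with fixed weights.
import torch
import torch.nn as nn


class MultiTaskMatcher(nn.Module):
    def __init__(self, vocab=30522, dim=256):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab, dim)  # stand-in for a PLM encoder
        self.select = nn.Linear(dim, 2)             # main context-response matching head
        self.aux = nn.ModuleDict({                  # one head per auxiliary task
            "next_session": nn.Linear(dim, 2),
            "restoration": nn.Linear(dim, 2),
            "incoherence": nn.Linear(dim, 2),
            "consistency": nn.Linear(dim, 2),
        })

    def forward(self, ids, labels, aux_batches, aux_weight=0.5):
        loss_fn = nn.CrossEntropyLoss()
        loss = loss_fn(self.select(self.encoder(ids)), labels)  # main task loss
        for name, (a_ids, a_labels) in aux_batches.items():     # auxiliary losses
            loss = loss + aux_weight * loss_fn(self.aux[name](self.encoder(a_ids)), a_labels)
        return loss


model = MultiTaskMatcher()
ids = torch.randint(0, 30522, (4, 32))  # toy (context; response) token ids
labels = torch.randint(0, 2, (4,))
aux = {k: (torch.randint(0, 30522, (4, 32)), torch.randint(0, 2, (4,)))
       for k in ["next_session", "restoration", "incoherence", "consistency"]}
model(ids, labels, aux).backward()
```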
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
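A minimal sketch of the TOD-BERT-style pre-processing above: dialogue turns are flattened with speaker-role tokens before masked language modeling. The [USR]/[SYS] token names follow the paper; the whitespace tokenizer and the simplified 15% masking routine are stand-in assumptions:

```python
# Hedged sketch: flatten a task-oriented dialogue with [USR]/[SYS] role
# tokens, then apply simple 15% token masking for masked language modeling.
import random

SPECIALS = {"user": "[USR]", "system": "[SYS]"}


def flatten(dialogue):
    """dialogue: list of (speaker, utterance) pairs -> one token sequence."""
    tokens = []
    for speaker, text in dialogue:
        tokens.append(SPECIALS[speaker])       # mark who is talking
        tokens.extend(text.lower().split())    # whitespace tokenizer as a stand-in
    return tokens


def mask_tokens(tokens, rate=0.15):
    """Mask ~rate of the non-special tokens; return (inputs, labels)."""
    inputs, labels = [], []
    for t in tokens:
        if t not in SPECIALS.values() and random.random() < rate:
            inputs.append("[MASK]")
            labels.append(t)    # the model must recover the original token
        else:
            inputs.append(t)
            labels.append("-")  # ignored position
    return inputs, labels


dialogue = [("user", "I need a hotel in the centre"),
            ("system", "Sure, for how many nights?")]
inp, lab = mask_tokens(flatten(dialogue))
print(inp)
```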