Deploying Lifelong Open-Domain Dialogue Learning
- URL: http://arxiv.org/abs/2008.08076v2
- Date: Wed, 19 Aug 2020 16:03:27 GMT
- Title: Deploying Lifelong Open-Domain Dialogue Learning
- Authors: Kurt Shuster, Jack Urbanek, Emily Dinan, Arthur Szlam, Jason Weston
- Abstract summary: In this work, we build and deploy a role-playing game, whereby human players converse with learning agents situated in an open-domain fantasy world.
We show that by training models on the conversations they have with humans in the game the models progressively improve, as measured by automatic metrics and online engagement scores.
This learning is shown to be more efficient than crowdsourced data when applied to conversations with real users, as well as being far cheaper to collect.
- Score: 48.12600947313494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Much of NLP research has focused on crowdsourced static datasets and the
supervised learning paradigm of training once and then evaluating test
performance. As argued in de Vries et al. (2020), crowdsourced data has the
issues of lack of naturalness and relevance to real-world use cases, while the
static dataset paradigm does not allow for a model to learn from its
experiences of using language (Silver et al., 2013). In contrast, one might
hope for machine learning systems that become more useful as they interact with
people. In this work, we build and deploy a role-playing game, whereby human
players converse with learning agents situated in an open-domain fantasy world.
We show that by training models on the conversations they have with humans in
the game the models progressively improve, as measured by automatic metrics and
online engagement scores. This learning is shown to be more efficient than
crowdsourced data when applied to conversations with real users, as well as
being far cheaper to collect.
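The deploy-collect-retrain cycle the abstract describes can be sketched as a minimal loop. This is a hypothetical illustration, not the paper's implementation: the helpers `collect_conversations` and `retrain` are stand-ins for gameplay data collection and model training.

```python
# Hypothetical sketch of a lifelong-learning loop: deploy the agent,
# collect human-agent conversations, retrain on all data gathered so far.
# All names here are illustrative; none come from the paper's codebase.

def collect_conversations(round_num, n=3):
    """Stand-in for one deployment round: returns (context, response) pairs."""
    return [(f"round{round_num}-context{i}", f"round{round_num}-reply{i}")
            for i in range(n)]

def retrain(dataset):
    """Stand-in for model training: the 'model' just records its data size."""
    return {"num_examples": len(dataset)}

def lifelong_loop(num_rounds):
    dataset, history = [], []
    for r in range(num_rounds):
        dataset.extend(collect_conversations(r))  # deploy + collect
        model = retrain(dataset)                  # retrain on all data so far
        history.append(model["num_examples"])
    return history

print(lifelong_loop(3))  # training-set size grows each round: [3, 6, 9]
```

The point of the sketch is the accumulation: each deployment round adds in-distribution conversations from real users, so successive retraining runs see progressively more (and more relevant) data than a fixed crowdsourced set.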
Related papers
- Towards a Zero-Data, Controllable, Adaptive Dialog System [27.75972750138208]
We explore approaches to generate data directly from dialog trees.
We show that agents trained on synthetic data can achieve comparable dialog success to models trained on human data.
arXiv Detail & Related papers (2024-03-26T10:45:11Z)
- AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z)
- Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning [35.67318830455459]
We develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversational skill at scale.
Our work pairs the succinct embedding of the conversation state generated using SOTA (supervised) language models with RL techniques that are particularly suited to a dynamic action space.
arXiv Detail & Related papers (2022-07-25T16:12:33Z)
- ValueNet: A New Dataset for Human Value Driven Dialogue System [103.2044265617704]
We present a new large-scale human value dataset called ValueNet, which contains human attitudes on 21,374 text scenarios.
Comprehensive empirical results show that the learned value model could benefit a wide range of dialogue tasks.
ValueNet is the first large-scale text dataset for human value modeling.
arXiv Detail & Related papers (2021-12-12T23:02:52Z)
- What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
- Wandering Within a World: Online Contextualized Few-Shot Learning [62.28521610606054]
We aim to bridge the gap between typical human and machine-learning environments by extending the standard framework of few-shot learning to an online setting.
We propose a new prototypical few-shot learning approach based on large-scale indoor imagery that mimics the visual experience of an agent wandering within a world.
arXiv Detail & Related papers (2020-07-09T04:05:04Z)
- Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions.
We propose two knowledge-based data-driven methods to effectively capture these social interactions.
We develop a large scale interaction-centric benchmark TrajNet++, a significant yet missing component in the field of human trajectory forecasting.
arXiv Detail & Related papers (2020-07-07T17:19:56Z)
- Recipes for building an open-domain chatbot [44.75975649076827]
Good conversation requires providing engaging talking points, listening to one's partner, and displaying knowledge, empathy and personality appropriately.
We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy.
We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models and code publicly available.
arXiv Detail & Related papers (2020-04-28T16:33:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.