Large Scale Multi-Actor Generative Dialog Modeling
- URL: http://arxiv.org/abs/2005.06114v1
- Date: Wed, 13 May 2020 01:56:00 GMT
- Title: Large Scale Multi-Actor Generative Dialog Modeling
- Authors: Alex Boyd, Raul Puri, Mohammad Shoeybi, Mostofa Patwary, and Bryan
Catanzaro
- Abstract summary: We introduce the Generative Conversation Control model, a language model that conditions on past reference conversations to probabilistically model multi-turn conversations in the actor's persona.
Scaling model sizes from 117M to 8.3B parameters yields an improvement from 23.14 to 13.14 perplexity on 1.7M held-out Reddit conversations.
We find that conditionally modeling past conversations improves perplexity by 0.47 in automatic evaluations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-goal oriented dialog agents (i.e. chatbots) aim to produce varying and
engaging conversations with a user; however, they typically exhibit either
inconsistent personality across conversations or the average personality of all
users. This paper addresses these issues by controlling an agent's persona upon
generation via conditioning on prior conversations of a target actor. In doing
so, we are able to utilize more abstract patterns within a person's speech and
better emulate them in generated responses. This work introduces the Generative
Conversation Control model, an augmented and fine-tuned GPT-2 language model
that conditions on past reference conversations to probabilistically model
multi-turn conversations in the actor's persona. We introduce an accompanying
data collection procedure to obtain 10.3M conversations from 6 months' worth of
Reddit comments. We demonstrate that scaling model sizes from 117M to 8.3B
parameters yields an improvement from 23.14 to 13.14 perplexity on 1.7M held
out Reddit conversations. Increasing model scale yields similar improvements
in human evaluations that measure preference for model samples over the held-out
target distribution in terms of realism (31% to 37% preference),
style matching (37% to 42%), grammar and content quality (29% to 42%), and
conversation coherency (32% to 40%). We find that conditionally modeling past
conversations improves perplexity by 0.47 in automatic evaluations. Through
human trials we identify positive trends between conditional modeling and style
matching and outline steps to further improve persona control.
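The conditioning scheme is straightforward to sketch. Below is a minimal illustration (not the authors' released code) of scoring a target turn with a GPT-2 language model conditioned on prior reference conversations; the base gpt2 checkpoint stands in for the paper's 117M-8.3B variants, and the toy conversation strings are hypothetical.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Hypothetical example data: reference conversations from the target actor
# establish the persona; only the current target turn is scored.
reference_convs = "A: I mostly lurk on r/ml.\nB: Same, the threads are great.\n"
current_context = "A: Did you see the new model results?\nB:"
target_turn = " Yeah, scaling seems to keep paying off."

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

prefix_ids = tokenizer(reference_convs + current_context,
                       return_tensors="pt").input_ids
target_ids = tokenizer(target_turn, return_tensors="pt").input_ids
input_ids = torch.cat([prefix_ids, target_ids], dim=1)

# Mask the conditioning prefix with -100 so the cross-entropy (and hence
# the perplexity) is computed only over the target turn.
labels = input_ids.clone()
labels[:, : prefix_ids.size(1)] = -100

with torch.no_grad():
    loss = model(input_ids, labels=labels).loss
print(f"target-turn perplexity: {torch.exp(loss).item():.2f}")
```

Exponentiating the masked cross-entropy gives the perplexity metric behind the 23.14 to 13.14 scaling numbers reported above.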
Related papers
- Grounding Language in Multi-Perspective Referential Communication (2024-10-04)
We introduce a task and dataset for referring expression generation and comprehension in multi-agent embodied environments.
We collect a dataset of 2,970 human-written referring expressions, each paired with human comprehension judgments.
We evaluate the performance of automated models as speakers and listeners paired with human partners, finding that model performance in both reference generation and comprehension lags behind that of pairs of human agents.
- Estimating Contribution Quality in Online Deliberations Using a Large Language Model (2024-08-21)
We use a large language model (LLM) alongside eight human annotators to rate contributions based on justification, novelty, expansion of the conversation, and potential for further expansion.
Using the average rating from other human annotators as the ground truth, we find the model outperforms individual human annotators.
We illustrate the usefulness of the automated quality rating by assessing the effect of nudges on the quality of deliberation.
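An LLM-as-rater pipeline of this kind can be sketched with a rubric prompt and a pluggable model call; the rubric wording, the llm callable, and the 1-5 scale below are illustrative assumptions, not the paper's protocol.

```python
import re
from typing import Callable

# Illustrative rubric only; the paper's exact prompt and scale are not
# given in this summary.
RUBRIC = (
    "Rate the following deliberation contribution from 1 (poor) to 5 "
    "(excellent) on each dimension: justification, novelty, expansion of "
    "the conversation, potential for further expansion. "
    "Answer with 'dimension: score' lines.\n\nContribution:\n{text}"
)

def rate_contribution(llm: Callable[[str], str], text: str) -> dict:
    """Score one contribution with any prompt-in, text-out LLM callable."""
    reply = llm(RUBRIC.format(text=text))
    # Parse lines such as "novelty: 4" into {dimension: score}.
    return {m[0].strip().lower(): int(m[1])
            for m in re.findall(r"([\w ]+):\s*([1-5])", reply)}
```

Averaging such scores across runs or annotators then plays the role of the ground-truth comparison described above.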
- Faithful Persona-based Conversational Dataset Generation with Large Language Models (2023-12-15)
High-quality conversational datasets are essential for developing AI models that can communicate with users.
We propose a Generator-Critic architecture framework to expand the initial dataset, while improving the quality of its conversations.
We release Synthetic-Persona-Chat, consisting of 20k conversations seeded from Persona-Chat.
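The Generator-Critic loop can be sketched generically, as below; the callables, acceptance threshold, and retry count are illustrative assumptions rather than the paper's design.

```python
from typing import Callable

def expand_dataset(generate: Callable[[str], str],
                   critique: Callable[[str, str], float],
                   personas: list,
                   threshold: float = 0.8,
                   drafts: int = 3) -> list:
    """Sketch of a Generator-Critic expansion loop (illustrative, not the
    paper's code): the generator drafts a persona-grounded conversation and
    the critic scores its faithfulness; only high-scoring drafts are kept."""
    kept = []
    for persona in personas:
        for _ in range(drafts):
            conversation = generate(persona)
            if critique(persona, conversation) >= threshold:
                kept.append(conversation)
                break  # accept the first faithful draft for this persona
    return kept
```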
- Enhancing Chat Language Models by Scaling High-quality Instructional Conversations (2023-05-23)
We first provide a systematically designed, diverse, informative, large-scale dataset of instructional conversations, UltraChat.
Our objective is to capture the breadth of interactions that a human might have with an AI assistant.
We fine-tune a LLaMA model to create a powerful conversational model, UltraLLaMA.
- A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation (2022-04-21)
It is expensive to scale up current persona-based dialogue datasets.
Each data sample in this task is also more complex to learn from than conventional dialogue data.
We propose a data manipulation method that is model-agnostic and can be paired with any persona-based dialogue generation model.
- Plug-and-Play Conversational Models (2020-10-09)
We introduce an approach that requires no additional computation at decoding time and no fine-tuning of a large language model.
We demonstrate, through extensive automatic and human evaluation, a high degree of control over the generated conversational responses with regard to multiple desired attributes.
- Dialogue Response Ranking Training with Large-Scale Human Feedback Data (2020-09-15)
We leverage social media feedback data to build a large-scale training dataset for feedback prediction.
We trained DialogRPT, a set of GPT-2 based models on 133M pairs of human feedback data.
Our ranker outperforms the conventional dialog perplexity baseline by a large margin on predicting Reddit feedback.
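The released checkpoints make the ranking step concrete; the snippet below follows the published model card for the updown ranker, where context and candidate are joined by the end-of-text separator (treat that input format as this checkpoint's convention, not a general rule).

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "microsoft/DialogRPT-updown"  # one of the released GPT-2 based rankers
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

def feedback_score(context: str, response: str) -> float:
    # Context and candidate response are joined with the end-of-text separator.
    ids = tokenizer.encode(context + "<|endoftext|>" + response,
                           return_tensors="pt")
    with torch.no_grad():
        logits = model(ids).logits
    return torch.sigmoid(logits).item()  # higher = more predicted upvotes

print(feedback_score("I love NLP!", "Me too! It's such a fun field."))
```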
- The Adapter-Bot: All-In-One Controllable Conversational Model (2020-08-28)
We propose a dialogue model that uses a fixed backbone model such as DialoGPT and triggers on-demand dialogue skills via different adapters.
Depending on the skill, the model is able to process multiple knowledge types, such as text, tables, and empathetic responses.
We evaluate our model with automatic metrics, comparing it against existing state-of-the-art conversational models.
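The adapter mechanism itself is compact; below is a generic bottleneck-adapter layer of the kind inserted into a frozen backbone, with sizes chosen for illustration rather than taken from the paper.

```python
import torch
from torch import nn

class BottleneckAdapter(nn.Module):
    """Residual bottleneck adapter: down-project, nonlinearity, up-project.
    With the backbone frozen, only these few parameters are trained per
    skill, so skills can be added or swapped on demand."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen backbone's behavior recoverable.
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))
```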
- Recipes for building an open-domain chatbot (2020-04-28)
Good conversation requires engaging talking points, listening to one's partner, and displaying knowledge, empathy, and personality appropriately.
We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy.
We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models and code publicly available.
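The importance of the generation strategy is easy to try with the later-released public distilled checkpoint; the minimum-beam-length constraint below reflects one of the paper's decoding findings, though the 400M checkpoint name and the specific lengths here are illustrative rather than figures from this summary.

```python
from transformers import BlenderbotForConditionalGeneration, BlenderbotTokenizer

name = "facebook/blenderbot-400M-distill"  # public distilled release
tokenizer = BlenderbotTokenizer.from_pretrained(name)
model = BlenderbotForConditionalGeneration.from_pretrained(name)

inputs = tokenizer("Hello, how has your day been?", return_tensors="pt")
# The paper found plain beam search yields short, dull replies; constraining
# the minimum generation length is one of its key decoding choices.
reply_ids = model.generate(**inputs, num_beams=10, min_length=20, max_length=60)
print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))
```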
- Modality-Balanced Models for Visual Dialogue (2020-01-17)
The Visual Dialog task requires a model to exploit both image and conversational context information to generate the next response to the dialogue.
We show that previous joint-modality (history and image) models over-rely on, and are prone to memorizing, the dialogue history.
We present methods for integrating an image-only model with the joint model via ensemble and consensus dropout fusion with shared parameters.
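A loose sketch of logit-level fusion appears below; which branch receives dropout, and all module internals, are illustrative assumptions rather than the paper's exact design.

```python
import torch
from torch import nn

class DropoutLogitFusion(nn.Module):
    """Illustrative fusion of an image-only module and a joint image+history
    module: average their answer logits, applying dropout to one branch
    during training so the combined model cannot lean entirely on the
    dialogue history."""
    def __init__(self, image_only: nn.Module, joint: nn.Module, p: float = 0.3):
        super().__init__()
        self.image_only = image_only
        self.joint = joint
        self.dropout = nn.Dropout(p)

    def forward(self, image: torch.Tensor, history: torch.Tensor) -> torch.Tensor:
        logits_img = self.image_only(image)
        logits_joint = self.dropout(self.joint(image, history))
        return (logits_img + logits_joint) / 2
```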
This list is automatically generated from the titles and abstracts of the papers on this site.