Rewarding Chatbots for Real-World Engagement with Millions of Users
- URL: http://arxiv.org/abs/2303.06135v2
- Date: Thu, 30 Mar 2023 18:28:05 GMT
- Title: Rewarding Chatbots for Real-World Engagement with Millions of Users
- Authors: Robert Irvine, Douglas Boubert, Vyas Raina, Adian Liusie, Ziyi Zhu,
Vineet Mudupalli, Aliaksei Korshuk, Zongyi Liu, Fritz Cremer, Valentin
Assassi, Christie-Carol Beauchamp, Xiaoding Lu, Thomas Rialan, William
Beauchamp
- Abstract summary: This work investigates the development of social chatbots that prioritize user engagement to enhance retention.
The proposed approach uses automatic pseudo-labels collected from user interactions to train a reward model that can be used to reject low-scoring sample responses.
A/B testing on groups of 10,000 new daily chatbot users on the Chai Research platform shows that this approach increases the mean conversation length (MCL) by up to 70%.
Future work aims to use the reward model to realise a data flywheel, where the latest user conversations can be used to alternately fine-tune the language model and the reward model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emergence of pretrained large language models has led to the deployment
of a range of social chatbots for chitchat. Although these chatbots demonstrate
language ability and fluency, they are not guaranteed to be engaging and can
struggle to retain users. This work investigates the development of social
chatbots that prioritize user engagement to enhance retention, specifically
examining the use of human feedback to efficiently develop highly engaging
chatbots. The proposed approach uses automatic pseudo-labels collected from
user interactions to train a reward model that can be used to reject
low-scoring sample responses generated by the chatbot model at inference time.
Intuitive evaluation metrics, such as mean conversation length (MCL), are
introduced as proxies to measure the level of engagement of deployed chatbots.
A/B testing on groups of 10,000 new daily chatbot users on the Chai Research
platform shows that this approach increases the MCL by up to 70%, which
translates to a more than 30% increase in user retention for a GPT-J 6B model.
Future work aims to use the reward model to realise a data fly-wheel, where the
latest user conversations can be used to alternately fine-tune the language
model and the reward model.
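The rejection step described in the abstract amounts to best-of-N sampling against a learned reward model: sample several candidate replies, score each with the reward model, and keep only the highest-scoring one. The sketch below illustrates the idea under stated assumptions; it is not the paper's implementation. The reward-model checkpoint name is hypothetical, and N=8, the sampling settings, and the scalar-reward head are illustrative choices. GPT-J 6B is the base chatbot model named in the abstract.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

N_CANDIDATES = 8  # assumed best-of-N width; not specified in the abstract

# Base chatbot model named in the abstract.
chat_tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6b")
chat_lm = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6b")

# Hypothetical reward model trained on engagement pseudo-labels
# (e.g. whether the user kept chatting after a given response);
# assumed to expose a single scalar score head (num_labels=1).
rm_tok = AutoTokenizer.from_pretrained("your-org/engagement-reward-model")
reward_model = AutoModelForSequenceClassification.from_pretrained(
    "your-org/engagement-reward-model", num_labels=1
)

@torch.no_grad()
def respond(context: str) -> str:
    """Sample N candidate replies, score each with the reward model,
    and reject all but the highest-scoring one."""
    inputs = chat_tok(context, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    sequences = chat_lm.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=64,
        num_return_sequences=N_CANDIDATES,
        pad_token_id=chat_tok.eos_token_id,
    )
    # Strip the prompt tokens so only the generated reply remains.
    candidates = [
        chat_tok.decode(seq[prompt_len:], skip_special_tokens=True)
        for seq in sequences
    ]
    # Score each (context, candidate) pair with the reward model.
    scores = [
        reward_model(
            **rm_tok(context, cand, return_tensors="pt", truncation=True)
        ).logits.item()
        for cand in candidates
    ]
    return candidates[scores.index(max(scores))]
```

Engagement is then measured with the mean conversation length (MCL) proxy. A minimal computation, assuming each conversation is represented simply as its list of messages:

```python
def mean_conversation_length(conversations: list[list[str]]) -> float:
    """MCL over one A/B arm: average number of messages per conversation."""
    return sum(len(msgs) for msgs in conversations) / max(len(conversations), 1)
```

In an A/B test, each arm's MCL would be computed over its own cohort of new users; the abstract reports up to a 70% relative MCL gain for the reward-model arm.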
Related papers
- LLM Roleplay: Simulating Human-Chatbot Interaction [52.03241266241294] (2024-07-04)
  We propose a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction.
  Our method can simulate human-chatbot dialogues with a high indistinguishability rate.
- WildChat: 1M ChatGPT Interaction Logs in the Wild [88.05964311416717] (2024-05-02)
  WildChat is a corpus of 1 million user-ChatGPT conversations, consisting of over 2.5 million interaction turns.
  In addition to timestamped chat transcripts, we enrich the dataset with demographic data, including state, country, and hashed IP addresses.
- Prompted LLMs as Chatbot Modules for Long Open-domain Conversation [7.511596831927614] (2023-05-08)
  We propose MPC, a new approach for creating high-quality conversational agents without the need for fine-tuning.
  Our method utilizes pre-trained large language models (LLMs) as individual modules for long-term consistency and flexibility.
- Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data [101.63682141248069] (2023-04-03)
  Chat models, such as ChatGPT, have shown impressive capabilities and have been rapidly adopted across numerous domains.
  We propose a pipeline that can automatically generate a high-quality multi-turn chat corpus by leveraging ChatGPT.
  We employ parameter-efficient tuning to enhance LLaMA, an open-source large language model.
- Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data [15.808841433843742] (2023-01-14)
  Large language models (LLMs) provide a new way to build chatbots by accepting natural language prompts.
  We explore what design factors of prompts can help steer chatbots to talk naturally and collect data reliably.
- Training Conversational Agents with Generative Conversational Networks [74.9941330874663] (2021-10-15)
  We use Generative Conversational Networks to automatically generate data and train social conversational agents.
  We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of the seed data it performs close to the baseline that uses 100% of the data.
- Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519] (2021-03-30)
  This paper proposes an innovative framework to train chatbots to possess human-like intentions.
  Our framework includes a guiding robot and an interlocutor model that plays the role of a human.
  We examine our framework using three experimental setups and evaluate the guiding robot with four different metrics to demonstrate its flexibility and performance advantages.
- Learning Improvised Chatbots from Adversarial Modifications of Natural Language Feedback [19.026954124876582] (2020-10-14)
  We propose a generative adversarial model that converts noisy feedback into a plausible natural response in a conversation.
  The generator's goal is to convert the feedback into a response that answers the user's previous utterance and to fool the discriminator.
- Pchatbot: A Large-Scale Dataset for Personalized Chatbot [49.16746174238548] (2020-09-28)
  We introduce Pchatbot, a large-scale dialogue dataset that contains two subsets collected from Weibo and Judicial forums, respectively.
  To adapt the raw dataset to dialogue systems, we carefully normalize it via processes such as anonymization.
  The scale of Pchatbot is significantly larger than that of existing Chinese datasets, which may benefit data-driven models.
- Personalized Chatbot Trustworthiness Ratings [19.537492400265577] (2020-05-13)
  We envision a personalized rating methodology for chatbots that relies on separate rating modules for each issue.
  The method is independent of the specific trust issues and is parametric to the aggregation procedure.