Low-Resource Adaptation of Open-Domain Generative Chatbots
- URL: http://arxiv.org/abs/2108.06329v1
- Date: Fri, 13 Aug 2021 17:40:30 GMT
- Title: Low-Resource Adaptation of Open-Domain Generative Chatbots
- Authors: Greyson Gerhard-Young, Raviteja Anantha, Srinivas Chappidi, Björn Hoffmeister
- Abstract summary: We show that low-parameter models can retain their general-knowledge conversational abilities while improving in a specific domain.
We propose a generic framework that accounts for variety in question types, tracks reference throughout multi-turn conversations, and removes inconsistent and potentially toxic responses.
Our framework seamlessly transitions between chatting and performing transactional tasks, which will ultimately make interactions with digital assistants more human-like.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work building open-domain chatbots has demonstrated that increasing
model size improves performance. On the other hand, latency and connectivity
considerations dictate moving digital assistants onto the device. Giving a
digital assistant like Siri, Alexa, or Google Assistant the ability to discuss
just about anything leads to the need to reduce the chatbot model size so
that it fits on the user's device. We demonstrate that low-parameter models can
retain their general-knowledge conversational abilities while
improving in a specific domain. Additionally, we propose a generic framework
that accounts for variety in question types, tracks reference throughout
multi-turn conversations, and removes inconsistent and potentially toxic
responses. Our framework seamlessly transitions between chatting and performing
transactional tasks, which will ultimately make interactions with digital
assistants more human-like. We evaluate our framework on 1 internal and 4
public benchmark datasets using both automatic (Perplexity) and human (SSA -
Sensibleness and Specificity Average) evaluation metrics and establish
comparable performance while reducing model parameters by 90%.
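As a rough illustration of the two reported metrics (not code from the paper): perplexity is the exponentiated mean negative log-likelihood per token, and SSA is the plain average of per-response sensibleness and specificity rates. A minimal sketch with made-up inputs:

    import math

    def perplexity(token_log_probs):
        """Perplexity: exp of the mean negative log-likelihood per token.

        token_log_probs: natural-log probabilities the model assigned to
        each gold token in the evaluation set.
        """
        nll = -sum(token_log_probs) / len(token_log_probs)
        return math.exp(nll)

    def ssa(judgments):
        """Sensibleness and Specificity Average (SSA).

        judgments: one (sensible, specific) pair of binary human labels
        per model response.
        """
        n = len(judgments)
        sensibleness = sum(s for s, _ in judgments) / n
        specificity = sum(p for _, p in judgments) / n
        return (sensibleness + specificity) / 2

    print(perplexity([-2.1, -0.7, -1.3]))  # ~3.92
    print(ssa([(1, 1), (1, 0), (0, 0)]))   # (2/3 + 1/3) / 2 = 0.5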
Related papers
- Enhancing Chat Language Models by Scaling High-quality Instructional Conversations [91.98516412612739]
We first provide a systematically designed, diverse, informative, large-scale dataset of instructional conversations, UltraChat.
Our objective is to capture the breadth of interactions that a human might have with an AI assistant.
We fine-tune a LLaMA model to create a powerful conversational model, UltraLLaMA.
arXiv Detail & Related papers (2023-05-23T16:49:14Z)
- Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data [101.63682141248069]
Chat models, such as ChatGPT, have shown impressive capabilities and have been rapidly adopted across numerous domains.
We propose a pipeline that can automatically generate a high-quality multi-turn chat corpus by leveraging ChatGPT.
We employ parameter-efficient tuning to enhance LLaMA, an open-source large language model.
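As a rough sketch of what this kind of parameter-efficient tuning can look like (LoRA via the Hugging Face peft library; the checkpoint name and hyperparameters are illustrative assumptions, not the paper's exact setup):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    base = "decapoda-research/llama-7b-hf"  # illustrative LLaMA checkpoint name
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)

    # Freeze the backbone; train only small low-rank adapter matrices.
    config = LoraConfig(
        r=8,                                  # rank of the update matrices
        lora_alpha=16,                        # scaling factor
        target_modules=["q_proj", "v_proj"],  # attention projections to adapt
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% of the backbone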
arXiv Detail & Related papers (2023-04-03T17:59:09Z)
- What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation [73.03318027164605]
We propose to use information that can be automatically extracted from the next user utterance as a proxy to measure the quality of the previous system response.
Our model generalizes across both spoken and written open-domain dialog corpora collected from real and paid users.
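A minimal sketch of this idea, using an off-the-shelf sentiment classifier as a stand-in for the paper's learned proxy (function name and example inputs are hypothetical):

    from transformers import pipeline

    # Generic sentiment model standing in for a metric learned from dialog data.
    sentiment = pipeline("sentiment-analysis")

    def score_system_turn(next_user_utterance):
        """Proxy quality score for the preceding system response:
        a positive user reaction suggests the response was good."""
        result = sentiment(next_user_utterance)[0]
        prob = result["score"]
        return prob if result["label"] == "POSITIVE" else 1.0 - prob

    print(score_system_turn("That was exactly what I needed, thanks!"))  # high
    print(score_system_turn("What? That makes no sense."))               # low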
arXiv Detail & Related papers (2022-03-25T22:09:52Z)
- Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z)
- A Taxonomy of Empathetic Response Intents in Human Social Conversations [1.52292571922932]
Open-domain conversational agents are becoming increasingly popular in the natural language processing community.
One of the challenges is enabling them to converse in an empathetic manner.
Current neural response generation methods rely solely on end-to-end learning from large-scale conversation data to generate dialogues.
Recent work has shown the promise of combining dialogue act/intent modelling and neural response generation.
arXiv Detail & Related papers (2020-12-07T21:56:45Z)
- Data-Efficient Methods for Dialogue Systems [4.061135251278187]
Conversational User Interfaces (CUIs) have become ubiquitous in everyday life through consumer-focused products like Siri and Alexa.
Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts.
In this thesis, we introduce a series of methods for training robust dialogue systems from minimal data.
arXiv Detail & Related papers (2020-12-05T02:51:09Z)
- Local Knowledge Powered Conversational Agents [29.966845949225792]
We propose a dialog framework that incorporates both local knowledge and users' past dialogues to generate high-quality conversations.
Using our framework and dataset, we demonstrate that incorporating local knowledge can substantially improve informativeness, coherency, and realisticness measures.
Our model with 8.3B parameters can generate human-like responses as rated by various human evaluations in a single-turn dialog setting.
arXiv Detail & Related papers (2020-10-20T09:34:40Z)
- The Adapter-Bot: All-In-One Controllable Conversational Model [66.48164003532484]
We propose a dialogue model that uses a fixed backbone model such as DialoGPT and triggers on-demand dialogue skills via different adapters.
Depending on the skill, the model is able to process multiple knowledge types, such as text, tables, and empathetic responses.
We evaluate our model using automatic evaluation by comparing it with existing state-of-the-art conversational models.
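A conceptual sketch of this routing pattern, with toy stubs in place of the real components (all names here are hypothetical, not the paper's code):

    class AdapterBot:
        """Frozen backbone plus per-skill adapters, selected per turn."""

        def __init__(self, backbone, adapters, classify_skill):
            self.backbone = backbone              # frozen generative model
            self.adapters = adapters              # skill name -> adapter weights
            self.classify_skill = classify_skill  # lightweight per-turn router

        def respond(self, history):
            skill = self.classify_skill(history)  # e.g. "weather" or "chitchat"
            adapter = self.adapters.get(skill)
            # Backbone weights stay fixed; only the chosen adapter changes.
            return self.backbone(history, adapter)

    # Toy stubs to make the sketch runnable:
    bot = AdapterBot(
        backbone=lambda history, adapter: f"[{adapter}] reply to: {history[-1]}",
        adapters={"weather": "weather-adapter", "chitchat": "chitchat-adapter"},
        classify_skill=lambda history: "weather" if "rain" in history[-1] else "chitchat",
    )
    print(bot.respond(["Will it rain tomorrow?"]))  # routed to the weather adapter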
arXiv Detail & Related papers (2020-08-28T10:59:31Z)
- Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances.
We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
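A minimal sketch of a reference-free scorer in this spirit, using a generic sentence encoder rather than the paper's trained metric (model name illustrative):

    from sentence_transformers import SentenceTransformer, util

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # generic stand-in encoder

    def unreferenced_score(context, response):
        """Reference-free score: similarity between latent representations
        of the dialogue context and the candidate response."""
        ctx_emb, resp_emb = encoder.encode([context, response], convert_to_tensor=True)
        return util.cos_sim(ctx_emb, resp_emb).item()

    print(unreferenced_score("How was your weekend?", "Great, I went hiking!"))
    print(unreferenced_score("How was your weekend?", "Stocks closed mixed today."))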
arXiv Detail & Related papers (2020-05-01T20:01:39Z)
- Recipes for building an open-domain chatbot [44.75975649076827]
Good conversation requires engaging talking points, listening to one's partner, and displaying knowledge, empathy, and personality appropriately.
We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy.
We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models and code publicly available.
arXiv Detail & Related papers (2020-04-28T16:33:25Z)
- Hyperparameters optimization for Deep Learning based emotion prediction for Human Robot Interaction [0.2549905572365809]
We propose an Inception-module-based convolutional neural network architecture.
The model is implemented on the NAO humanoid robot in real time, and its robustness is evaluated.
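For reference, an Inception module runs parallel convolutions at several receptive-field sizes and concatenates their outputs along the channel dimension; a minimal PyTorch sketch (channel counts are illustrative, not taken from the paper):

    import torch
    import torch.nn as nn

    class InceptionBlock(nn.Module):
        """Parallel 1x1, 3x3, 5x5, and pooled branches, concatenated."""

        def __init__(self, in_ch):
            super().__init__()
            self.b1 = nn.Conv2d(in_ch, 16, kernel_size=1)
            self.b3 = nn.Sequential(nn.Conv2d(in_ch, 16, 1),
                                    nn.Conv2d(16, 24, 3, padding=1))
            self.b5 = nn.Sequential(nn.Conv2d(in_ch, 16, 1),
                                    nn.Conv2d(16, 24, 5, padding=2))
            self.pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                      nn.Conv2d(in_ch, 16, 1))

        def forward(self, x):
            return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)

    x = torch.randn(1, 3, 48, 48)      # e.g. a face crop
    print(InceptionBlock(3)(x).shape)  # torch.Size([1, 80, 48, 48])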
arXiv Detail & Related papers (2020-01-12T05:25:02Z)