Plug-and-Play Conversational Models
- URL: http://arxiv.org/abs/2010.04344v1
- Date: Fri, 9 Oct 2020 03:17:51 GMT
- Title: Plug-and-Play Conversational Models
- Authors: Andrea Madotto, Etsuko Ishii, Zhaojiang Lin, Sumanth Dathathri,
Pascale Fung
- Abstract summary: We introduce an approach that requires no further computation at decoding time, nor any fine-tuning of a large language model.
We demonstrate, through extensive automatic and human evaluation, a high degree of control over the generated conversational responses with regard to multiple desired attributes.
- Score: 62.77150879036442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been considerable progress made towards conversational models that
generate coherent and fluent responses; however, this often involves training
large language models on large dialogue datasets, such as Reddit. These large
conversational models provide little control over the generated responses, and
this control is further limited in the absence of annotated conversational
datasets for attribute specific generation that can be used for fine-tuning the
model. In this paper, we first propose and evaluate plug-and-play methods for
controllable response generation, which do not require dialogue-specific
datasets and do not rely on fine-tuning a large model. While effective, the
decoding procedure induces considerable computational overhead, rendering the
conversational model unsuitable for interactive usage. To overcome this, we
introduce an approach that requires no further computation at decoding time,
nor any fine-tuning of a large language model. We
demonstrate, through extensive automatic and human evaluation, a high degree of
control over the generated conversational responses with regard to multiple
desired attributes, while remaining fluent.
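To make the plug-and-play idea concrete, below is a minimal sketch of the general principle: steer a frozen language model with an external attribute model, here via candidate reranking. The keyword-based scorers are toy stand-ins, and this is not the authors' PPLM-based procedure or their adapter-based fast variant.

```python
import math

# Toy stand-ins: a real system would score with a pretrained LM and a
# trained attribute classifier; these functions are illustrative only.
def lm_log_prob(response: str) -> float:
    # Pretend shorter responses are slightly more likely under the LM.
    return -0.1 * len(response.split())

def attribute_log_prob(response: str, keywords: set[str]) -> float:
    # Crude "classifier": count attribute keywords, smoothed to avoid log(0).
    hits = sum(w.lower().strip(".,!?") in keywords for w in response.split())
    return math.log(1e-3 + hits)

def plug_and_play_rerank(candidates: list[str], keywords: set[str], alpha: float = 1.0) -> str:
    # Pick argmax of log p(x) + alpha * log p(attribute | x): the frozen LM
    # stays untouched; only the external attribute score steers the choice.
    return max(
        candidates,
        key=lambda r: lm_log_prob(r) + alpha * attribute_log_prob(r, keywords),
    )

candidates = [
    "I had a terrible day, everything went wrong.",
    "What a wonderful day, I feel great and happy!",
]
print(plug_and_play_rerank(candidates, keywords={"wonderful", "great", "happy"}))
```

The paper's decoding-time method instead perturbs the model's activations (PPLM-style), and its interactive-speed variant distills that control into residual adapters; the reranker above only illustrates the shared principle of controlling generation without fine-tuning the large model.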
Related papers
- Modeling Real-Time Interactive Conversations as Timed Diarized Transcripts [11.067252960486272]
We present a simple yet general method to simulate real-time interactive conversations using pretrained language models.
We demonstrate the promise of this method with two case studies: instant messenger dialogues and spoken conversations. (A toy transcript sketch follows this entry.)
arXiv Detail & Related papers (2024-05-21T21:14:31Z)
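As a rough illustration of the idea, a conversation can be serialized as a timed, diarized transcript that a pretrained language model is asked to continue. The exact line format below is an assumption for illustration, not necessarily the paper's.

```python
from datetime import datetime, timedelta

# Hypothetical serialization: (timestamp, speaker, utterance) triples
# rendered as transcript lines for a pretrained LM to continue.
t0 = datetime(2024, 5, 21, 21, 14, 0)
turns = [
    (t0, "Alice", "hey, are you around?"),
    (t0 + timedelta(seconds=42), "Bob", "yep, what's up?"),
]

def to_transcript(turns) -> str:
    return "\n".join(f"[{t:%H:%M:%S}] {speaker}: {text}" for t, speaker, text in turns)

print(to_transcript(turns))
# [21:14:00] Alice: hey, are you around?
# [21:14:42] Bob: yep, what's up?
```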
- Evaluating Large Language Models in Semantic Parsing for Conversational Question Answering over Knowledge Graphs [6.869834883252353]
This paper evaluates the performance of large language models that have not been explicitly pre-trained on this task.
Our results demonstrate that large language models are capable of generating graph queries from dialogues. (An illustrative prompt sketch follows this entry.)
arXiv Detail & Related papers (2024-01-03T12:28:33Z)
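A sketch of what dialogue-to-graph-query prompting could look like in the zero-shot case; the template and the SPARQL target are illustrative assumptions, not the paper's exact setup.

```python
# Illustrative zero-shot prompt for conversational semantic parsing.
# Send the rendered prompt to any instruction-following LLM.
PROMPT_TEMPLATE = """Translate the user's last question, using the dialogue as context,
into a SPARQL query over a knowledge graph.

Dialogue:
{dialogue}

SPARQL:"""

dialogue = (
    "User: Who directed Inception?\n"
    "System: Christopher Nolan.\n"
    "User: And when was he born?"
)
print(PROMPT_TEMPLATE.format(dialogue=dialogue))
```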
- Controllable Mixed-Initiative Dialogue Generation through Prompting [50.03458333265885]
Mixed-initiative dialogue tasks involve repeated exchanges of information and conversational control.
Agents gain control by generating responses that follow particular dialogue intents or strategies, prescribed by a policy planner.
The standard approach has been to fine-tune pre-trained language models to perform generation conditioned on these intents.
We instead prompt large language models as a drop-in replacement for fine-tuning on conditional generation. (A minimal prompt sketch follows this entry.)
arXiv Detail & Related papers (2023-05-06T23:11:25Z)
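A minimal sketch of the drop-in idea: a policy planner selects a dialogue intent, and the intent is verbalized into the prompt instead of being fine-tuned into the model. The intent set and wording are hypothetical.

```python
# Hypothetical intents and verbalizations; the paper's policy planner
# and prompt wording may differ.
INTENT_INSTRUCTIONS = {
    "ask_clarifying_question": "Ask one short clarifying question.",
    "provide_information": "Answer directly and concisely.",
    "redirect_topic": "Gently steer the conversation back on topic.",
}

def build_prompt(history: list[str], intent: str) -> str:
    # The frozen LLM is conditioned on the intent purely through text.
    dialogue = "\n".join(history)
    return f"{dialogue}\nInstruction: {INTENT_INSTRUCTIONS[intent]}\nAgent:"

history = ["User: I can't decide what to eat tonight."]
print(build_prompt(history, intent="ask_clarifying_question"))
```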
- Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query. (A toy selection sketch follows this entry.)
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
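The saliency model itself is not described here; as a stand-in, this sketch ranks dialogue turns by lexical overlap with the query and keeps the most salient ones under a word budget, which captures the general shape of the idea.

```python
import re

def toks(s: str) -> set[str]:
    return set(re.findall(r"[a-z']+", s.lower()))

def select_salient_turns(turns: list[str], query: str, budget: int) -> list[str]:
    # Rank turns by overlap with the query (a toy saliency score), then
    # greedily keep turns while staying under the word budget.
    ranked = sorted(turns, key=lambda t: -len(toks(query) & toks(t)))
    kept, used = set(), 0
    for turn in ranked:
        n = len(turn.split())
        if used + n <= budget:
            kept.add(turn)
            used += n
    return [t for t in turns if t in kept]  # restore dialogue order

turns = [
    "User: I'd like to book a table for two.",
    "System: Sure, which day works for you?",
    "User: Friday evening, somewhere cheap in the centre.",
]
print(select_salient_turns(turns, query="cheap restaurant in the centre", budget=12))
```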
- Prompting for a conversation: How to control a dialog model? [9.268682116424518]
Dialog models are trained on large amounts of text, yet their responses must be limited to the desired scope and style of a dialog agent.
Because the datasets used for the former (broad-coverage training) contain language incompatible with the latter (the target scope and style), pre-trained dialog models are fine-tuned on smaller curated datasets.
In this paper we investigate if prompting can mitigate the above trade-off.
arXiv Detail & Related papers (2022-09-22T14:59:55Z)
- GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
- Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task.
Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts.
Our approach significantly outperforms the fine-tuning baseline and generic prompt-learning methods. (A minimal prompt-embedding sketch follows this entry.)
arXiv Detail & Related papers (2021-11-04T05:40:13Z)
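A minimal PyTorch sketch of the continuous-prompt mechanism in general: trainable prompt vectors are prepended to frozen token embeddings, so only the prompt is optimized. Sizes and names are illustrative, not DialogPrompt's actual architecture.

```python
import torch
import torch.nn as nn

class PromptedEmbedding(nn.Module):
    """Prepends trainable prompt vectors to frozen token embeddings."""
    def __init__(self, vocab_size: int = 50257, d_model: int = 768, n_prompt: int = 20):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.tok.weight.requires_grad = False  # backbone embeddings stay frozen
        self.prompt = nn.Parameter(0.02 * torch.randn(n_prompt, d_model))  # trained

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        batch = input_ids.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, self.tok(input_ids)], dim=1)

emb = PromptedEmbedding()
ids = torch.randint(0, 50257, (2, 10))
print(emb(ids).shape)  # torch.Size([2, 30, 768]): 20 prompt + 10 token vectors
```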
- CloneBot: Personalized Dialogue-Response Predictions [0.0]
The task was to create a model that, given a speaker ID, chat history, and an utterance query, can predict the response utterance in a conversation.
The model is personalized for each speaker, and can be a useful tool for building speech bots that talk in a human-like manner in live conversation.
arXiv Detail & Related papers (2021-03-31T01:15:37Z)
- The Adapter-Bot: All-In-One Controllable Conversational Model [66.48164003532484]
We propose a dialogue model that uses a fixed backbone model such as DialoGPT and triggers on-demand dialogue skills via different adapters.
Depending on the skill, the model can handle multiple knowledge types, such as text and tables, and can produce empathetic responses.
We evaluate our model with automatic metrics, comparing it against existing state-of-the-art conversational models. (A minimal adapter sketch follows this entry.)
arXiv Detail & Related papers (2020-08-28T10:59:31Z)
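A hedged sketch of the residual-adapter pattern the Adapter-Bot builds on: a small bottleneck MLP per skill, added on top of hidden states from a frozen backbone. Dimensions and skill names are illustrative.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck residual adapter: h + up(relu(down(h)))."""
    def __init__(self, d_model: int = 768, d_bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(torch.relu(self.down(h)))

# One adapter per dialogue skill; the frozen backbone is shared by all.
adapters = nn.ModuleDict({"weather": Adapter(), "empathetic": Adapter()})

h = torch.randn(2, 10, 768)        # hidden states from the frozen backbone
out = adapters["empathetic"](h)    # route to the skill triggered on demand
print(out.shape)                   # torch.Size([2, 10, 768])
```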
- Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances.
We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference. (A toy scoring sketch follows this entry.)
arXiv Detail & Related papers (2020-05-01T20:01:39Z)
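As a toy stand-in for the learned metric: embed the context and the candidate response and score their similarity, with no reference response needed. Deterministic random word vectors stand in for the pretrained LM's latent representations.

```python
import zlib
import torch
import torch.nn.functional as F

def embed(text: str, d: int = 128) -> torch.Tensor:
    # Toy utterance encoder: mean of deterministic per-word random vectors,
    # a stand-in for representations from a large pretrained LM.
    vecs = []
    for w in text.lower().split():
        g = torch.Generator().manual_seed(zlib.crc32(w.encode()))
        vecs.append(torch.randn(d, generator=g))
    return torch.stack(vecs).mean(dim=0)

def unreferenced_score(context: str, response: str) -> float:
    # Score a response against its context only: no gold reference required.
    return F.cosine_similarity(embed(context), embed(response), dim=0).item()

ctx = "how was your trip to tokyo ?"
print(unreferenced_score(ctx, "the trip to tokyo was amazing !"))  # higher
print(unreferenced_score(ctx, "i like turtles"))                   # lower
```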
This list is automatically generated from the titles and abstracts of the papers on this site.