Prompting for a conversation: How to control a dialog model?
- URL: http://arxiv.org/abs/2209.11068v1
- Date: Thu, 22 Sep 2022 14:59:55 GMT
- Title: Prompting for a conversation: How to control a dialog model?
- Authors: Josef Valvoda, Yimai Fang, David Vandyke
- Abstract summary: Dialog models are trained on a large amount of text, yet their responses need to be limited to a desired scope and style of a dialog agent.
Because the datasets used to achieve the former contain language that is not compatible with the latter, pre-trained dialog models are fine-tuned on smaller curated datasets.
In this paper we investigate if prompting can mitigate the above trade-off.
- Score: 9.268682116424518
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dialog modelling faces a difficult trade-off. Models are trained on a large
amount of text, yet their responses need to be limited to a desired scope and
style of a dialog agent. Because the datasets used to achieve the former
contain language that is not compatible with the latter, pre-trained dialog
models are fine-tuned on smaller curated datasets. However, the fine-tuning
process robs them of the ability to produce diverse responses, eventually
reducing them to dull conversation partners. In this paper we investigate if
prompting can mitigate the above trade-off. Specifically, we experiment with
conditioning the prompt on the query, rather than training a single prompt for
all queries. By following the intuition that freezing the pre-trained language
model will conserve its expressivity, we find that compared to fine-tuning,
prompting can achieve a higher BLEU score and substantially improve the
diversity and novelty of the responses.
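The core idea of conditioning the prompt on the query, while keeping the pre-trained language model frozen, can be illustrated with a short sketch. The following is only an illustrative assumption of how such a setup might look, using GPT-2 as the backbone and a single linear prompt encoder; the paper's actual architecture and hyperparameters may differ.

```python
# Minimal sketch: query-conditioned soft prompts prepended to a frozen LM.
# GPT-2 and the single-layer prompt encoder are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer

class QueryConditionedPrompt(nn.Module):
    """Maps the query's token embeddings to a short sequence of soft prompt vectors."""
    def __init__(self, embed_dim: int, prompt_len: int = 10):
        super().__init__()
        self.prompt_len = prompt_len
        self.proj = nn.Linear(embed_dim, prompt_len * embed_dim)

    def forward(self, query_embeds: torch.Tensor) -> torch.Tensor:
        # query_embeds: (batch, query_len, dim) -> mean-pool, then expand to prompt_len vectors
        pooled = query_embeds.mean(dim=1)                       # (batch, dim)
        prompts = self.proj(pooled)                             # (batch, prompt_len * dim)
        return prompts.view(-1, self.prompt_len, query_embeds.size(-1))

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")
for p in lm.parameters():                                       # freeze the backbone
    p.requires_grad = False

prompt_net = QueryConditionedPrompt(lm.config.n_embd)

query = tokenizer("How was your weekend?", return_tensors="pt")
query_embeds = lm.get_input_embeddings()(query.input_ids)       # (1, q_len, dim)
soft_prompt = prompt_net(query_embeds)                           # (1, prompt_len, dim)

# Prepend the soft prompt to the query embeddings; only prompt_net receives gradients.
inputs_embeds = torch.cat([soft_prompt, query_embeds], dim=1)
outputs = lm(inputs_embeds=inputs_embeds)
print(outputs.logits.shape)
```

Because only the small prompt encoder is trained, the frozen backbone retains the broad distribution it learned during pre-training, which is the intuition behind the improved diversity and novelty reported above.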
Related papers
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z) - Stabilized In-Context Learning with Pre-trained Language Models for Few
Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
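A rough sketch of the idea of trimming each exemplar dialogue to its most salient turns so that more exemplars fit in a fixed prompt budget is given below. The saliency scores, whitespace token counting, and greedy packing are illustrative assumptions; the paper uses a learned saliency model.

```python
# Illustrative sketch: pack more in-context exemplars under a token budget by
# keeping only the highest-saliency turns of each dialogue.
from typing import List, Tuple

def trim_dialogue(turns: List[str], saliency: List[float], max_tokens: int) -> List[str]:
    """Keep the highest-saliency turns (in original order) within max_tokens."""
    ranked = sorted(range(len(turns)), key=lambda i: saliency[i], reverse=True)
    kept, used = set(), 0
    for i in ranked:
        cost = len(turns[i].split())          # crude token count for illustration
        if used + cost <= max_tokens:
            kept.add(i)
            used += cost
    return [turns[i] for i in sorted(kept)]

def pack_exemplars(exemplars: List[Tuple[List[str], List[float]]],
                   budget: int, per_exemplar: int) -> List[List[str]]:
    """Greedily add trimmed exemplars until the overall prompt budget is spent."""
    packed, used = [], 0
    for turns, saliency in exemplars:
        trimmed = trim_dialogue(turns, saliency, per_exemplar)
        cost = sum(len(t.split()) for t in trimmed)
        if used + cost > budget:
            break
        packed.append(trimmed)
        used += cost
    return packed
```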
arXiv Detail & Related papers (2023-02-12T15:05:10Z) - Contextual Dynamic Prompting for Response Generation in Task-oriented
Dialog Systems [8.419582942080927]
Response generation is one of the critical components in task-oriented dialog systems.
We propose an approach that performs dynamic prompting, where the prompts are learnt from dialog contexts.
We show that contextual dynamic prompts improve response generation in terms of combined score (Mehri et al., 2019) by 3 absolute points.
arXiv Detail & Related papers (2023-01-30T20:26:02Z) - AutoReply: Detecting Nonsense in Dialogue Introspectively with
Discriminative Replies [71.62832112141913]
We show that dialogue models can detect errors in their own messages introspectively, by calculating the likelihood of replies that are indicative of poor messages.
We first show that hand-crafted replies can be effective for the task of detecting nonsense in applications as complex as Diplomacy.
We find that AutoReply-generated replies outperform handcrafted replies and perform on par with carefully fine-tuned large supervised models.
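The introspective scoring can be pictured as computing the likelihood the dialogue model assigns to a probe reply that signals confusion, conditioned on the candidate message. The sketch below is an assumption of that setup, with GPT-2 standing in for the dialogue model; the probe text and threshold are illustrative, not the paper's.

```python
# Illustrative sketch: flag a message as nonsense when the model finds a
# "that doesn't make sense" probe reply unusually likely as a continuation.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def reply_log_likelihood(message: str, probe_reply: str) -> float:
    """Average log-probability of probe_reply given message under the model."""
    ctx = tokenizer(message, return_tensors="pt").input_ids
    reply = tokenizer(probe_reply, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx, reply], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probs of each reply token, predicted from the preceding position.
    log_probs = torch.log_softmax(logits[0, ctx.size(1) - 1:-1], dim=-1)
    token_lp = log_probs.gather(1, reply[0].unsqueeze(1)).squeeze(1)
    return token_lp.mean().item()

probe = " Sorry, that message doesn't make any sense."
score = reply_log_likelihood("I will move my army from Paris to the Moon.", probe)
print("flag as nonsense" if score > -4.0 else "looks fine")  # threshold is illustrative
```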
arXiv Detail & Related papers (2022-11-22T22:31:34Z) - Controllable Dialogue Simulation with In-Context Learning [39.04491297557292]
Dialogic is a dialogue simulation method based on in-context learning with large language models.
Our method can rapidly expand a small set of dialogue data with minimum or zero human involvement.
Our simulated dialogues have near-human fluency and annotation accuracy.
arXiv Detail & Related papers (2022-10-09T06:32:58Z) - Adapting Task-Oriented Dialogue Models for Email Conversations [4.45709593827781]
In this paper, we provide an effective transfer learning framework (EMToD) that allows the latest development in dialogue models to be adapted for long-form conversations.
We show that the proposed EMToD framework improves intent detection performance over pre-trained language models by 45% and over pre-trained dialogue models by 30% for task-oriented email conversations.
arXiv Detail & Related papers (2022-08-19T16:41:34Z) - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z) - Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task.
Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts.
Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z) - The Adapter-Bot: All-In-One Controllable Conversational Model [66.48164003532484]
We propose a dialogue model that uses a fixed backbone model such as DialoGPT and triggers on-demand dialogue skills via different adapters.
Depending on the skills, the model is able to process multiple knowledge types, such as text, tables, and empathetic responses.
We evaluate our model using automatic evaluation by comparing it with existing state-of-the-art conversational models.
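The adapter mechanism can be sketched as a frozen backbone whose hidden states are passed through a small skill-specific bottleneck chosen at inference time. The module below is an illustrative assumption of that pattern; the skill names and dimensions are hypothetical, and the integration with DialoGPT is omitted.

```python
# Illustrative sketch: one small residual adapter per dialogue skill, applied
# on demand on top of frozen backbone hidden states.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small residual bottleneck applied to a frozen hidden state."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.up(torch.relu(self.down(hidden)))

class AdapterBank(nn.Module):
    """Holds one adapter per dialogue skill; only the selected adapter is applied."""
    def __init__(self, dim: int, skills: list):
        super().__init__()
        self.adapters = nn.ModuleDict({s: BottleneckAdapter(dim) for s in skills})

    def forward(self, hidden: torch.Tensor, skill: str) -> torch.Tensor:
        return self.adapters[skill](hidden)

bank = AdapterBank(dim=768, skills=["knowledge", "tables", "empathy"])
hidden = torch.randn(1, 12, 768)           # stand-in for frozen backbone hidden states
out = bank(hidden, skill="empathy")        # route to the on-demand skill adapter
print(out.shape)
```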
arXiv Detail & Related papers (2020-08-28T10:59:31Z) - Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data [75.7372052716556]
"Dialog without Dialog" requires agents to develop dialog models that can adapt to new tasks without language level supervision.
By factorizing intention and language, our model minimizes linguistic drift after fine-tuning for new tasks.
arXiv Detail & Related papers (2020-07-24T19:35:57Z)