Response Generation with Context-Aware Prompt Learning
- URL: http://arxiv.org/abs/2111.02643v1
- Date: Thu, 4 Nov 2021 05:40:13 GMT
- Title: Response Generation with Context-Aware Prompt Learning
- Authors: Xiaodong Gu, Kang Min Yoo, Sang-Woo Lee
- Abstract summary: We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task.
Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts.
Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
- Score: 19.340498579331555
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Pre-trained language models (PLMs) have marked a huge leap in neural dialogue
modeling. While PLMs are pre-trained on large-scale text corpora, they are
usually fine-tuned on scarce dialogue data with specific domain knowledge and
dialogue styles. However, tailoring the language models while fully utilizing
prior knowledge in large pre-trained models remains a challenge. In this paper,
we present a novel approach for pre-trained dialogue modeling that casts the
dialogue generation problem as a prompt-learning task. Instead of fine-tuning
on limited dialogue data, our approach, DialogPrompt, learns continuous prompt
embeddings optimized for dialogue contexts, which appropriately elicit
knowledge from the large pre-trained model. To encourage the model to better
utilize the prompt embeddings, the prompt encoders are designed to be
conditioned on the input dialogue context. Experiments on popular conversation
datasets show that our approach significantly outperforms the fine-tuning
baseline and the generic prompt-learning methods. Furthermore, human
evaluations strongly support the superiority of DialogPrompt in regard to
response generation quality.
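The core mechanism described in the abstract is a prompt encoder that maps the dialogue context to continuous prompt embeddings, which are prepended to the input of a frozen pre-trained model. A minimal sketch of that idea follows; it is not the authors' implementation, and the encoder architecture, pooling strategy, and prompt length are illustrative assumptions.
```python
# Sketch of context-conditioned continuous prompt learning (assumed design,
# not the DialogPrompt code): a small prompt encoder reads the dialogue
# context and emits prompt embeddings prepended to a frozen GPT-2 input.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer


class ContextPromptEncoder(nn.Module):
    """Maps a pooled dialogue-context vector to N continuous prompt embeddings."""

    def __init__(self, hidden_size: int, prompt_length: int = 10):
        super().__init__()
        self.prompt_length = prompt_length
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, prompt_length * hidden_size),
        )

    def forward(self, context_hidden: torch.Tensor) -> torch.Tensor:
        # context_hidden: (batch, hidden) pooled representation of the context
        prompts = self.mlp(context_hidden)
        return prompts.view(-1, self.prompt_length, context_hidden.size(-1))


tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
for p in model.parameters():      # keep the pre-trained model frozen;
    p.requires_grad = False       # only the prompt encoder would be trained

prompt_encoder = ContextPromptEncoder(model.config.n_embd, prompt_length=10)

context = "A: How was the movie? B: Honestly, better than I expected."
ids = tokenizer(context, return_tensors="pt").input_ids

with torch.no_grad():
    token_embeds = model.transformer.wte(ids)            # (1, T, H)
context_repr = token_embeds.mean(dim=1)                  # crude pooling for the sketch

prompt_embeds = prompt_encoder(context_repr)                 # (1, N, H)
inputs_embeds = torch.cat([prompt_embeds, token_embeds], 1)  # prepend prompts

out = model(inputs_embeds=inputs_embeds)
print(out.logits.shape)  # (1, N + T, vocab_size)
```
In this sketch only the prompt encoder receives gradients, which mirrors the abstract's stated goal of eliciting knowledge from the frozen pre-trained model rather than fine-tuning it on scarce dialogue data.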
Related papers
- Contextual Data Augmentation for Task-Oriented Dialog Systems [8.085645180329417]
We develop a novel dialog augmentation model that generates a user turn, conditioning on full dialog context.
With a new prompt design for the language model and output re-ranking, the dialogs generated by our model can be directly used to train downstream dialog systems.
arXiv Detail & Related papers (2023-10-16T13:22:34Z) - FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for
Task-Oriented Dialogue [20.79359173822053]
We propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context.
Our intuition is that a good dialogue representation both learns local context information and predicts future information.
arXiv Detail & Related papers (2023-06-17T10:40:07Z) - Stabilized In-Context Learning with Pre-trained Language Models for Few
Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z) - Controllable Dialogue Simulation with In-Context Learning [39.04491297557292]
Dialogic is a dialogue simulation method based on in-context learning with large language models.
Our method can rapidly expand a small set of dialogue data with minimal or zero human involvement.
Our simulated dialogues have near-human fluency and annotation accuracy.
arXiv Detail & Related papers (2022-10-09T06:32:58Z) - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z) - Post-Training Dialogue Summarization using Pseudo-Paraphrasing [12.083992819138716]
We propose to post-train pretrained language models (PLMs) to rephrase dialogues into narratives.
Comprehensive experiments show that our approach significantly improves vanilla PLMs on dialogue summarization.
arXiv Detail & Related papers (2022-04-28T13:42:19Z) - Back to the Future: Bidirectional Information Decoupling Network for
Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z) - Advances in Multi-turn Dialogue Comprehension: A Survey [51.215629336320305]
Training machines to understand natural language and interact with humans is an elusive and essential task of artificial intelligence.
This paper reviews the previous methods from the technical perspective of dialogue modeling for the dialogue comprehension task.
In addition, we categorize dialogue-related pre-training techniques which are employed to enhance PrLMs in dialogue scenarios.
arXiv Detail & Related papers (2021-10-11T03:52:37Z) - Dialogue-oriented Pre-training [70.03028879331339]
We propose three strategies to simulate conversational features on general plain text.
Dialog-PrLM is fine-tuned on three public multi-turn dialogue datasets.
arXiv Detail & Related papers (2021-06-01T12:02:46Z) - DialogBERT: Discourse-Aware Response Generation via Learning to Recover
and Rank Utterances [18.199473005335093]
This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models.
To efficiently capture the discourse-level coherence among utterances, we propose two training objectives, including masked utterance regression.
Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines.
arXiv Detail & Related papers (2020-12-03T09:06:23Z) - Knowledge Injection into Dialogue Generation via Language Models [85.65843021510521]
InjK is a two-stage approach to inject knowledge into a dialogue generation model.
First, we train a large-scale language model and query it for textual knowledge.
Second, we frame a dialogue generation model to sequentially generate textual knowledge and a corresponding response.
arXiv Detail & Related papers (2020-04-30T07:31:24Z)