KPT: Keyword-guided Pre-training for Grounded Dialog Generation
- URL: http://arxiv.org/abs/2212.01739v1
- Date: Sun, 4 Dec 2022 04:05:01 GMT
- Title: KPT: Keyword-guided Pre-training for Grounded Dialog Generation
- Authors: Qi Zhu, Fei Mi, Zheng Zhang, Yasheng Wang, Yitong Li, Xin Jiang, Qun
Liu, Xiaoyan Zhu, Minlie Huang
- Abstract summary: We propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation.
Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords.
We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages.
- Score: 82.68787152707455
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incorporating external knowledge into the response generation process is
essential to building more helpful and reliable dialog agents. However,
collecting knowledge-grounded conversations is often costly, calling for a
better pre-trained model for grounded dialog generation that generalizes well
w.r.t. different types of knowledge. In this work, we propose KPT
(Keyword-guided Pre-Training), a novel self-supervised pre-training method for
grounded dialog generation without relying on extra knowledge annotation.
Specifically, we use a pre-trained language model to extract the most uncertain
tokens in the dialog as keywords. With these keywords, we construct two kinds
of knowledge and pre-train a knowledge-grounded response generation model,
aiming at handling two different scenarios: (1) the knowledge should be
faithfully grounded; (2) it can be selectively used. For the former, the
grounding knowledge consists of keywords extracted from the response. For the
latter, the grounding knowledge is additionally augmented with keywords
extracted from other utterances in the same dialog. Since the knowledge is
extracted from the dialog itself, KPT can be easily performed on a large volume
and variety of dialogue data. We considered three data sources (open-domain,
task-oriented, conversational QA) with a total of 2.5M dialogues. We conduct
extensive experiments on various few-shot knowledge-grounded generation tasks,
including grounding on dialog acts, knowledge graphs, persona descriptions, and
Wikipedia passages. Our comprehensive experiments and analyses demonstrate that
KPT consistently outperforms state-of-the-art methods on these tasks with
diverse grounding knowledge.
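
To make the keyword-extraction step concrete, here is a minimal sketch (not the authors' released code): each token of an utterance is scored by its negative log-likelihood under an off-the-shelf causal LM (GPT-2 is assumed here purely for illustration), and the highest-uncertainty tokens are kept as keywords. KPT's exact scoring and filtering may differ.

```python
# Illustrative sketch: rank tokens by how "uncertain" a pre-trained causal LM
# is about them, and keep the top-k as keywords (GPT-2 is an assumption).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def extract_keywords(utterance: str, top_k: int = 5):
    enc = tokenizer(utterance, return_tensors="pt")
    input_ids = enc["input_ids"]                          # (1, T)
    with torch.no_grad():
        logits = model(**enc).logits                      # (1, T, V)
    # Negative log-likelihood of each token given its left context.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = input_ids[:, 1:]
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)[0]  # (T-1,)
    tokens = tokenizer.convert_ids_to_tokens(targets[0].tolist())
    # Tokens with the highest NLL are the ones the LM is least certain about.
    ranked = sorted(zip(tokens, nll.tolist()), key=lambda x: -x[1])
    return [tok.lstrip("Ġ") for tok, _ in ranked[:top_k]]

print(extract_keywords("I adopted a corgi puppy from the shelter last weekend."))
```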
Related papers
- PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model [79.64376762489164]
PK-Chat is a pointer-network-guided generative dialogue model that combines a unified pre-trained language model with a pointer network over knowledge graphs.
The words PK-Chat generates in a dialogue are drawn both from predictions over the word list and from direct predictions over the external knowledge graph.
Based on the PK-Chat, a dialogue system is built for academic scenarios in the case of geosciences.
arXiv Detail & Related papers (2023-04-02T18:23:13Z)
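
As a loose illustration of the pointer mechanism described above (my own sketch, not PK-Chat's implementation), the decoder's vocabulary distribution can be mixed with a copy distribution over knowledge-graph entity tokens via a learned gate:

```python
# Pointer-network style mixing: a gate p_gen decides how much probability mass
# comes from the vocabulary softmax vs. copying knowledge-graph entity tokens.
# Tensor shapes and the gate module are illustrative assumptions.
import torch
import torch.nn.functional as F

def pointer_mixture(decoder_state, vocab_logits, kg_token_ids, kg_attn_scores, gen_gate):
    """decoder_state: (B, H); vocab_logits: (B, V); kg_token_ids: (B, K) vocabulary ids
    of KG entity tokens; kg_attn_scores: (B, K) attention scores over those tokens."""
    p_vocab = F.softmax(vocab_logits, dim=-1)                       # (B, V)
    attn = F.softmax(kg_attn_scores, dim=-1)                        # (B, K)
    p_copy = torch.zeros_like(p_vocab).scatter_add(1, kg_token_ids, attn)
    p_gen = torch.sigmoid(gen_gate(decoder_state))                  # (B, 1)
    return p_gen * p_vocab + (1.0 - p_gen) * p_copy                 # rows sum to 1

# Toy usage with random tensors:
B, H, V, K = 2, 16, 100, 4
gate = torch.nn.Linear(H, 1)
probs = pointer_mixture(torch.randn(B, H), torch.randn(B, V),
                        torch.randint(0, V, (B, K)), torch.randn(B, K), gate)
print(probs.sum(dim=-1))  # tensor([1., 1.])
```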
- PK-ICR: Persona-Knowledge Interactive Context Retrieval for Grounded Dialogue [21.266410719325208]
Persona and Knowledge Dual Context Identification is a task to identify persona and knowledge jointly for a given dialogue.
We develop a novel grounding retrieval method that utilizes all contexts of dialogue simultaneously.
arXiv Detail & Related papers (2023-02-13T20:27:26Z)
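
A rough sketch of retrieval that uses all dialogue contexts at once (a simplification of my own; PK-ICR additionally models persona-knowledge interaction): concatenate every turn and score grounding candidates against the whole dialogue rather than only the last utterance.

```python
# Illustrative whole-context grounding retrieval with TF-IDF similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_grounding(turns, candidates, top_k=1):
    full_context = " ".join(turns)            # use all dialogue contexts at once
    vec = TfidfVectorizer().fit(candidates + [full_context])
    sims = cosine_similarity(vec.transform([full_context]), vec.transform(candidates))[0]
    ranked = sorted(zip(candidates, sims), key=lambda x: -x[1])
    return [c for c, _ in ranked[:top_k]]

turns = ["I love hiking.", "Where should I go this weekend?"]
candidates = ["I have two cats.", "My favorite hobby is hiking in the mountains."]
print(retrieve_grounding(turns, candidates))
```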
- Position Matters! Empirical Study of Order Effect in Knowledge-grounded Dialogue [54.98184262897166]
We investigate how the order of the knowledge set can influence autoregressive dialogue systems' responses.
We propose a simple and novel technique to alleviate the order effect by modifying the position embeddings of knowledge input.
arXiv Detail & Related papers (2023-02-12T10:13:00Z)
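
One way to read the position-embedding trick (a minimal sketch under my own assumptions; the paper's exact scheme may differ) is to assign every knowledge snippet the same block of position ids, so the model cannot exploit the order in which snippets happen to be concatenated:

```python
# Order-invariant position ids: each knowledge snippet restarts from the same
# offset, while the dialogue history keeps ordinary increasing positions.
from typing import List

def order_invariant_position_ids(knowledge_lens: List[int], history_len: int, offset: int = 0):
    pos = []
    for n in knowledge_lens:
        pos.extend(range(offset, offset + n))           # every snippet starts at `offset`
    longest = max(knowledge_lens, default=0)
    pos.extend(range(offset + longest, offset + longest + history_len))
    return pos

# Three knowledge snippets of length 4, 6, 5 plus a 10-token history:
print(order_invariant_position_ids([4, 6, 5], history_len=10))
```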
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
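
The contrastive objective can be sketched generically (my simplification; SPACE-2's actual loss is tree-structured and spans both labeled and unlabeled dialogs): pull an anchor dialog representation toward a positive view and push it away from negatives with an InfoNCE-style loss.

```python
# Generic InfoNCE sketch over dialog representations: labeled dialogs with the
# same semantics can serve as positives, unlabeled dialogs use augmented views.
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature: float = 0.07):
    """anchor, positive: (D,); negatives: (N, D) dialog representations."""
    anchor = F.normalize(anchor, dim=-1)
    candidates = F.normalize(torch.cat([positive.unsqueeze(0), negatives], dim=0), dim=-1)
    logits = candidates @ anchor / temperature          # (N+1,), positive at index 0
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

loss = info_nce(torch.randn(128), torch.randn(128), torch.randn(16, 128))
print(loss.item())
```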
- HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data [87.67278915655712]
We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables.
The conversations are created through the decomposition of complex multihop questions into simple, realistic multiturn dialogue interactions.
arXiv Detail & Related papers (2022-04-28T00:52:16Z)
- Achieving Conversational Goals with Unsupervised Post-hoc Knowledge Injection [37.15893335147598]
Current neural dialog models tend to lack specificity and informativeness in their generated responses.
We propose a post-hoc knowledge-injection technique where we first retrieve a diverse set of relevant knowledge snippets conditioned on both the dialog history and an initial response from an existing dialog model.
We construct multiple candidate responses, individually injecting each retrieved snippet into the initial response using a gradient-based decoding method, and then select the final response with an unsupervised ranking step.
arXiv Detail & Related papers (2022-03-22T00:42:27Z)
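
To illustrate only the final, unsupervised ranking step (a sketch under my own assumptions; the paper builds the candidates with gradient-based decoding rather than plain scoring), the knowledge-injected candidates can be ranked by their fluency under an off-the-shelf LM conditioned on the dialog history:

```python
# Unsupervised candidate selection: lower mean token NLL under GPT-2 wins.
# GPT-2 and the scoring function are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def neg_log_likelihood(text: str) -> float:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = lm(**enc, labels=enc["input_ids"])
    return out.loss.item()  # mean token NLL

def select_response(history: str, candidates: list[str]) -> str:
    # score each candidate jointly with the dialog history
    return min(candidates, key=lambda c: neg_log_likelihood(history + " " + c))

history = "A: Any plans for the weekend? B:"
candidates = ["I might go hiking near the lake.", "banana keyboard window."]
print(select_response(history, candidates))
```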
- Multi-turn Dialogue Reading Comprehension with Pivot Turns and Knowledge [43.352833140317486]
Multi-turn dialogue reading comprehension aims to teach machines to read dialogue contexts and solve tasks such as response selection and answering questions.
This work tackles these challenges by extracting the most informative turns as pivot utterances.
We propose a pivot-oriented deep selection model (PoDS) on top of the Transformer-based language models for dialogue comprehension.
arXiv Detail & Related papers (2021-02-10T15:00:12Z)
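
A toy version of pivot-turn selection (purely illustrative; PoDS scores turns with Transformer-based encoders and external knowledge, not lexical overlap): rank dialogue turns by their overlap with the question and keep the top-k as pivots.

```python
# Hypothetical pivot-turn selection by lexical overlap with the question.
def select_pivot_turns(turns: list[str], question: str, top_k: int = 2) -> list[str]:
    q_tokens = set(question.lower().split())
    def overlap(turn: str) -> float:
        t_tokens = set(turn.lower().split())
        return len(t_tokens & q_tokens) / (len(t_tokens) + 1e-9)
    return sorted(turns, key=overlap, reverse=True)[:top_k]

turns = [
    "Hi, I'm looking for a cheap hotel downtown.",
    "Sure, the City Inn is downtown and costs 40 dollars a night.",
    "Great, does it have free parking?",
]
print(select_pivot_turns(turns, "How much does the City Inn cost per night?"))
```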
- Knowledge Injection into Dialogue Generation via Language Models [85.65843021510521]
InjK is a two-stage approach to inject knowledge into a dialogue generation model.
First, we train a large-scale language model and query it for textual knowledge.
Second, we frame dialogue generation as sequentially generating the textual knowledge and then the corresponding response.
arXiv Detail & Related papers (2020-04-30T07:31:24Z)
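
The two-stage framing can be sketched as two sequential generation calls (a hedged illustration with an untuned GPT-2; InjK fine-tunes its own models and input formats): the first call produces a textual knowledge string from the dialog context, and the second produces a response conditioned on both the context and that knowledge.

```python
# Schematic knowledge-then-response generation with one causal LM (assumption).
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def generate(prompt: str, max_new_tokens: int = 30) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=max_new_tokens,
                         do_sample=False, pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

context = "User: Who directed Inception?"
knowledge = generate(context + "\nKnowledge:")                               # stage 1
response = generate(context + "\nKnowledge:" + knowledge + "\nResponse:")    # stage 2
print(knowledge, "|", response)
```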