A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation
- URL: http://arxiv.org/abs/2403.03496v1
- Date: Wed, 6 Mar 2024 06:54:02 GMT
- Title: A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation
- Authors: Xiangci Li, Linfeng Song, Lifeng Jin, Haitao Mi, Jessica Ouyang, Dong Yu
- Abstract summary: We present a benchmark named multi-source Wizard of Wikipedia for evaluating multi-source dialogue knowledge selection and response generation.
We propose a new challenge, dialogue knowledge plug-and-play, which aims to test an already trained dialogue model on using new support knowledge from previously unseen sources.
- Score: 51.31429493814664
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge-based, open-domain dialogue generation aims to build chit-chat
systems that talk to humans using mined support knowledge. Many types and
sources of knowledge have previously been shown to be useful as support
knowledge. Even in the era of large language models, response generation
grounded in knowledge retrieved from additional up-to-date sources remains a
practically important approach. While prior work using single-source knowledge
has shown a clear positive correlation between the performances of knowledge
selection and response generation, there are no existing multi-source datasets
for evaluating support knowledge retrieval. Further, prior work has assumed
that the knowledge sources available at test time are the same as during
training. This unrealistic assumption unnecessarily handicaps models, as new
knowledge sources can become available after a model is trained. In this paper,
we present a high-quality benchmark named multi-source Wizard of Wikipedia
(Ms.WoW) for evaluating multi-source dialogue knowledge selection and response
generation. Unlike existing datasets, it contains clean support knowledge,
grounded at the utterance level and partitioned into multiple knowledge
sources. We further propose a new challenge, dialogue knowledge plug-and-play,
which aims to test an already trained dialogue model on using new support
knowledge from previously unseen sources in a zero-shot fashion.
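To make the plug-and-play challenge concrete, here is a minimal sketch, not the paper's system, of a lexical knowledge selector that is fit without ever seeing a new source and is then asked to rank snippets from that source zero-shot; all snippets and source names below are invented for illustration.

```python
# Minimal sketch of dialogue knowledge plug-and-play: a selector fit
# without access to a knowledge source must rank snippets from that
# source at test time, with no retraining. All data here is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

dialogue_context = "I've been getting into trail running lately."

seen_source = [  # available when the selector is built
    "Trail running is a sport which consists of running over trails.",
    "Road running takes place on paved routes.",
]
new_source = [  # plugged in at test time, previously unseen
    "Proper trail running shoes have aggressive tread for loose terrain.",
    "Marathons are road races of 42.195 kilometres.",
]

selector = TfidfVectorizer().fit(seen_source + [dialogue_context])

def select_knowledge(context: str, source: list[str]) -> str:
    """Rank candidate snippets from any (possibly unseen) source."""
    sims = cosine_similarity(
        selector.transform([context]), selector.transform(source)
    )[0]
    return source[sims.argmax()]

print(select_knowledge(dialogue_context, new_source))
```

In Ms.WoW itself, selections over an unseen source can be scored against the benchmark's utterance-level grounding annotations.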
Related papers
- Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogue [72.26474540602517]
SAFARI is a novel framework that leverages large language models for planning, understanding, and incorporating multiple knowledge sources under both supervised and unsupervised settings.
We construct a personalized knowledge-grounded dialogue dataset, Knowledge Behind Persona (KBP).
Experimental results on the KBP dataset demonstrate that the SAFARI framework can effectively produce persona-consistent and knowledge-enhanced responses.
arXiv Detail & Related papers (2023-10-13T03:38:38Z)
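As a rough illustration of the planning-then-grounding loop, and not SAFARI's actual implementation, the sketch below asks an LLM which sources a reply should draw on and then conditions generation on the retrieved entries; the `call_llm` hook, source names, and persona entries are all placeholders.

```python
# Hedged sketch of LLM-based source planning in the spirit of SAFARI.
# `call_llm` stands in for any chat-completion API; data is invented.
from typing import Callable

SOURCES = {
    "persona": ["I am a nurse.", "I love hiking."],
    "knowledge": ["Hiking burns roughly 400-500 kcal per hour."],
}

def plan_sources(call_llm: Callable[[str], str], user_turn: str) -> list[str]:
    # Step 1 (planning): ask the LLM which sources the reply needs.
    prompt = (
        f"User said: {user_turn}\n"
        f"Available sources: {list(SOURCES)}\n"
        "Reply with a comma-separated list of sources to consult, or NULL."
    )
    plan = call_llm(prompt)
    return [s.strip() for s in plan.split(",") if s.strip() in SOURCES]

def respond(call_llm: Callable[[str], str], user_turn: str) -> str:
    # Steps 2-3 (retrieving + incorporating): ground the reply in the plan.
    picked = plan_sources(call_llm, user_turn)
    facts = [fact for s in picked for fact in SOURCES[s]]
    return call_llm(f"Using {facts}, reply to: {user_turn}")

# Usage with a canned stand-in for a real LLM:
fake_llm = lambda p: ("persona, knowledge" if "comma-separated" in p
                      else "I love hiking too; it keeps this nurse fit!")
print(respond(fake_llm, "Do you like the outdoors?"))
```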
- The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources in Natural Language Understanding Systems [87.3207729953778]
We evaluate state-of-the-art coreference resolution models on our dataset.
Several models struggle to reason on-the-fly over knowledge observed both at pretraining time and at inference time.
Still, even the best-performing models seem to have difficulty reliably integrating knowledge presented only at inference time.
arXiv Detail & Related papers (2022-12-15T23:26:54Z)
- Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog [12.081212540168055]
We present a modified version of the MultiWOZ-based dataset prepared by SeKnow to demonstrate how current methods suffer significant performance degradation.
In line with recent work exploiting pre-trained language models, we fine-tune a BART-based model using prompts for the task of querying knowledge sources.
We demonstrate that our model is robust to perturbations of knowledge modality (source of information) and that it can fuse information from structured as well as unstructured knowledge to generate responses.
arXiv Detail & Related papers (2022-10-13T18:49:59Z)
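To show what such a prompted query interface might look like, here is a hedged sketch, not the authors' released code: a BART checkpoint is fed the dialogue plus a task prompt and asked to generate a knowledge-source query; the prompt wording is an invented stand-in for the fine-tuned format.

```python
# Sketch of prompting a seq2seq model to emit a knowledge-source query.
# The base checkpoint only demonstrates the wiring; the paper fine-tunes
# a BART-based model so that the output is a usable query.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

name = "facebook/bart-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

dialogue = "User: Is breakfast included at the Alpha Hotel?"
prompt = f"generate knowledge query: {dialogue}"  # illustrative prompt format

inputs = tokenizer(prompt, return_tensors="pt")
query_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(query_ids[0], skip_special_tokens=True))
```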
- Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation [78.85622982191522]
Existing systems perform poorly on unseen topics due to limited topics covered in the training data.
We present PLUG, a language model that homogenizes different knowledge sources to a unified knowledge representation.
It achieves performance comparable to state-of-the-art methods under a fully-supervised setting.
arXiv Detail & Related papers (2021-12-15T07:11:02Z)
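One way to picture the unified representation is the linearization sketch below; the exact scheme is an assumption for illustration, not PLUG's published format.

```python
# Homogenizing heterogeneous knowledge (KG triples, table rows, passages)
# into a single flat text string a seq2seq generator can consume.
def linearize(knowledge: dict) -> str:
    parts = []
    for subj, rel, obj in knowledge.get("triples", []):
        parts.append(f"{subj} {rel.replace('_', ' ')} {obj}")
    for row in knowledge.get("table_rows", []):
        parts.append(", ".join(f"{k}: {v}" for k, v in row.items()))
    parts.extend(knowledge.get("passages", []))  # free text kept as-is
    return " | ".join(parts)

knowledge = {
    "triples": [("Lisbon", "capital_of", "Portugal")],
    "table_rows": [{"city": "Lisbon", "population": "545,000"}],
    "passages": ["Lisbon is known for its historic trams."],
}
# The unified string would be prepended to the dialogue context.
print(linearize(knowledge))
```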
- Improving Commonsense Question Answering by Graph-based Iterative Retrieval over Multiple Knowledge Sources [26.256653692882715]
How to effectively incorporate commonsense knowledge into question-answering systems is still under exploration.
We propose a novel question-answering method that integrates ConceptNet, Wikipedia, and the Cambridge Dictionary via graph-based iterative retrieval.
We use a pre-trained language model to encode the question, retrieved knowledge, and answer choices, and propose an answer choice-aware attention mechanism.
arXiv Detail & Related papers (2020-11-05T08:50:43Z)
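The snippet below sketches a generic answer choice-aware attention step with random tensors, an illustration of the idea rather than the paper's exact architecture: each choice attends over the encoded question-plus-knowledge sequence, and the best-matching choice wins.

```python
import torch

torch.manual_seed(0)
hidden = 64
context = torch.randn(10, hidden)    # encoded question + retrieved knowledge
choices = torch.randn(3, 5, hidden)  # 3 answer choices, 5 tokens each

scores = []
for choice in choices:
    # Each choice token attends over the context (scaled dot-product).
    attn = torch.softmax(choice @ context.T / hidden**0.5, dim=-1)
    fused = attn @ context                 # choice-aware context summary
    scores.append((choice * fused).sum())  # similarity as the choice score
print(torch.stack(scores).argmax().item())  # index of the predicted answer
```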
- Question Answering over Knowledge Base using Language Model Embeddings [0.0]
This paper focuses on using a pre-trained language model for the Knowledge Base Question Answering task.
We further fine-tune these embeddings with a two-way attention mechanism between the knowledge base and the asked question.
Our method is based on a simple Convolutional Neural Network architecture with a Multi-Head Attention mechanism to represent the asked question.
arXiv Detail & Related papers (2020-10-17T22:59:34Z)
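A rough sketch of such a question encoder follows, with invented layer sizes and without the knowledge-base side of the two-way attention: multi-head self-attention over pre-trained embeddings, a 1-D convolution, and max-pooling produce a single question vector.

```python
import torch
import torch.nn as nn

emb_dim, n_tokens = 64, 12
question = torch.randn(1, n_tokens, emb_dim)  # pre-trained LM embeddings

mha = nn.MultiheadAttention(emb_dim, num_heads=4, batch_first=True)
conv = nn.Conv1d(emb_dim, emb_dim, kernel_size=3, padding=1)

attended, _ = mha(question, question, question)  # self-attention (1, 12, 64)
features = conv(attended.transpose(1, 2))        # convolution    (1, 64, 12)
question_vec = features.max(dim=-1).values       # max-pool       (1, 64)
print(question_vec.shape)
```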
- Unsupervised Commonsense Question Answering with Self-Talk [71.63983121558843]
We propose an unsupervised framework based on self-talk as a novel alternative for commonsense tasks.
Inspired by inquiry-based discovery learning, our approach queries language models with a number of information-seeking questions.
Empirical results demonstrate that the self-talk procedure substantially improves the performance of zero-shot language model baselines.
arXiv Detail & Related papers (2020-04-11T20:43:37Z)
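The loop itself is simple enough to sketch; the version below is a minimal stand-in in which the question prefixes are abbreviated and `fake_lm` replaces a real language model (the paper uses curated, task-specific prefixes).

```python
from typing import Callable

QUESTION_PREFIXES = ["What is the definition of", "What is the purpose of"]

def self_talk(lm: Callable[[str], str], context: str, concept: str) -> str:
    # Elicit clarification Q/A pairs from the LM itself, then append them
    # as extra context before scoring candidate answers zero-shot.
    clarifications = []
    for prefix in QUESTION_PREFIXES:
        question = f"{prefix} {concept}?"
        answer = lm(question)  # the model answers its own question
        clarifications.append(f"{question} {answer}")
    return context + " " + " ".join(clarifications)

fake_lm = lambda q: ("a tool for gripping and turning bolts." if "wrench" in q
                     else "unknown.")
print(self_talk(fake_lm, "Dana grabbed a wrench.", "a wrench"))
```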
- Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue [51.513276162736844]
We propose a sequential latent variable model as the first approach to sequential knowledge selection in multi-turn knowledge-grounded dialogue.
The model, named sequential knowledge transformer (SKT), keeps track of the prior and posterior distributions over knowledge.
arXiv Detail & Related papers (2020-02-18T11:59:59Z)
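A schematic of the prior/posterior bookkeeping, simplified from SKT for illustration with random tensors: the prior over knowledge candidates sees only the dialogue history, the posterior also sees the response, and a KL term pulls the prior toward the posterior during training.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
hidden, n_candidates = 32, 4
history = torch.randn(hidden)                  # encoded dialogue history
response = torch.randn(hidden)                 # encoded response (train only)
knowledge = torch.randn(n_candidates, hidden)  # encoded knowledge pool

prior_logits = knowledge @ history                   # history alone
posterior_logits = knowledge @ (history + response)  # also sees the response

# KL(posterior || prior): regularizes the prior toward the posterior.
kl = F.kl_div(
    F.log_softmax(prior_logits, dim=-1),
    F.softmax(posterior_logits, dim=-1),
    reduction="sum",
)
selected = posterior_logits.argmax()  # knowledge passed to the decoder
print(selected.item(), round(kl.item(), 4))
```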