Multi-Sentence Knowledge Selection in Open-Domain Dialogue
- URL: http://arxiv.org/abs/2203.00763v1
- Date: Tue, 1 Mar 2022 22:07:05 GMT
- Title: Multi-Sentence Knowledge Selection in Open-Domain Dialogue
- Authors: Mihail Eric, Nicole Chartier, Behnam Hedayatnia, Karthik
Gopalakrishnan, Pankaj Rajan, Yang Liu, Dilek Hakkani-Tur
- Abstract summary: We evaluate the existing state of open-domain conversation knowledge selection.
We create an augmented dataset based on the Wizard of Wikipedia (WOW) corpus.
WOW++ averages 8 relevant knowledge sentences per dialogue context.
- Score: 11.936691632841388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incorporating external knowledge sources effectively in conversations is a
longstanding problem in open-domain dialogue research. The existing literature
on open-domain knowledge selection is limited and makes certain brittle
assumptions on knowledge sources to simplify the overall task (Dinan et al.,
2019), such as the existence of a single relevant knowledge sentence per
context. In this work, we evaluate the existing state of open-domain
conversation knowledge selection, showing where the existing methodologies
regarding data and evaluation are flawed. We then improve on them by proposing
a new framework for collecting relevant knowledge, and create an augmented
dataset based on the Wizard of Wikipedia (WOW) corpus, which we call WOW++.
WOW++ averages 8 relevant knowledge sentences per dialogue context, embracing
the inherent ambiguity of open-domain dialogue knowledge selection. We then
benchmark various knowledge ranking algorithms on this augmented dataset with
both intrinsic evaluation and extrinsic measures of response quality, showing
that neural rerankers that use WOW++ can outperform rankers trained on standard
datasets.
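The knowledge-ranking task benchmarked here can be illustrated with a minimal sketch. The snippet below scores candidate knowledge sentences against a dialogue context using bag-of-words cosine similarity; it is a toy lexical stand-in for the neural rerankers evaluated in the paper, and all function names are hypothetical.

```python
from collections import Counter
import math

def bow_vector(text):
    """Lowercased bag-of-words term counts for a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_knowledge(context, candidates, top_k=3):
    """Return the top_k knowledge sentences most similar to the context.

    Returning a ranked *set* rather than a single sentence matches the
    WOW++ framing, where a context averages ~8 relevant sentences.
    """
    ctx_vec = bow_vector(context)
    scored = sorted(candidates,
                    key=lambda s: cosine(ctx_vec, bow_vector(s)),
                    reverse=True)
    return scored[:top_k]

context = "I love science fiction movies about space travel"
candidates = [
    "Space travel features heavily in science fiction films.",
    "The stock market closed higher today.",
    "Interstellar is a 2014 science fiction film about space exploration.",
]
top = rank_knowledge(context, candidates, top_k=2)  # ranked subset
```

A real reranker would replace `cosine` over word counts with a trained neural scoring model, but the interface (context in, ranked knowledge subset out) is the same.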
Related papers
- SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge [60.76719375410635]
We propose a new benchmark (SOK-Bench) consisting of 44K questions and 10K situations with instance-level annotations depicted in the videos.
The reasoning process is required to understand and apply situated knowledge and general knowledge for problem-solving.
We generate associated question-answer pairs and reasoning processes, finally followed by manual reviews for quality assurance.
arXiv Detail & Related papers (2024-05-15T21:55:31Z)
- Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts [83.57864140378035]
This paper proposes a method to cover longer contexts in Open-Domain Question-Answering tasks.
It leverages a small encoder language model to encode contexts effectively, applying cross-attention between the encodings and the original inputs.
After fine-tuning, performance improves across two held-in datasets, four held-out datasets, and two in-context learning settings.
arXiv Detail & Related papers (2024-04-02T15:10:11Z)
- A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation [51.31429493814664]
We present a benchmark named multi-source Wizard of Wikipedia for evaluating multi-source dialogue knowledge selection and response generation.
We propose a new challenge, dialogue knowledge plug-and-play, which aims to test an already trained dialogue model on using new support knowledge from previously unseen sources.
arXiv Detail & Related papers (2024-03-06T06:54:02Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- SINC: Service Information Augmented Open-Domain Conversation [46.912064636311825]
We propose a knowledge-driven dialogue system using dynamic service information.
We release the first open domain Chinese service knowledge dialogue dataset DuSinc.
Both automatic and human evaluation show that the proposed method significantly improves open-domain conversation quality.
arXiv Detail & Related papers (2022-06-28T13:41:48Z)
- Enhanced Knowledge Selection for Grounded Dialogues via Document Semantic Graphs [123.50636090341236]
We propose to automatically convert background knowledge documents into document semantic graphs.
Our document semantic graphs preserve sentence-level information through the use of sentence nodes and provide concept connections between sentences.
Our experiments show that our semantic graph-based knowledge selection improves over sentence selection baselines for both the knowledge selection task and the end-to-end response generation task on HollE.
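The idea of sentence nodes linked by shared concepts can be sketched minimally. The snippet below links sentences that share a content word, a crude lexical proxy for the concept connections the paper describes; the stopword list and all names are illustrative assumptions, not the authors' construction.

```python
from collections import defaultdict

STOPWORDS = frozenset({"the", "a", "an", "is", "of", "and", "in"})

def build_semantic_graph(sentences, stopwords=STOPWORDS):
    """Build a toy document graph with one node per sentence.

    Two sentence nodes are connected when they share a content word
    ("concept"). Returns (concept -> sentence indices, edge set).
    """
    concept_to_sents = defaultdict(set)
    for i, sent in enumerate(sentences):
        for tok in sent.lower().split():
            if tok not in stopwords:
                concept_to_sents[tok].add(i)
    edges = set()
    for sents in concept_to_sents.values():
        ordered = sorted(sents)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                edges.add((ordered[i], ordered[j]))
    return dict(concept_to_sents), edges
```

Knowledge selection over such a graph can then walk edges from sentences already grounded in the dialogue, rather than scoring each sentence in isolation.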
arXiv Detail & Related papers (2022-06-15T04:51:32Z)
- Commonsense and Named Entity Aware Knowledge Grounded Dialogue Generation [20.283091595536835]
We present a novel open-domain dialogue generation model which effectively utilizes the large-scale commonsense and named entity based knowledge.
Our proposed model utilizes a multi-hop attention layer to preserve the most accurate and critical parts of the dialogue history and the associated knowledge.
Empirical results on two benchmark datasets demonstrate that our model significantly outperforms the state-of-the-art methods in terms of both automatic evaluation metrics and human judgment.
arXiv Detail & Related papers (2022-05-27T12:11:40Z) - Achieving Conversational Goals with Unsupervised Post-hoc Knowledge
Injection [37.15893335147598]
A limitation of current neural dialog models is that they tend to suffer from a lack of specificity and informativeness in generated responses.
We propose a post-hoc knowledge-injection technique where we first retrieve a diverse set of relevant knowledge snippets conditioned on both the dialog history and an initial response from an existing dialog model.
We construct multiple candidate responses, individually injecting each retrieved snippet into the initial response using a gradient-based decoding method, and then select the final response with an unsupervised ranking step.
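The retrieve, inject, and rank pipeline described above can be sketched as follows. The gradient-based decoding step is replaced with simple concatenation, and the unsupervised ranking is approximated by lexical overlap with the dialogue history, so every function and scoring choice here is an illustrative assumption rather than the authors' method.

```python
def inject_snippet(initial_response, snippet):
    # Simplified stand-in for gradient-based decoding: append the
    # retrieved knowledge snippet to the initial response.
    return f"{initial_response} {snippet}"

def overlap_score(candidate, history):
    # Unsupervised proxy score: fraction of dialogue-history tokens
    # echoed in the candidate response.
    hist = set(history.lower().split())
    cand = set(candidate.lower().split())
    return len(hist & cand) / max(len(hist), 1)

def posthoc_inject(history, initial_response, snippets):
    """Build one candidate per snippet, then pick the best by score."""
    candidates = [inject_snippet(initial_response, s) for s in snippets]
    return max(candidates, key=lambda c: overlap_score(c, history))
```

The structure mirrors the paper's pipeline (one candidate per snippet, then a single unsupervised selection), even though the injection and scoring functions here are deliberately simplistic.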
arXiv Detail & Related papers (2022-03-22T00:42:27Z) - Open Domain Question Answering over Virtual Documents: A Unified
Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means for encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.