Merging Generated and Retrieved Knowledge for Open-Domain QA
- URL: http://arxiv.org/abs/2310.14393v1
- Date: Sun, 22 Oct 2023 19:37:06 GMT
- Title: Merging Generated and Retrieved Knowledge for Open-Domain QA
- Authors: Yunxiang Zhang, Muhammad Khalifa, Lajanugen Logeswaran, Moontae Lee,
Honglak Lee, Lu Wang
- Abstract summary: COMBO is a Compatibility-Oriented knowledge Merging for Better Open-domain QA framework.
We show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks.
- Score: 72.42262579925911
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-domain question answering (QA) systems are often built with retrieval
modules. However, retrieving passages from a given source is known to suffer
from insufficient knowledge coverage. Alternatively, prompting large language
models (LLMs) to generate contextual passages based on their parametric
knowledge has been shown to improve QA performance. Yet, LLMs tend to
"hallucinate" content that conflicts with the retrieved knowledge. Based on the
intuition that answers supported by both sources are more likely to be correct,
we propose COMBO, a Compatibility-Oriented knowledge Merging for Better
Open-domain QA framework, to effectively leverage the two sources of
information. Concretely, we match LLM-generated passages with retrieved
counterparts into compatible pairs, based on discriminators trained with silver
compatibility labels. Then a Fusion-in-Decoder-based reader model handles
passage pairs to arrive at the final answer. Experiments show that COMBO
outperforms competitive baselines on three out of four tested open-domain QA
benchmarks. Further analysis reveals that our proposed framework demonstrates
greater efficacy in scenarios with a higher degree of knowledge conflicts.
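To make the pipeline concrete, below is a minimal Python sketch of the compatibility-pairing step, assuming a toy token-overlap scorer in place of the discriminators trained on silver compatibility labels and leaving the Fusion-in-Decoder reader as a stub; it illustrates the idea rather than reproducing the authors' implementation.

```python
# Minimal sketch of COMBO-style compatibility pairing (not the authors' code).
# Assumptions: the trained discriminator is replaced by a toy token-overlap
# scorer, and the Fusion-in-Decoder reader is omitted.

from itertools import product
from typing import Callable, List, Tuple


def toy_compatibility(generated: str, retrieved: str) -> float:
    """Stand-in for a discriminator trained on silver compatibility labels."""
    g, r = set(generated.lower().split()), set(retrieved.lower().split())
    return len(g & r) / max(len(g | r), 1)


def pair_passages(
    generated: List[str],
    retrieved: List[str],
    score: Callable[[str, str], float] = toy_compatibility,
) -> List[Tuple[str, str, float]]:
    """Greedily match each generated passage with its most compatible retrieved one."""
    # Score every (generated, retrieved) combination, highest first.
    scored = sorted(
        ((score(g, r), gi, ri)
         for (gi, g), (ri, r) in product(enumerate(generated), enumerate(retrieved))),
        reverse=True,
    )
    pairs, matched_g, matched_r = [], set(), set()
    for s, gi, ri in scored:
        if gi in matched_g or ri in matched_r:
            continue
        matched_g.add(gi)
        matched_r.add(ri)
        pairs.append((generated[gi], retrieved[ri], s))
    return pairs


if __name__ == "__main__":
    gen = ["Paris is the capital of France.", "The Eiffel Tower is in Berlin."]
    ret = ["France's capital city is Paris.", "The Eiffel Tower stands in Paris, France."]
    for g, r, s in pair_passages(gen, ret):
        # Each pair would be concatenated with the question and passed to a
        # Fusion-in-Decoder reader (not shown here).
        print(f"score={s:.2f}\n  generated: {g}\n  retrieved: {r}")
```

Greedy one-to-one matching is only one plausible pairing strategy; the paper's actual matching procedure may differ.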
Related papers
- ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering [46.04261413492061]
ZEBRA is a zero-shot question answering framework that combines retrieval, case-based reasoning and introspection.
Given an input question, ZEBRA retrieves relevant question-knowledge pairs from a knowledge base and generates new knowledge by reasoning over the relationships in these pairs.
This generated knowledge is then used to answer the input question, improving the model's performance and interpretability.
arXiv Detail & Related papers (2024-10-07T14:31:43Z)
- Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding [9.2433070542025]
Large language models (LLMs) tend to inadequately integrate input context during text generation.
We introduce a novel approach integrating contrastive decoding with adversarial irrelevant passages as negative samples.
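As a rough illustration of this idea, the sketch below uses a common contrastive-decoding formulation in which logits conditioned on the relevant passage are amplified and logits conditioned on an adversarial irrelevant passage are subtracted; the weighting and names are assumptions, not the paper's exact objective.

```python
# Generic illustration of context-aware contrastive decoding, not the paper's
# exact formulation: push next-token logits toward what the relevant passage
# contributes and away from what an adversarial irrelevant passage contributes.

import numpy as np


def contrastive_logits(logits_with_context: np.ndarray,
                       logits_with_negative: np.ndarray,
                       alpha: float = 0.5) -> np.ndarray:
    """Amplify context-conditioned logits and subtract negative-conditioned ones."""
    return (1 + alpha) * logits_with_context - alpha * logits_with_negative


def softmax(x: np.ndarray) -> np.ndarray:
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()


if __name__ == "__main__":
    vocab = ["paris", "berlin", "london"]
    with_passage = np.array([3.0, 1.0, 0.5])      # conditioned on a relevant passage
    with_adversarial = np.array([1.5, 2.5, 0.5])  # conditioned on an irrelevant passage
    probs = softmax(contrastive_logits(with_passage, with_adversarial))
    print(dict(zip(vocab, probs.round(3))))
```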
arXiv Detail & Related papers (2024-05-04T20:38:41Z)
- REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering [115.72130322143275]
REAR is a RElevance-Aware Retrieval-augmented approach for open-domain question answering (QA).
We develop a novel architecture for LLM-based RAG systems by incorporating a specially designed assessment module.
Experiments on four open-domain QA tasks show that REAR significantly outperforms a number of previous competitive RAG approaches.
arXiv Detail & Related papers (2024-02-27T13:22:51Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering [17.672572064705445]
Large language models (LLMs) equipped with Chain-of-Thought (CoT) have shown impressive reasoning ability in various downstream tasks.
We propose a framework called Knowledge-Driven Chain-of-Thought (KD-CoT) to verify and modify reasoning traces in CoT via interaction with external knowledge.
arXiv Detail & Related papers (2023-08-25T09:23:55Z)
- Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge [82.5582220249183]
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, against tables and passages from Wikipedia.
arXiv Detail & Related papers (2022-10-22T03:21:32Z)
- Multifaceted Improvements for Conversational Open-Domain Question Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
First, the proposed KL-divergence-based regularization leads to better question understanding for retrieval and answer reading.
Second, the added post-ranker module pushes more relevant passages to the top positions so that they can be selected for the reader under a two-aspect constraint.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden-passage settings of training and inference, and encourages the reader to find the true answer without golden-passage assistance.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)
- Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering [54.541952928070344]
We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question.
We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
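As a rough illustration of the retrieval-based flavor of sentence selection, the toy sketch below scores each sentence of a passage by lexical overlap with the question and keeps the top-k; it is a generic stand-in, not one of the models compared in the paper.

```python
# Toy retrieval-based sentence selector: rank a passage's sentences by how many
# terms they share with the question and keep the top-k. Generic illustration only.

from typing import List


def select_sentences(question: str, passage: str, k: int = 2) -> List[str]:
    """Return the k sentences with the most term overlap with the question."""
    q_terms = set(question.lower().split())
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    return sorted(
        sentences,
        key=lambda s: len(q_terms & set(s.lower().split())),
        reverse=True,
    )[:k]


if __name__ == "__main__":
    q = "Where is the Eiffel Tower located?"
    p = ("The Eiffel Tower is located in Paris. It was completed in 1889. "
         "Paris is the capital of France.")
    print(select_sentences(q, p))
```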
arXiv Detail & Related papers (2020-09-18T23:39:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.