Unified Open-Domain Question Answering with Structured and Unstructured Knowledge
- URL: http://arxiv.org/abs/2012.14610v1
- Date: Tue, 29 Dec 2020 05:14:08 GMT
- Title: Unified Open-Domain Question Answering with Structured and Unstructured Knowledge
- Authors: Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Scott Yih
- Abstract summary: We study open-domain question answering (ODQA) with structured, unstructured and semi-structured knowledge sources.
Our approach homogenizes all sources by reducing them to text, and applies recent, powerful retriever-reader models.
As a result, our unified model produces state-of-the-art results on 3 popular ODQA benchmarks.
- Score: 7.7429684536437104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study open-domain question answering (ODQA) with structured, unstructured
and semi-structured knowledge sources, including text, tables, lists, and
knowledge bases. Our approach homogenizes all sources by reducing them to text,
and applies recent, powerful retriever-reader models which have so far been
limited to text sources only. We show that knowledge-base QA can be greatly
improved when reformulated in this way. Contrary to previous work, we find that
combining sources always helps, even for datasets which target a single source
by construction. As a result, our unified model produces state-of-the-art
results on 3 popular ODQA benchmarks.
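The homogenization step the abstract describes can be pictured with a minimal sketch. This is not the authors' code: the verbalization templates and the toy lexical scorer (standing in for the paper's dense retriever-reader) are assumptions for illustration only.

```python
import re

def _tokens(s):
    """Lowercase word tokens with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def verbalize_table_row(title, header, row):
    """Flatten one table row into a plain-text passage."""
    cells = ", ".join(f"{h} is {v}" for h, v in zip(header, row))
    return f"{title}: {cells}."

def verbalize_triple(subj, rel, obj):
    """Flatten one knowledge-base triple into a plain-text passage."""
    return f"{subj} {rel} {obj}."

def retrieve(question, passages, k=2):
    """Toy lexical retriever: rank passages by token overlap with the question."""
    q = _tokens(question)
    return sorted(passages, key=lambda p: len(q & _tokens(p)), reverse=True)[:k]

# One unified text index over free text, a table row, and a KB triple.
passages = [
    "Paris is the capital and most populous city of France.",   # free text
    verbalize_table_row("Countries", ["country", "capital"],
                        ["France", "Paris"]),                   # table row
    verbalize_triple("France", "has capital", "Paris"),         # KB triple
]
print(retrieve("What is the capital of France?", passages))
```

Once every source is rendered as text this way, a single retriever can index all of them at once, which is what lets the unified model combine sources freely.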
Related papers
- Contri(e)ve: Context + Retrieve for Scholarly Question Answering [0.0]
We present a two-step solution for the Scholarly-QALD dataset using an open-source Large Language Model (LLM), Llama3.1.
Firstly, we extract the context pertaining to the question from different structured and unstructured data sources.
Secondly, we implement prompt engineering to improve the information retrieval performance of the LLM.
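A minimal sketch of that two-step flow, assuming hypothetical source accessors and a hypothetical prompt template (none of these names come from the paper):

```python
def context_from_text(question):
    """Step 1a: context from an unstructured source (hypothetical snippet)."""
    return ["DBLP lists publications together with their authors."]

def context_from_knowledge_graph(question):
    """Step 1b: context from a structured source (hypothetical triple)."""
    return ["(paper_P123, author, 'A. Author')"]

def build_prompt(question, snippets):
    """Step 2: prompt engineering -- pack the retrieved context into the prompt."""
    context = "\n".join(f"- {s}" for s in snippets)
    return ("Answer the scholarly question using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}\nAnswer:")

question = "Who authored paper P123?"
snippets = context_from_text(question) + context_from_knowledge_graph(question)
print(build_prompt(question, snippets))  # prompt to send to the LLM
```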
arXiv Detail & Related papers (2024-09-13T17:38:47Z)
- TrustUQA: A Trustful Framework for Unified Structured Data Question Answering [45.480862651323115]
We propose UnifiedTQA, a trustful QA framework that can simultaneously support multiple types of structured data in a unified way.
We have evaluated UnifiedTQA with 5 benchmarks covering 3 types of structured data.
It outperforms 2 existing unified structured-data QA methods and, compared with baselines specific to a single data type, achieves state-of-the-art results on 2 of the benchmarks.
arXiv Detail & Related papers (2024-06-27T06:13:05Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Merging Generated and Retrieved Knowledge for Open-Domain QA [72.42262579925911]
COMBO (Compatibility-Oriented knowledge Merging for Better Open-domain QA) is a framework for merging generated and retrieved knowledge.
We show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks.
arXiv Detail & Related papers (2023-10-22T19:37:06Z)
- Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources [87.26486246513063]
Chain-of-knowledge (CoK) is a framework that augments large language models by grounding them in heterogeneous knowledge sources.
CoK consists of three stages: reasoning preparation, dynamic knowledge adapting, and answer consolidation.
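A schematic of those three stages as a sketch; only the stage names come from the entry above, and every function body here is a hypothetical placeholder.

```python
def reasoning_preparation(question):
    """Stage 1: draft a preliminary rationale and pick relevant knowledge domains."""
    return {"question": question, "rationale": "", "domains": ["wikipedia"]}

def dynamic_knowledge_adapting(state, sources):
    """Stage 2: correct the rationale using evidence from the selected sources."""
    evidence = [sources[d](state["question"]) for d in state["domains"]]
    state["rationale"] = " ".join(evidence)
    return state

def answer_consolidation(state):
    """Stage 3: derive the final answer from the corrected rationale."""
    return state["rationale"].rsplit(" ", 1)[-1] if state["rationale"] else None

# Hypothetical knowledge source: one callable per domain.
sources = {"wikipedia": lambda q: "The capital of France is Paris"}
state = reasoning_preparation("What is the capital of France?")
print(answer_consolidation(dynamic_knowledge_adapting(state, sources)))  # Paris
```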
arXiv Detail & Related papers (2023-05-22T17:34:23Z)
- Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge [82.5582220249183]
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, using tables and passages from Wikipedia as knowledge sources.
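One way to picture the intermediary's chain of reasoning is the sketch below, with a toy lexical retriever standing in for the real one; the query-expansion scheme and fixed hop count are illustrative assumptions.

```python
import re

def _tokens(s):
    """Lowercase word tokens with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def retrieve_once(query, candidates):
    """Toy single-hop retriever: pick the best lexical overlap with the query."""
    q = _tokens(query)
    return max(candidates, key=lambda p: len(q & _tokens(p)))

def chain_of_reasoning(question, corpus, max_hops=2):
    """Gather evidence hop by hop, conditioning each retrieval on the
    evidence collected so far rather than retrieving in isolation."""
    chain, query = [], question
    for _ in range(max_hops):
        remaining = [p for p in corpus if p not in chain]
        hit = retrieve_once(query, remaining)
        chain.append(hit)
        query = question + " " + " ".join(chain)  # expand query with evidence
    return chain

corpus = [
    "FC Barcelona was founded in 1899.",          # bridge evidence (hop 1)
    "Camp Nou is the stadium of FC Barcelona.",   # answer evidence (hop 2)
    "Paris hosted the 1900 Summer Olympics.",     # distractor
]
print(chain_of_reasoning(
    "Which stadium belongs to the club founded in 1899?", corpus))
```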
arXiv Detail & Related papers (2022-10-22T03:21:32Z)
- Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means of encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)