Unified Open-Domain Question Answering with Structured and Unstructured Knowledge
- URL: http://arxiv.org/abs/2012.14610v1
- Date: Tue, 29 Dec 2020 05:14:08 GMT
- Title: Unified Open-Domain Question Answering with Structured and Unstructured Knowledge
- Authors: Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Scott Yih
- Abstract summary: We study open-domain question answering (ODQA) with structured, unstructured and semi-structured knowledge sources.
Our approach homogenizes all sources by reducing them to text, and applies recent, powerful retriever-reader models.
As a result, our unified model produces state-of-the-art results on 3 popular ODQA benchmarks.
- Score: 7.7429684536437104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study open-domain question answering (ODQA) with structured, unstructured
and semi-structured knowledge sources, including text, tables, lists, and
knowledge bases. Our approach homogenizes all sources by reducing them to text,
and applies recent, powerful retriever-reader models which have so far been
limited to text sources only. We show that knowledge-base QA can be greatly
improved when reformulated in this way. Contrary to previous work, we find that
combining sources always helps, even for datasets which target a single source
by construction. As a result, our unified model produces state-of-the-art
results on 3 popular ODQA benchmarks.
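The homogenization step the abstract describes can be pictured with a minimal sketch. This is not the authors' code: the verbalization templates and the toy lexical scorer (standing in for the paper's dense retriever-reader) are assumptions for illustration only.

```python
import re

def _tokens(s):
    """Lowercase word tokens with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def verbalize_table_row(title, header, row):
    """Flatten one table row into a plain-text passage."""
    cells = ", ".join(f"{h} is {v}" for h, v in zip(header, row))
    return f"{title}: {cells}."

def verbalize_triple(subj, rel, obj):
    """Flatten one knowledge-base triple into a plain-text passage."""
    return f"{subj} {rel} {obj}."

def retrieve(question, passages, k=2):
    """Toy lexical retriever: rank passages by token overlap with the question."""
    q = _tokens(question)
    return sorted(passages, key=lambda p: len(q & _tokens(p)), reverse=True)[:k]

# One unified text index over free text, a table row, and a KB triple.
passages = [
    "Paris is the capital and most populous city of France.",   # free text
    verbalize_table_row("Countries", ["country", "capital"],
                        ["France", "Paris"]),                   # table row
    verbalize_triple("France", "has capital", "Paris"),         # KB triple
]
print(retrieve("What is the capital of France?", passages))
```

Once every source is rendered as text this way, a single retriever can index all of them at once, which is what lets the unified model combine sources freely.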
Related papers
- Contri(e)ve: Context + Retrieve for Scholarly Question Answering [0.0]
We present a two-step solution for the Scholarly-QALD dataset using an open-source Large Language Model (LLM), Llama3.1.
Firstly, we extract the context pertaining to the question from different structured and unstructured data sources.
Secondly, we implement prompt engineering to improve the information retrieval performance of the LLM.
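A minimal sketch of that two-step flow, assuming hypothetical source accessors and a hypothetical prompt template (none of these names come from the paper):

```python
def context_from_text(question):
    """Step 1a: context from an unstructured source (hypothetical snippet)."""
    return ["DBLP lists publications together with their authors."]

def context_from_knowledge_graph(question):
    """Step 1b: context from a structured source (hypothetical triple)."""
    return ["(paper_P123, author, 'A. Author')"]

def build_prompt(question, snippets):
    """Step 2: prompt engineering -- pack the retrieved context into the prompt."""
    context = "\n".join(f"- {s}" for s in snippets)
    return ("Answer the scholarly question using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}\nAnswer:")

question = "Who authored paper P123?"
snippets = context_from_text(question) + context_from_knowledge_graph(question)
print(build_prompt(question, snippets))  # prompt to send to the LLM
```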
arXiv Detail & Related papers (2024-09-13T17:38:47Z)
- TrustUQA: A Trustful Framework for Unified Structured Data Question Answering [45.480862651323115]
We propose UnifiedTQA, a trustful QA framework that can simultaneously support multiple types of structured data in a unified way.
We have evaluated UnifiedTQA with 5 benchmarks covering 3 types of structured data.
It outperforms 2 existing unified structured-data QA methods and, compared with baselines specific to a single data type, achieves state-of-the-art results on 2 of the benchmarks.
arXiv Detail & Related papers (2024-06-27T06:13:05Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Merging Generated and Retrieved Knowledge for Open-Domain QA [72.42262579925911]
COMBO (Compatibility-Oriented knowledge Merging for Better Open-domain QA) is a framework for merging generated and retrieved knowledge.
We show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks.
arXiv Detail & Related papers (2023-10-22T19:37:06Z)
- Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources [87.26486246513063]
Chain-of-knowledge (CoK) is a framework that augments large language models by grounding them in heterogeneous knowledge sources.
CoK consists of three stages: reasoning preparation, dynamic knowledge adapting, and answer consolidation.
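A schematic of those three stages as a sketch; only the stage names come from the entry above, and every function body here is a hypothetical placeholder.

```python
def reasoning_preparation(question):
    """Stage 1: draft a preliminary rationale and pick relevant knowledge domains."""
    return {"question": question, "rationale": "", "domains": ["wikipedia"]}

def dynamic_knowledge_adapting(state, sources):
    """Stage 2: correct the rationale using evidence from the selected sources."""
    evidence = [sources[d](state["question"]) for d in state["domains"]]
    state["rationale"] = " ".join(evidence)
    return state

def answer_consolidation(state):
    """Stage 3: derive the final answer from the corrected rationale."""
    return state["rationale"].rsplit(" ", 1)[-1] if state["rationale"] else None

# Hypothetical knowledge source: one callable per domain.
sources = {"wikipedia": lambda q: "The capital of France is Paris"}
state = reasoning_preparation("What is the capital of France?")
print(answer_consolidation(dynamic_knowledge_adapting(state, sources)))  # Paris
```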
arXiv Detail & Related papers (2023-05-22T17:34:23Z)
- Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge [82.5582220249183]
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, using tables and passages from Wikipedia as knowledge sources.
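One way to picture the intermediary's chain of reasoning is the sketch below, with a toy lexical retriever standing in for the real one; the query-expansion scheme and fixed hop count are illustrative assumptions.

```python
import re

def _tokens(s):
    """Lowercase word tokens with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def retrieve_once(query, candidates):
    """Toy single-hop retriever: pick the best lexical overlap with the query."""
    q = _tokens(query)
    return max(candidates, key=lambda p: len(q & _tokens(p)))

def chain_of_reasoning(question, corpus, max_hops=2):
    """Gather evidence hop by hop, conditioning each retrieval on the
    evidence collected so far rather than retrieving in isolation."""
    chain, query = [], question
    for _ in range(max_hops):
        remaining = [p for p in corpus if p not in chain]
        hit = retrieve_once(query, remaining)
        chain.append(hit)
        query = question + " " + " ".join(chain)  # expand query with evidence
    return chain

corpus = [
    "FC Barcelona was founded in 1899.",          # bridge evidence (hop 1)
    "Camp Nou is the stadium of FC Barcelona.",   # answer evidence (hop 2)
    "Paris hosted the 1900 Summer Olympics.",     # distractor
]
print(chain_of_reasoning(
    "Which stadium belongs to the club founded in 1899?", corpus))
```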
arXiv Detail & Related papers (2022-10-22T03:21:32Z)
- Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means of encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)