HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions
- URL: http://arxiv.org/abs/2012.15534v1
- Date: Thu, 31 Dec 2020 10:36:01 GMT
- Title: HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions
- Authors: Shaobo Li, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Chengjie
Sun, Zhenzhou Ji, Bingquan Liu
- Abstract summary: We build HopRetriever which retrieves hops over Wikipedia to answer complex questions.
Our approach also yields quantifiable interpretations of the evidence collection process.
- Score: 38.89150764309989
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collecting supporting evidence from large corpora of text (e.g., Wikipedia)
poses a great challenge for open-domain Question Answering (QA). In particular,
multi-hop open-domain QA requires scattered evidence pieces to be gathered
together to support answer extraction. In this paper, we propose a new
retrieval target, hop, to collect the hidden reasoning evidence from Wikipedia
for complex question answering. Specifically, the hop in this paper is defined
as the combination of a hyperlink and the corresponding outbound link document.
The hyperlink is encoded as the mention embedding which models the structured
knowledge of how the outbound link entity is mentioned in the textual context,
and the corresponding outbound link document is encoded as the document
embedding representing the unstructured knowledge within it. Accordingly, we
build HopRetriever which retrieves hops over Wikipedia to answer complex
questions. Experiments on the HotpotQA dataset demonstrate that HopRetriever
outperforms previously published evidence retrieval methods by large margins.
Moreover, our approach also yields quantifiable interpretations of the evidence
collection process.
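The abstract describes a hop as the pairing of a hyperlink (encoded as a mention embedding) with its outbound link document (encoded as a document embedding). The paper's actual encoders and scoring function are not given here; the sketch below is a toy illustration only, assuming concatenation as the fusion step and a dot product as the question-hop score (both assumptions, not details from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding size

def encode_hop(mention_emb, doc_emb):
    """Fuse structured knowledge (how the entity is mentioned in context)
    with unstructured knowledge (the outbound document's content).
    Concatenation is a stand-in for the paper's fusion mechanism."""
    return np.concatenate([mention_emb, doc_emb])

def score_hops(question_emb, hops):
    """Score each candidate hop against the question by dot product."""
    return [float(question_emb @ h) for h in hops]

# Toy question and two candidate hops (hyperlink + outbound document).
question = rng.normal(size=2 * DIM)
hop_a = encode_hop(rng.normal(size=DIM), rng.normal(size=DIM))
hop_b = encode_hop(rng.normal(size=DIM), rng.normal(size=DIM))

scores = score_hops(question, [hop_a, hop_b])
best = int(np.argmax(scores))  # index of the hop to follow next
```

Retrieval then proceeds by following the best-scoring hop and repeating with the updated evidence, which is what makes the collected chain of hops interpretable.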
Related papers
- Question-to-Question Retrieval for Hallucination-Free Knowledge Access: An Approach for Wikipedia and Wikidata Question Answering [0.0]
This paper introduces an approach to question answering over knowledge bases like Wikipedia and Wikidata.
We generate a comprehensive set of questions for each logical content unit using an instruction-tuned LLM.
We demonstrate its effectiveness on Wikipedia and Wikidata, including multimedia content through structured fact retrieval from Wikidata.
arXiv Detail & Related papers (2025-01-20T07:05:15Z) - HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs [9.559336828884808]
Large Language Models (LLMs) are adept at answering simple (single-hop) questions.
As the complexity of the questions increases, the performance of LLMs degrades.
Recent methods try to reduce this burden by integrating structured knowledge triples into the raw text.
We propose to use a knowledge graph (KG) that is context-aware and is distilled to contain query-relevant information.
arXiv Detail & Related papers (2024-06-10T05:22:49Z) - Decomposing Complex Queries for Tip-of-the-tongue Retrieval [72.07449449115167]
Complex queries describe content elements (e.g., book characters or events), information that goes beyond the document text.
This retrieval setting, called tip of the tongue (TOT), is especially challenging for models reliant on lexical and semantic overlap between query and document text.
We introduce a simple yet effective framework for handling such complex queries by decomposing the query into individual clues, routing those as sub-queries to specialized retrievers, and ensembling the results.
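The decompose-route-ensemble framework above can be sketched in a few lines. The router, the per-clue retrievers, and the ensembling rule below are all illustrative assumptions (the summary does not specify them); scores are stubbed with fixed tables, and the ensemble simply sums per-clue scores so documents matching several clues rank first.

```python
from collections import defaultdict

def route(clue):
    """Hypothetical router: send each clue type to a specialized retriever.
    Each retriever returns {doc_id: score}; stubbed here with fixed tables."""
    character_index = {"doc1": 0.9, "doc2": 0.4}
    event_index = {"doc2": 0.8, "doc3": 0.5}
    return character_index if clue["type"] == "character" else event_index

def ensemble(clues):
    """Sum per-clue scores across sub-query results and rank documents."""
    totals = defaultdict(float)
    for clue in clues:
        for doc_id, score in route(clue).items():
            totals[doc_id] += score
    return sorted(totals.items(), key=lambda kv: -kv[1])

# A tip-of-the-tongue query decomposed into two individual clues.
clues = [{"type": "character", "text": "a detective with a parrot"},
         {"type": "event", "text": "a fire at the opera"}]
ranking = ensemble(clues)
# doc2 matches both clues (0.4 + 0.8 = 1.2) and ranks first
```

Summing scores is only one possible ensembling rule; weighted combinations or rank fusion would fit the same framework.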
arXiv Detail & Related papers (2023-05-24T11:43:40Z) - Open-domain Question Answering via Chain of Reasoning over Heterogeneous
Knowledge [82.5582220249183]
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, over tables and passages from Wikipedia.
arXiv Detail & Related papers (2022-10-22T03:21:32Z) - Detect, Retrieve, Comprehend: A Flexible Framework for Zero-Shot
Document-Level Question Answering [6.224211330728391]
Researchers produce thousands of scholarly documents containing valuable technical knowledge.
Document-level question answering (QA) offers a flexible framework where human-posed questions can be adapted to extract diverse knowledge.
We present a three-stage document QA approach: text extraction from PDF; evidence retrieval from extracted texts to form well-posed contexts; and QA to extract knowledge from contexts to return high-quality answers.
arXiv Detail & Related papers (2022-10-04T23:33:52Z) - End-to-End Multihop Retrieval for Compositional Question Answering over
Long Documents [93.55268936974971]
We propose a multi-hop retrieval method, DocHopper, to answer compositional questions over long documents.
At each step, DocHopper retrieves a paragraph or sentence embedding from the document, mixes the retrieved result with the query, and updates the query for the next step.
We demonstrate that utilizing document structure in this way can largely improve question-answering and retrieval performance on long documents.
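The retrieve-mix-update loop described for DocHopper can be sketched as follows. How the retrieved result is mixed into the query is not specified in the summary; simple averaging is used here as a placeholder assumption, and the embeddings are random toy vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8  # toy embedding size

def hop(query, candidates, steps=3):
    """At each step: retrieve the best-scoring candidate embedding,
    mix it into the query (averaging is an assumption), and repeat."""
    retrieved = []
    for _ in range(steps):
        scores = candidates @ query          # score all paragraphs/sentences
        idx = int(np.argmax(scores))
        retrieved.append(idx)
        query = (query + candidates[idx]) / 2.0  # update query for next step
    return retrieved, query

candidates = rng.normal(size=(5, DIM))  # paragraph/sentence embeddings
query = rng.normal(size=DIM)            # initial question embedding
path, final_query = hop(query, candidates)
```

Because the whole loop operates on embeddings, it can be trained end-to-end without discrete retrieval decisions at intermediate steps, which is the point the paper's title emphasizes.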
arXiv Detail & Related papers (2021-06-01T03:13:35Z) - Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval [117.07047313964773]
We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions.
Our method does not require access to any corpus-specific information, such as inter-document hyperlinks or human-annotated entity markers.
Our system also yields a much better efficiency-accuracy trade-off, matching the best published accuracy on HotpotQA while being 10 times faster at inference time.
arXiv Detail & Related papers (2020-09-27T06:12:29Z) - Answering Any-hop Open-domain Questions with Iterative Document
Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.