Answering Open-Domain Questions of Varying Reasoning Steps from Text
- URL: http://arxiv.org/abs/2010.12527v4
- Date: Fri, 29 Oct 2021 15:12:46 GMT
- Title: Answering Open-Domain Questions of Varying Reasoning Steps from Text
- Authors: Peng Qi, Haejun Lee, Oghenetegiri "TG" Sido, Christopher D. Manning
- Abstract summary: We develop a unified system that answers open-domain questions directly from text.
We employ a single multi-task transformer model to perform all the necessary subtasks.
We show that our model demonstrates competitive performance on both existing benchmarks and the new BeerQA benchmark.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a unified system to answer open-domain questions directly
from text, where questions may require a varying number of retrieval steps. We employ a
single multi-task transformer model to perform all the necessary subtasks --
retrieving supporting facts, reranking them, and predicting the answer from all
retrieved documents -- in an iterative fashion. We avoid crucial assumptions of
previous work that do not transfer well to real-world settings, including
exploiting knowledge of the fixed number of retrieval steps required to answer
each question or using structured metadata like knowledge bases or web links
that have limited availability. Instead, we design a system that can answer
open-domain questions on any text collection without prior knowledge of
reasoning complexity. To emulate this setting, we construct a new benchmark,
called BeerQA, by combining existing one- and two-step datasets with a new
collection of 530 questions that require three Wikipedia pages to answer,
unifying Wikipedia corpora versions in the process. We show that our model
demonstrates competitive performance on both existing benchmarks and this new
benchmark. We make the new benchmark available at https://beerqa.github.io/.
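
To make the approach concrete, below is a minimal Python sketch of the iterative retrieve-rerank-read loop the abstract describes. The function names (`retrieve`, `rerank`, `read`) and the stopping logic are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of an iterative open-domain QA loop that does not
# assume a fixed number of retrieval steps (all names are illustrative).

def iterative_qa(question, retrieve, rerank, read, max_steps=5):
    """Retrieve, rerank, and read until the reader commits to an answer.

    retrieve(query, context) -> list of candidate passages
    rerank(question, passages) -> passages sorted by estimated relevance
    read(question, passages)  -> (answer or None, follow-up query)
    """
    context = []                # supporting facts gathered so far
    query = question            # the first hop searches with the question itself
    for _ in range(max_steps):  # safety cap; the model decides when to stop
        candidates = retrieve(query, context)
        context.extend(rerank(question, candidates)[:2])  # keep top passages
        answer, query = read(question, context)
        if answer is not None:  # the reader found sufficient evidence
            return answer, context
    return None, context        # unanswerable within the step budget
```

The point mirrored here is that the loop length is not fixed in advance: the same model retrieves, reranks, and reads at every step, and decides on its own when the gathered evidence suffices.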
Related papers
- Hierarchical Retrieval-Augmented Generation Model with Rethink for Multi-hop Question Answering (arXiv, 2024-08-20)
Multi-hop Question Answering (QA) necessitates complex reasoning by integrating multiple pieces of information to resolve intricate questions.
Existing QA systems encounter challenges such as outdated information, context window length limitations, and an accuracy-quantity trade-off.
We propose a novel framework, the Hierarchical Retrieval-Augmented Generation Model with Rethink (HiRAG), comprising five key modules: Decomposer, Definer, Retriever, Filter, and Summarizer.
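As a rough sketch only, the five named modules could compose as follows; every interface here is an assumption, since the summary does not specify them.

```python
# Illustrative composition of HiRAG's five named modules; all interfaces
# are assumed for this sketch, not taken from the paper.

def hirag_answer(question, decompose, define, retrieve, filter_, summarize):
    sub_questions = decompose(question)       # Decomposer: split the multi-hop question
    evidence = []
    for sub_q in sub_questions:
        refined = define(sub_q, evidence)     # Definer: ground the sub-question in context
        passages = retrieve(refined)          # Retriever: fetch candidate passages
        evidence.extend(filter_(refined, passages))  # Filter: drop irrelevant hits
    return summarize(question, evidence)      # Summarizer: compose the final answer
```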
- QUADRo: Dataset and Models for QUestion-Answer Database Retrieval (arXiv, 2023-03-30)
Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions.
We build a large scale DB of 6.3M q/a pairs, using public questions, and design a new system based on neural IR and a q/a pair reranker.
We show that our DB-based approach is competitive with Web-based methods, i.e., a QA system built on top of the Bing search engine.
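A minimal sketch of the database-scanning idea, assuming a cheap embedding-based first stage followed by a stronger q/a pair reranker; `embed` and `pair_score` are hypothetical callables.

```python
# Sketch of answering by searching a question/answer database, in the
# spirit of QUADRo; the scoring functions are stand-ins.

def db_answer(target_q, qa_db, embed, pair_score, top_k=10):
    """qa_db: list of (question, answer) pairs.
    embed(text) -> vector; pair_score(target_q, q, a) -> float (reranker)."""
    tq = embed(target_q)

    def sim(q):  # first stage: cheap cosine similarity over stored questions
        qv = embed(q)
        dot = sum(a * b for a, b in zip(tq, qv))
        norm = (sum(a * a for a in tq) * sum(b * b for b in qv)) ** 0.5
        return dot / norm if norm else 0.0

    shortlist = sorted(qa_db, key=lambda qa: sim(qa[0]), reverse=True)[:top_k]
    # Second stage: a stronger q/a pair reranker picks the final answer.
    best_q, best_a = max(shortlist, key=lambda qa: pair_score(target_q, *qa))
    return best_a
```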
- Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge (arXiv, 2022-10-22)
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, using tables and passages from Wikipedia as evidence.
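A hedged sketch of what a chain of reasoning over heterogeneous sources might look like, with an intermediary that proposes the next hop; all interfaces are assumed for illustration.

```python
# Assumed-interface sketch of a chain of reasoning over heterogeneous
# evidence (e.g., tables and text passages), loosely following the summary.

def chain_reason(question, retrievers, link, answer, max_hops=3):
    """retrievers: dict source_name -> retrieve(query) returning evidence items.
    link(question, chain) -> next query or None; answer(question, chain) -> str."""
    chain = []            # evidence gathered so far, in order
    query = question
    for _ in range(max_hops):
        # Gather from every source rather than one retriever in isolation.
        hits = [item for r in retrievers.values() for item in r(query)]
        chain.extend(hits[:3])
        query = link(question, chain)  # the intermediary proposes the next hop
        if query is None:
            break
    return answer(question, chain)
```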
- Generate rather than Retrieve: Large Language Models are Strong Context Generators (arXiv, 2022-09-21)
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead): it first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
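The generate-then-read recipe is simple enough to sketch directly; `llm` stands for any text-in/text-out model call, and the prompts are illustrative, not the paper's.

```python
# Sketch of generate-then-read (GenRead): an LLM generates context
# documents, then answers from them. `llm` is any text-in/text-out callable.

def generate_then_read(question, llm, num_docs=3):
    docs = [
        llm(f"Generate a background document to answer: {question}")
        for _ in range(num_docs)   # sample several generated contexts
    ]
    context = "\n\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```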
- Adaptive Information Seeking for Open-Domain Question Answering (arXiv, 2021-09-14)
We propose a novel adaptive information-seeking strategy for open-domain question answering, namely AISO.
According to the learned policy, AISO adaptively selects a proper retrieval action to seek missing evidence at each step.
AISO outperforms all baseline methods with predefined strategies in terms of both retrieval and answer evaluations.
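A minimal sketch of the adaptive loop, assuming a small set of named retrieval actions and a policy that can also choose to stop and answer; all names are hypothetical.

```python
# Sketch of an adaptive information-seeking loop in the spirit of AISO:
# a learned policy picks which retrieval action to take at each step.
# The action set and the policy interface are assumptions.

def aiso_answer(question, policy, actions, read, max_steps=6):
    """actions: dict name -> retrieve(query, state) (e.g., sparse, dense, link).
    policy(question, state) -> (action_name, query) or ("ANSWER", None)."""
    state = []                         # evidence collected so far
    for _ in range(max_steps):
        action_name, query = policy(question, state)
        if action_name == "ANSWER":    # the policy decides evidence suffices
            break
        state.extend(actions[action_name](query, state))
    return read(question, state)
```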
- MK-SQuIT: Synthesizing Questions using Iterative Template-filling (arXiv, 2020-11-04)
We create a framework for synthetically generating question/query pairs with as little human input as possible.
These datasets can be used to train machine translation systems to convert natural language questions into queries.
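A toy sketch of template filling for question/query pairs; the templates, slot values, and SPARQL shapes below are invented for illustration only.

```python
# Sketch of iterative template-filling for synthetic question/query pairs,
# in the spirit of MK-SQuIT; templates and slot fillers are toy examples.

import itertools

def fill_templates(templates, slots):
    """templates: list of (question_template, query_template) with {e}/{p} slots.
    slots: dict slot_name -> list of (surface_form, formal_symbol)."""
    pairs = []
    for q_tmpl, sparql_tmpl in templates:
        for (e_text, e_sym), (p_text, p_sym) in itertools.product(
                slots["entity"], slots["property"]):
            pairs.append((
                q_tmpl.format(e=e_text, p=p_text),     # natural-language question
                sparql_tmpl.format(e=e_sym, p=p_sym),  # matching formal query
            ))
    return pairs

# Toy usage:
templates = [("What is the {p} of {e}?", "SELECT ?x WHERE {{ {e} {p} ?x }}")]
slots = {"entity": [("the Eiffel Tower", "wd:Q243")],
         "property": [("height", "wdt:P2048")]}
print(fill_templates(templates, slots))
```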
- Answering Any-hop Open-domain Questions with Iterative Document Reranking (arXiv, 2020-09-16)
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
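A sketch of the any-hop loop with the distinguishing step made explicit: the accumulated document set is reranked as a whole after every retrieval hop. The interfaces are assumptions, not the paper's code.

```python
# Sketch of any-hop QA with iterative document reranking: after each hop,
# old and newly retrieved documents are re-scored together and pruned.

def anyhop_qa(question, retrieve, rerank, read, keep=5, max_hops=4):
    docs, answer = [], None
    query = question
    for _ in range(max_hops):
        docs.extend(retrieve(query))
        docs = rerank(question, docs)[:keep]   # rerank the full accumulated set
        answer, query, done = read(question, docs)
        if done:                               # the reader decides the hop count
            break
    return answer
```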
- Revisiting the Open-Domain Question Answering Pipeline (arXiv, 2020-09-02)
This paper describes Mindstone, an open-domain QA system that consists of a new multi-stage pipeline.
We show how the new pipeline enables the use of low-resolution labels, and can be easily tuned to meet various timing requirements.
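A hedged sketch of a multi-stage pipeline in which per-stage candidate budgets make the latency/accuracy trade-off tunable; the stage interfaces are assumptions, not Mindstone's actual API.

```python
# Sketch of a multi-stage QA pipeline: cheap early stages prune candidates
# so later, slower stages see fewer inputs. Stage details are assumed.

def staged_pipeline(question, retriever, rankers, budgets):
    """retriever(question) -> initial candidates;
    rankers: list of rank(question, candidates) -> sorted candidates,
    ordered cheapest to most expensive; budgets: cut-off per stage."""
    candidates = retriever(question)
    for rank, k in zip(rankers, budgets):
        candidates = rank(question, candidates)[:k]  # prune before the slower stage
    return candidates[0] if candidates else None
```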