Revisiting the Open-Domain Question Answering Pipeline
- URL: http://arxiv.org/abs/2009.00914v1
- Date: Wed, 2 Sep 2020 09:34:14 GMT
- Title: Revisiting the Open-Domain Question Answering Pipeline
- Authors: Sina J. Semnani, Manish Pandey
- Abstract summary: This paper describes Mindstone, an open-domain QA system that consists of a new multi-stage pipeline.
We show how the new pipeline enables the use of low-resolution labels, and can be easily tuned to meet various timing requirements.
- Score: 0.23204178451683266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-domain question answering (QA) is the task of identifying answers to
natural questions from a large corpus of documents. The typical open-domain QA
system starts with information retrieval to select a subset of documents from
the corpus, which are then processed by a machine reader to select the answer
spans. This paper describes Mindstone, an open-domain QA system that consists
of a new multi-stage pipeline that employs a traditional BM25-based information
retriever, RM3-based neural relevance feedback, neural ranker, and a machine
reading comprehension stage. This paper establishes a new baseline for
end-to-end performance on question answering for Wikipedia/SQuAD dataset
(EM=58.1, F1=65.8), with substantial gains over the previous state of the art
(Yang et al., 2019b). We also show how the new pipeline enables the use of
low-resolution labels, and can be easily tuned to meet various timing
requirements.
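The retrieve-then-read pattern described in the abstract can be sketched in a few lines. This is a toy illustration with a hand-rolled BM25 scorer and an invented corpus and query; it is not the paper's Mindstone pipeline, which additionally uses RM3-based relevance feedback, a neural ranker, and a neural machine-reading stage.

```python
# Toy retrieve-then-read sketch: BM25 selects candidate documents;
# a reader stage (omitted here) would then extract answer spans.
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency of each query term.
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores

corpus = [
    "the capital of france is paris".split(),
    "bm25 is a ranking function used by search engines".split(),
    "squad is a reading comprehension dataset".split(),
]
query = "what is the capital of france".split()
scores = bm25_scores(query, corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best)  # the France document ranks first
```

In a full pipeline the top-scoring documents would be passed to a machine reading comprehension model that extracts the answer span.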
Related papers
- Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge [82.5582220249183]
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
Unlike previous methods that solely rely on the retriever for gathering all evidence in isolation, our intermediary performs a chain of reasoning over the retrieved set.
Our system achieves competitive performance on two ODQA datasets, OTT-QA and NQ, against tables and passages from Wikipedia.
arXiv Detail & Related papers (2022-10-22T03:21:32Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- Zero-Shot Open-Book Question Answering [0.0]
This article proposes a solution for answering natural language questions from technical documents with no domain-specific labeled data (zero-shot).
We introduce a new test dataset for open-book QA based on real customer questions on AWS technical documentation.
We achieve 49% F1 and 39% exact match (EM) end-to-end with no domain-specific training.
arXiv Detail & Related papers (2021-11-22T20:38:41Z)
- ComQA: Compositional Question Answering via Hierarchical Graph Neural Networks [47.12013005600986]
We present a large-scale compositional question answering dataset containing more than 120k human-labeled questions.
To tackle the ComQA problem, we propose a hierarchical graph neural network, which represents the document from the low-level word to the high-level sentence.
Our proposed model achieves a significant improvement over previous machine reading comprehension methods and pre-training methods.
arXiv Detail & Related papers (2021-01-16T08:23:27Z)
- Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering [62.88322725956294]
We review the latest research trends in OpenQA, with particular attention to systems that incorporate neural MRC techniques.
We introduce the modern OpenQA architecture named "Retriever-Reader" and analyze the various systems that follow this architecture.
We then discuss key challenges to developing OpenQA systems and offer an analysis of benchmarks that are commonly used.
arXiv Detail & Related papers (2021-01-04T04:47:46Z)
- Answering Open-Domain Questions of Varying Reasoning Steps from Text [39.48011017748654]
We develop a unified system to answer open-domain questions directly from text.
We employ a single multi-task transformer model to perform all the necessary subtasks.
We show that our model demonstrates competitive performance on both existing benchmarks and this new benchmark.
arXiv Detail & Related papers (2020-10-23T16:51:09Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- Answering Any-hop Open-domain Questions with Iterative Document Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)
- Knowledge-Aided Open-Domain Question Answering [58.712857964048446]
We propose a knowledge-aided open-domain QA (KAQA) method which targets at improving relevant document retrieval and answer reranking.
During document retrieval, a candidate document is scored by considering its relationship to the question and other documents.
During answer reranking, a candidate answer is reranked using not only its own context but also the clues from other documents.
arXiv Detail & Related papers (2020-06-09T13:28:57Z)
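The KAQA-style scoring idea above can be illustrated with a minimal sketch: each retrieved document is scored by mixing its relevance to the question with the support it receives from the other retrieved documents. Jaccard word overlap and the 0.7/0.3 weighting are illustrative stand-ins, not the actual model from the paper.

```python
# Toy reranker: a document's score combines question relevance with
# agreement (average similarity) across the other retrieved documents.
def jaccard(a, b):
    """Jaccard similarity between two token lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def rerank(question, docs, alpha=0.7):
    """Return document indices ordered by mixed question/support score."""
    scored = []
    for i, d in enumerate(docs):
        q_rel = jaccard(question, d)
        others = [jaccard(d, o) for j, o in enumerate(docs) if j != i]
        support = sum(others) / len(others) if others else 0.0
        scored.append((alpha * q_rel + (1 - alpha) * support, i))
    return [i for _, i in sorted(scored, reverse=True)]

question = "who wrote hamlet".split()
docs = [
    "hamlet was written by william shakespeare".split(),
    "shakespeare wrote hamlet and macbeth".split(),
    "paris is the capital of france".split(),
]
order = rerank(question, docs)
print(order)  # [1, 0, 2]: the off-topic document drops to last
```

The same mixing idea applies at the answer-reranking stage, where a candidate answer is scored using clues from other documents rather than its own context alone.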
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.