QAMPARI: An Open-domain Question Answering Benchmark for Questions with
Many Answers from Multiple Paragraphs
- URL: http://arxiv.org/abs/2205.12665v4
- Date: Mon, 29 May 2023 06:16:41 GMT
- Authors: Samuel Joseph Amouyal, Tomer Wolfson, Ohad Rubin, Ori Yoran, Jonathan
Herzig, Jonathan Berant
- Abstract summary: We introduce QAMPARI, an ODQA benchmark where answers are lists of entities spread across many paragraphs.
We create QAMPARI by (a) generating questions with multiple answers from Wikipedia's knowledge graph and tables, (b) automatically pairing answers with supporting evidence in Wikipedia paragraphs, and (c) manually paraphrasing questions and validating each answer.
We train ODQA models from the retrieve-and-read family and find that QAMPARI is challenging in terms of both passage retrieval and answer generation, reaching an F1 score of 32.8 at best.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing benchmarks for open-domain question answering (ODQA) typically focus
on questions whose answers can be extracted from a single paragraph. By
contrast, many natural questions, such as "What players were drafted by the
Brooklyn Nets?" have a list of answers. Answering such questions requires
retrieving and reading from many passages, in a large corpus. We introduce
QAMPARI, an ODQA benchmark, where question answers are lists of entities,
spread across many paragraphs. We created QAMPARI by (a) generating questions
with multiple answers from Wikipedia's knowledge graph and tables, (b)
automatically pairing answers with supporting evidence in Wikipedia paragraphs,
and (c) manually paraphrasing questions and validating each answer. We train
ODQA models from the retrieve-and-read family and find that QAMPARI is
challenging in terms of both passage retrieval and answer generation, reaching
an F1 score of 32.8 at best. Our results highlight the need for developing ODQA
models that handle a broad range of question types, including single and
multi-answer questions.
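Step (a) of the construction pipeline draws multi-answer questions from Wikipedia's knowledge graph. As a hedged illustration of what such a query could look like, the sketch below asks the public Wikidata SPARQL endpoint for the example question above; the entity and property IDs are illustrative assumptions, not taken from the paper's pipeline.

```python
# Hypothetical sketch: fetch a multi-answer relation from Wikidata.
# wdt:P647 ("drafted by") and wd:Q235326 (assumed to be the Brooklyn
# Nets item) are illustrative and should be verified before use.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?playerLabel WHERE {
  ?player wdt:P647 wd:Q235326 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "qampari-example/0.1 (demo)"},
    timeout=30,
)
resp.raise_for_status()
answers = [row["playerLabel"]["value"]
           for row in resp.json()["results"]["bindings"]]
print(f"{len(answers)} answers, e.g. {answers[:5]}")
```

The 32.8 F1 reported above scores predicted answer lists against gold answer lists. Below is a minimal sketch of one common way to compute such a score, assuming per-question set overlap averaged over questions; the paper's exact matching and aliasing rules may differ.

```python
def list_answer_f1(predicted, gold):
    """F1 between a predicted and a gold answer list, treated as sets."""
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set or not gold_set:
        return 0.0
    tp = len(pred_set & gold_set)          # correctly predicted answers
    precision = tp / len(pred_set)
    recall = tp / len(gold_set)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: a model finds 3 of 5 drafted players and adds one wrong name.
print(list_answer_f1(
    ["Player A", "Player B", "Player C", "Wrong Name"],
    ["Player A", "Player B", "Player C", "Player D", "Player E"],
))  # ~0.667
```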
Related papers
- Auto FAQ Generation
We propose a system for generating FAQ documents that extracts the salient questions and their corresponding answers from sizeable text documents.
We use existing text summarization, sentence ranking via the TextRank algorithm, and question-generation tools to create an initial set of questions and answers (a minimal TextRank sketch follows this list).
arXiv Detail & Related papers (2024-05-13T03:30:27Z)
- Long-form Question Answering: An Iterative Planning-Retrieval-Generation Approach
Long-form question answering (LFQA) poses a challenge as it involves generating detailed answers in the form of paragraphs.
We propose an LFQA model with iterative Planning, Retrieval, and Generation.
We find that our model outperforms the state-of-the-art models on various textual and factual metrics for the LFQA task.
arXiv Detail & Related papers (2023-11-15T21:22:27Z)
- Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions
We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia.
Our method improves performance by 15% on recall measures and 10% on measures that evaluate disambiguating questions from predicted outputs.
arXiv Detail & Related papers (2023-08-16T20:23:16Z)
- Discourse Comprehension: A Question Answering Framework to Represent Sentence Connections
A key challenge in building and evaluating models for discourse comprehension is the lack of annotated data.
This paper presents a novel paradigm that enables scalable data collection targeting the comprehension of news documents.
The resulting corpus, DCQA, consists of 22,430 question-answer pairs across 607 English documents.
arXiv Detail & Related papers (2021-11-01T04:50:26Z)
- ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers
We describe a Question Answering dataset that contains complex questions with conditional answers.
We call this dataset ConditionalQA.
We show that ConditionalQA is challenging for many of the existing QA models, especially in selecting answer conditions.
arXiv Detail & Related papers (2021-10-13T17:16:46Z)
- GooAQ: Open Question Answering with Diverse Answer Types
We present GooAQ, a large-scale dataset with a variety of answer types.
This dataset contains over 5 million questions and 3 million answers collected from Google.
arXiv Detail & Related papers (2021-04-18T05:40:39Z)
- Answering Any-hop Open-domain Questions with Iterative Document Reranking
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)
- Unsupervised Question Decomposition for Question Answering
We propose an algorithm for One-to-N Unsupervised Sequence transduction (ONUS) that learns to map one hard, multi-hop question to many simpler, single-hop sub-questions.
We show large QA improvements on HotpotQA over a strong baseline on the original, out-of-domain, and multi-hop dev sets.
arXiv Detail & Related papers (2020-02-22T19:40:35Z)
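The TextRank sentence ranking named in the Auto FAQ entry above is, in essence, PageRank run over a sentence-similarity graph. A minimal self-contained sketch follows; the bag-of-words cosine similarity and damping settings are assumptions for illustration, not that paper's configuration.

```python
import math
import re
from collections import Counter

def similarity(a, b):
    """Cosine similarity between two bag-of-words token Counters."""
    common = set(a) & set(b)
    num = sum(a[t] * b[t] for t in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def textrank(sentences, d=0.85, iters=50):
    """Rank sentences by PageRank on a weighted similarity graph."""
    bags = [Counter(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    w = [[similarity(bags[i], bags[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) or 1.0 for row in w]   # out-weight, guard against 0
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - d) / n
                  + d * sum(w[j][i] / out[j] * scores[j] for j in range(n))
                  for i in range(n)]
    return sorted(zip(scores, sentences), reverse=True)

sents = [
    "QAMPARI questions have many answers spread across paragraphs.",
    "Each answer is paired with supporting evidence in Wikipedia.",
    "Retrieval is hard when evidence spans many passages.",
]
for score, sent in textrank(sents):
    print(f"{score:.3f}  {sent}")
```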
This list is automatically generated from the titles and abstracts of the papers on this site.