More Than Reading Comprehension: A Survey on Datasets and Metrics of
Textual Question Answering
- URL: http://arxiv.org/abs/2109.12264v1
- Date: Sat, 25 Sep 2021 02:36:53 GMT
- Title: More Than Reading Comprehension: A Survey on Datasets and Metrics of
Textual Question Answering
- Authors: Yang Bai, Daisy Zhe Wang
- Abstract summary: Textual Question Answering (QA) aims to provide precise answers to user's questions in natural language using unstructured data.
In this paper, we survey 47 recent textual QA benchmark datasets and propose a new taxonomy from an application point of view.
- Score: 7.776227936353711
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Textual Question Answering (QA) aims to provide precise answers to user's
questions in natural language using unstructured data. One of the most popular
approaches to this goal is machine reading comprehension (MRC). In recent years,
many novel datasets and evaluation metrics based on classical MRC tasks have
been proposed for broader textual QA tasks. In this paper, we survey 47 recent
textual QA benchmark datasets and propose a new taxonomy from an application
point of view. In addition, we summarize 8 evaluation metrics of textual QA
tasks. Finally, we discuss current trends in constructing textual QA benchmarks
and suggest directions for future work.
Related papers
- AMAQA: A Metadata-based QA Dataset for RAG Systems [7.882922366782987]
We present AMAQA, a new open-access QA dataset designed to evaluate tasks combining text and metadata.
AMAQA includes about 1.1 million English messages collected from 26 public Telegram groups.
We show that leveraging metadata boosts accuracy from 0.12 to 0.61, highlighting the value of structured context.
arXiv Detail & Related papers (2025-05-19T08:59:08Z)
- PeerQA: A Scientific Question Answering Dataset from Peer Reviews [51.95579001315713]
We present PeerQA, a real-world, scientific, document-level Question Answering dataset.
The dataset contains 579 QA pairs from 208 academic articles, with a majority from ML and NLP.
We provide a detailed analysis of the collected dataset and conduct experiments establishing baseline systems for all three tasks.
arXiv Detail & Related papers (2025-02-19T12:24:46Z)
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
- Fully Authentic Visual Question Answering Dataset from Online Communities [72.0524198499719]
Visual Question Answering (VQA) entails answering questions about images.
We introduce the first VQA dataset in which all contents originate from an authentic use case.
We characterize this dataset and how it relates to eight mainstream VQA datasets.
arXiv Detail & Related papers (2023-11-27T06:19:00Z)
- Diversity Enhanced Narrative Question Generation for Storybooks [4.043005183192124]
We introduce a multi-question generation model (mQG) capable of generating multiple, diverse, and answerable questions.
To validate the answerability of the generated questions, we employ a SQuAD2.0 fine-tuned question answering model.
mQG shows promising results across various evaluation metrics, among strong baselines.
arXiv Detail & Related papers (2023-10-25T08:10:04Z)
- Towards Complex Document Understanding By Discrete Reasoning [77.91722463958743]
Document Visual Question Answering (VQA) aims to understand visually-rich documents to answer questions in natural language.
We introduce a new Document VQA dataset, named TAT-DQA, which consists of 3,067 document pages and 16,558 question-answer pairs.
We develop a novel model named MHST that takes into account the information in multi-modalities, including text, layout and visual image, to intelligently address different types of questions.
arXiv Detail & Related papers (2022-07-25T01:43:19Z)
- Modern Question Answering Datasets and Benchmarks: A Survey [5.026863544662493]
Question Answering (QA) is one of the most important natural language processing (NLP) tasks.
It aims to use NLP technologies to generate a corresponding answer to a given question based on a massive unstructured corpus.
In this paper, we investigate influential QA datasets that have been released in the era of deep learning.
arXiv Detail & Related papers (2022-06-30T05:53:56Z)
- AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization [73.91543616777064]
Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions.
One goal of answer summarization is to produce a summary that reflects the range of answer perspectives.
This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists.
arXiv Detail & Related papers (2021-11-11T21:48:02Z)
- SituatedQA: Incorporating Extra-Linguistic Contexts into QA [7.495151447459443]
We introduce SituatedQA, an open-retrieval QA dataset where systems must produce the correct answer to a question given the temporal or geographical context.
We find that a significant proportion of information seeking questions have context-dependent answers.
Our study shows that existing models struggle with producing answers that are frequently updated or from uncommon locations.
arXiv Detail & Related papers (2021-09-13T17:53:21Z)
- TSQA: Tabular Scenario Based Question Answering [14.92495213480887]
Scenario-based question answering (SQA) has attracted increasing research interest.
To support the study of this task, we construct GeoTSQA.
We extend state-of-the-art MRC methods with TTGen, a novel table-to-text generator.
arXiv Detail & Related papers (2021-01-14T02:00:33Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
- Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary [65.37544133256499]
We propose a metric to evaluate the content quality of a summary using question-answering (QA).
We demonstrate the experimental benefits of QA-based metrics through an analysis of our proposed metric, QAEval.
arXiv Detail & Related papers (2020-10-01T15:33:09Z)