Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question Answering Data
- URL: http://arxiv.org/abs/2102.01226v1
- Date: Mon, 1 Feb 2021 23:18:58 GMT
- Title: Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question Answering Data
- Authors: Dian Yu, Kai Sun, Dong Yu, Claire Cardie
- Abstract summary: It is unclear whether subject-area question-answering data is useful for machine reading comprehension tasks.
We collect a large-scale multi-subject multiple-choice question-answering dataset, ExamQA.
We use incomplete and noisy snippets returned by a web search engine as the relevant context for each question-answering instance to convert it into a weakly-labeled MRC instance.
- Score: 58.36305373100518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In spite of much recent research in the area, it is still unclear whether
subject-area question-answering data is useful for machine reading
comprehension (MRC) tasks. In this paper, we investigate this question. We
collect a large-scale multi-subject multiple-choice question-answering dataset,
ExamQA, and use incomplete and noisy snippets returned by a web search engine
as the relevant context for each question-answering instance to convert it into
a weakly-labeled MRC instance. We then propose a self-teaching paradigm to
better use the generated weakly-labeled MRC instances to improve a target MRC
task. Experimental results show that we can obtain an improvement of 5.1% in
accuracy on a multiple-choice MRC dataset, C^3, demonstrating the effectiveness
of our framework and the usefulness of large-scale subject-area
question-answering data for machine reading comprehension.
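The abstract does not include code, but the conversion step it describes can be sketched concretely. Below is a minimal illustration, assuming hypothetical types and a stubbed search call (`QAInstance`, `WeakMRCInstance`, and `search_snippets` are our own names, not the authors'): each multiple-choice QA pair keeps its original answer label while noisy web snippets stand in for the reading context, which is what makes the resulting MRC instance weakly labeled.

```python
# Hedged sketch of the ExamQA -> weakly-labeled MRC conversion described in
# the abstract. All names are illustrative assumptions, not the authors' code.
from dataclasses import dataclass
from typing import List


@dataclass
class QAInstance:
    question: str
    options: List[str]   # multiple-choice options
    answer_idx: int      # index of the labeled answer


@dataclass
class WeakMRCInstance:
    context: str         # noisy, incomplete web snippets
    question: str
    options: List[str]
    answer_idx: int      # label inherited from the QA pair, hence "weak"


def search_snippets(query: str, top_k: int = 5) -> List[str]:
    """Stub for a web search engine call; a real system would query an API
    and return the top-k result snippets."""
    return [f"snippet {i} for: {query}" for i in range(top_k)]


def to_weak_mrc(qa: QAInstance) -> WeakMRCInstance:
    # The retrieved snippets serve as the (weakly relevant) reading context;
    # the original answer label is kept even though the snippets may not
    # actually support it -- this is what makes the instance weakly labeled.
    snippets = search_snippets(qa.question)
    return WeakMRCInstance(
        context=" ".join(snippets),
        question=qa.question,
        options=qa.options,
        answer_idx=qa.answer_idx,
    )
```

The self-teaching paradigm itself is only named in the abstract; one common reading is that a model trained on such weak instances is used to re-score or filter them for later training rounds, but the exact procedure would require the full paper.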
Related papers
- RAG-ConfusionQA: A Benchmark for Evaluating LLMs on Confusing Questions [52.33835101586687]
Conversational AI agents use Retrieval Augmented Generation (RAG) to provide verifiable document-grounded responses to user inquiries.
This paper presents a novel synthetic data generation method to efficiently create a diverse set of context-grounded confusing questions from a given document corpus.
arXiv Detail & Related papers (2024-10-18T16:11:29Z)
- Optimization of Retrieval-Augmented Generation Context with Outlier Detection [0.0]
We focus on methods to reduce the size and improve the quality of the prompt context required for question-answering systems.
Our goal is to select the most semantically relevant documents, treating the discarded ones as outliers.
The greatest improvements were observed as the complexity of the questions and answers increased.
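The blurb does not specify the outlier detector, but one plausible reading of "treating discarded documents as outliers" is a simple z-score cut on query-document similarity. A minimal sketch, assuming precomputed embeddings (all names here are illustrative):

```python
# Illustrative sketch of outlier-based context pruning for RAG; the paper's
# actual detector may differ.
import numpy as np


def select_context(query_emb: np.ndarray,
                   doc_embs: np.ndarray,
                   z_thresh: float = -1.0) -> np.ndarray:
    """Return indices of documents whose similarity to the query is not a
    low-side outlier."""
    # Cosine similarity between the query and each retrieved document.
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = d @ q
    # Treat low-similarity documents as outliers via a z-score cutoff.
    z = (sims - sims.mean()) / (sims.std() + 1e-8)
    return np.where(z >= z_thresh)[0]
```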
arXiv Detail & Related papers (2024-07-01T15:53:29Z)
- Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation [9.390902237835457]
We propose a new method to measure the task-specific accuracy of Retrieval-Augmented Large Language Models (RAG).
Evaluation is performed by scoring the RAG on an automatically-generated synthetic exam composed of multiple choice questions.
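The evaluation protocol reduces to multiple-choice accuracy over a generated exam. A minimal sketch, where `rag_answer` is a hypothetical stand-in for the pipeline under test and the exam schema is our assumption:

```python
# Minimal sketch of scoring a RAG system on a synthetic multiple-choice exam.
from typing import Callable, List, Tuple

ExamItem = Tuple[str, List[str], int]  # (question, choices, correct index)


def exam_accuracy(exam: List[ExamItem],
                  rag_answer: Callable[[str, List[str]], int]) -> float:
    """Fraction of exam questions the system answers correctly."""
    correct = sum(rag_answer(q, choices) == gold for q, choices, gold in exam)
    return correct / len(exam)
```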
arXiv Detail & Related papers (2024-05-22T13:14:11Z)
- A Dataset of Open-Domain Question Answering with Multiple-Span Answers [11.291635421662338]
Multi-span answer extraction, also known as the task of multi-span question answering (MSQA), is critical for real-world applications.
There is a notable lack of publicly available MSQA benchmarks in Chinese.
We present CLEAN, a comprehensive Chinese multi-span question answering dataset.
arXiv Detail & Related papers (2024-02-15T13:03:57Z)
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
- Answer Span Correction in Machine Reading Comprehension [16.82391374339153]
Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair.
Previous work has looked at re-assessing the "answerability" of the question given the extracted answer.
Here we address the tendency of existing MRC systems to produce partially correct answers when presented with answerable questions.
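The paper's approach to correction is learned; as a toy stand-in for what "correcting a partially correct span" can mean, the heuristic below merely widens a character span to word boundaries (purely illustrative, not the paper's method):

```python
def expand_to_word_boundaries(context: str, start: int, end: int) -> str:
    """Toy span correction: widen a character span so it does not cut words
    in half. The actual paper learns the correction; this only illustrates
    the input/output shape of the task."""
    while start > 0 and not context[start - 1].isspace():
        start -= 1
    while end < len(context) and not context[end].isspace():
        end += 1
    return context[start:end].strip()
```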
arXiv Detail & Related papers (2020-11-06T15:31:07Z)
- Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space [94.8320535537798]
We propose Controllable Rewriting based Question Data Augmentation (CRQDA) for machine reading comprehension (MRC), question generation, and question-answering natural language inference tasks.
We treat the question data augmentation task as a constrained question rewriting problem to generate context-relevant, high-quality, and diverse question data samples.
arXiv Detail & Related papers (2020-10-04T03:13:46Z)
- ClarQ: A large-scale and diverse dataset for Clarification Question Generation [67.1162903046619]
We devise a novel bootstrapping framework that assists in the creation of a diverse, large-scale dataset of clarification questions based on post comments extracted from Stack Exchange.
We quantitatively demonstrate the utility of the newly created dataset by applying it to the downstream task of question-answering.
We release this dataset in order to foster research into the field of clarification question generation with the larger goal of enhancing dialog and question answering systems.
arXiv Detail & Related papers (2020-06-10T17:56:50Z) - Improving Multi-Turn Response Selection Models with Complementary
Last-Utterance Selection by Instance Weighting [84.9716460244444]
We consider utilizing the underlying correlation in the data resource itself to derive different kinds of supervision signals.
We conduct extensive experiments on two public datasets and obtain significant improvements on both.
arXiv Detail & Related papers (2020-02-18T06:29:01Z)