Choose Your QA Model Wisely: A Systematic Study of Generative and
Extractive Readers for Question Answering
- URL: http://arxiv.org/abs/2203.07522v1
- Date: Mon, 14 Mar 2022 22:07:52 GMT
- Title: Choose Your QA Model Wisely: A Systematic Study of Generative and
Extractive Readers for Question Answering
- Authors: Man Luo, Kazuma Hashimoto, Semih Yavuz, Zhiwei Liu, Chitta Baral,
Yingbo Zhou
- Abstract summary: We present the first systematic comparison of extractive and generative readers for question answering.
To be aligned with the state-of-the-art, we explore nine transformer-based large pre-trained language models (PrLMs) as backbone architectures.
- Score: 41.14425610319543
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While both extractive and generative readers have been successfully applied
to the Question Answering (QA) task, little attention has been paid to their
systematic comparison. Characterizing the strengths and weaknesses of
the two readers is crucial not only for making a more informed reader selection
in practice but also for developing a deeper understanding to foster further
research on improving readers in a principled manner. Motivated by this goal,
we make the first attempt to systematically compare extractive and generative
readers for question answering. To be aligned with the
state-of-the-art, we explore nine transformer-based large pre-trained language
models (PrLMs) as backbone architectures. Furthermore, we organize our findings
under two main categories: (1) keeping the architecture invariant, and (2)
varying the underlying PrLMs. Among several interesting findings, it is
important to highlight that (1) the generative readers perform better in long
context QA, (2) the extractive readers perform better in short context while
also showing better out-of-domain generalization, and (3) the encoder of
encoder-decoder PrLMs (e.g., T5) turns out to be a strong extractive reader and
outperforms the standard choice of encoder-only PrLMs (e.g., RoBERTa). We also
study the effect of multi-task learning on the two types of readers, varying the
underlying PrLMs, and perform qualitative and quantitative diagnoses to provide
further insights into future directions for modeling better readers.
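Finding (3) lends itself to a concrete illustration. Below is a minimal sketch, assuming the Hugging Face Transformers API, of how the encoder of T5 can serve as an extractive reader by attaching a standard span-prediction head; the class name, head design, and checkpoint are illustrative assumptions rather than the authors' implementation, and the head is untrained here.

```python
# Hedged sketch (not the authors' code): the encoder of an encoder-decoder
# PrLM (T5) as an extractive reader with a SQuAD-style span-prediction head.
# Outputs are meaningless until the head is fine-tuned.
import torch.nn as nn
from transformers import AutoTokenizer, T5EncoderModel

class T5EncoderExtractiveReader(nn.Module):
    def __init__(self, model_name: str = "t5-base"):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(model_name)
        # One logit per token for the span start and one for the span end.
        self.qa_head = nn.Linear(self.encoder.config.d_model, 2)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        start_logits, end_logits = self.qa_head(hidden).split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)

tokenizer = AutoTokenizer.from_pretrained("t5-base")
reader = T5EncoderExtractiveReader()
batch = tokenizer("question: who wrote Hamlet? context: Hamlet is a "
                  "tragedy written by William Shakespeare.",
                  return_tensors="pt")
start_logits, end_logits = reader(batch.input_ids, batch.attention_mask)
# After fine-tuning, the argmax start/end positions delimit the answer span.
start, end = start_logits.argmax().item(), end_logits.argmax().item()
print(tokenizer.decode(batch.input_ids[0, start:end + 1]))
```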
Related papers
- Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System [21.139850269835858]
Conversational tutoring systems (CTSs) offer learning experiences through interactions based on natural language.
We discuss and evaluate a novel type of CTS that leverages recent advances in large language models (LLMs) in two ways.
The system enables AI-assisted content authoring by inducing an easily editable tutoring script automatically from a lesson text.
arXiv Detail & Related papers (2024-04-26T14:57:55Z)
- An In-depth Survey of Large Language Model-based Artificial Intelligence Agents [11.774961923192478]
We have explored the core differences and characteristics between LLM-based AI agents and traditional AI agents.
We conducted an in-depth analysis of the key components of AI agents, including planning, memory, and tool use.
arXiv Detail & Related papers (2023-09-23T11:25:45Z)
- Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs).
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z)
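As a rough illustration of the Re2 entry above, here is a hedged sketch of the re-reading prompt; the template wording paraphrases the paper's idea and may differ from the original.

```python
# Hedged sketch of the Re2 prompt: the question is read twice before the
# model answers, optionally on top of Chain-of-Thought. The template below
# is paraphrased, not copied from the paper.
def re2_prompt(question: str, use_cot: bool = True) -> str:
    prompt = f"Q: {question}\nRead the question again: {question}\nA:"
    if use_cot:
        prompt += " Let's think step by step."
    return prompt

print(re2_prompt("If a train leaves at 3 pm and travels for 2 hours, "
                 "when does it arrive?"))
```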
- Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark] [65.11858854040544]
We perform a thorough experimental analysis of 12 popular language models over 17 established benchmark datasets.
First, we assess their vectorization overhead for converting all input entities into dense embedding vectors.
Second, we investigate their blocking performance, perform a detailed scalability analysis, and compare them with the state-of-the-art deep learning-based blocking method.
Third, we conclude by comparing their relative performance for both supervised and unsupervised matching.
arXiv Detail & Related papers (2023-04-24T08:53:54Z)
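To make the blocking step above concrete, here is an illustrative sketch under assumed tooling (sentence-transformers and scikit-learn); the encoder checkpoint and neighbour count are arbitrary choices, not the configurations benchmarked in the paper.

```python
# Illustrative sketch of embedding-based blocking for entity resolution:
# encode records as dense vectors, then restrict candidate pairs to nearest
# neighbours instead of all n*(n-1)/2 comparisons.
from sentence_transformers import SentenceTransformer
from sklearn.neighbors import NearestNeighbors

records = [
    "Apple Inc., Cupertino, CA",
    "Apple Incorporated, California",
    "Microsoft Corp., Redmond, WA",
    "MSFT Corporation, Washington",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # any text encoder works
embeddings = model.encode(records, normalize_embeddings=True)

# Each record is only compared against its single nearest neighbour.
nn_index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(embeddings)
distances, indices = nn_index.kneighbors(embeddings)

for i, (dist, j) in enumerate(zip(distances[:, 1], indices[:, 1])):
    print(f"candidate pair: {records[i]!r} <-> {records[j]!r} "
          f"(cosine distance {dist:.3f})")
```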
- elBERto: Self-supervised Commonsense Learning for Question Answering [131.51059870970616]
We propose elBERto, a Self-supervised Bidirectional Representation Learning of Commonsense framework that is compatible with off-the-shelf QA model architectures.
The framework comprises five self-supervised tasks to force the model to fully exploit the additional training signals from contexts containing rich commonsense.
elBERto achieves substantial improvements on out-of-paragraph and no-effect questions where simple lexical similarity comparison does not help.
arXiv Detail & Related papers (2022-03-17T16:23:45Z)
- Enhanced Seq2Seq Autoencoder via Contrastive Learning for Abstractive Text Summarization [15.367455931848252]
We present a sequence-to-sequence (seq2seq) autoencoder via contrastive learning for abstractive text summarization.
Our model adopts a standard Transformer-based architecture with a multi-layer bi-directional encoder and an auto-regressive decoder.
We conduct experiments on two datasets and demonstrate that our model outperforms many existing benchmarks.
arXiv Detail & Related papers (2021-08-26T18:45:13Z)
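For the contrastive-learning entry above, the following is a minimal sketch of an InfoNCE-style objective that could complement a seq2seq reconstruction loss; the temperature and the pairing scheme are assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a contrastive (InfoNCE / NT-Xent style) objective over
# sequence embeddings: two views of the same document form a positive pair,
# all other in-batch documents serve as negatives.
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """z1, z2: (batch, dim) embeddings of two views of the same documents."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature       # pairwise cosine similarities
    labels = torch.arange(z1.size(0))        # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage: 4 documents, 128-dim encoder outputs for two augmented views.
loss = info_nce(torch.randn(4, 128), torch.randn(4, 128))
print(loss.item())
```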
- UnitedQA: A Hybrid Approach for Open Domain Question Answering [70.54286377610953]
We apply novel techniques to enhance both extractive and generative readers built upon recent pretrained neural language models.
Our approach outperforms previous state-of-the-art models by 3.3 and 2.7 points in exact match on NaturalQuestions and TriviaQA, respectively.
arXiv Detail & Related papers (2021-01-01T06:36:16Z)
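As a hedged illustration of the hybrid idea in UnitedQA, the sketch below merges candidates from an extractive and a generative reader by a weighted vote over normalized scores; the combination rule is assumed for illustration, not the paper's actual scheme.

```python
# Assumed combination rule (not UnitedQA's): weighted vote over normalized
# candidate scores from an extractive and a generative reader.
from collections import defaultdict

def hybrid_answer(extractive, generative, alpha=0.5):
    """extractive, generative: lists of (answer_string, score) candidates."""
    def normalized(candidates):
        total = sum(score for _, score in candidates) or 1.0
        return [(ans.strip().lower(), score / total)
                for ans, score in candidates]

    votes = defaultdict(float)
    for ans, score in normalized(extractive):
        votes[ans] += alpha * score
    for ans, score in normalized(generative):
        votes[ans] += (1.0 - alpha) * score
    return max(votes, key=votes.get)

print(hybrid_answer([("Shakespeare", 0.8), ("Marlowe", 0.2)],
                    [("William Shakespeare", 0.9), ("Shakespeare", 0.6)]))
```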
- Retrospective Reader for Machine Reading Comprehension [90.6069071495214]
Machine reading comprehension (MRC) is an AI challenge that requires a machine to determine the correct answers to questions based on a given passage.
When unanswerable questions are involved in the MRC task, a verification module, called a verifier, is required in addition to the encoder.
This paper explores better verifier design for the MRC task with unanswerable questions.
arXiv Detail & Related papers (2020-01-27T11:14:34Z)
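To illustrate the verifier idea above, here is a minimal sketch of threshold-based answer verification: the reader abstains when the best span score does not beat a no-answer score by a tuned margin. The actual retrospective (sketchy + intensive) reading design is considerably richer.

```python
# Hedged sketch of a verifier for unanswerable questions: compare the best
# span score against a "no-answer" score and abstain below a tuned margin.
import torch

def verify(start_logits, end_logits, no_answer_score, threshold=0.0):
    """Return a (start, end) token span, or None to predict 'unanswerable'."""
    start = int(start_logits.argmax())
    end = int(end_logits.argmax())
    span_score = float(start_logits[start] + end_logits[end])
    if end < start or span_score - no_answer_score < threshold:
        return None
    return start, end

# Toy example: the no-answer score dominates, so the reader abstains.
print(verify(torch.tensor([0.1, 2.0, 0.3]),
             torch.tensor([0.2, 0.1, 1.5]),
             no_answer_score=4.0))
```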
This list is automatically generated from the titles and abstracts of the papers on this site.