Attention-guided Generative Models for Extractive Question Answering
- URL: http://arxiv.org/abs/2110.06393v1
- Date: Tue, 12 Oct 2021 23:02:35 GMT
- Title: Attention-guided Generative Models for Extractive Question Answering
- Authors: Peng Xu, Davis Liang, Zhiheng Huang, Bing Xiang
- Abstract summary: Recently, pretrained generative sequence-to-sequence (seq2seq) models have achieved great success in question answering.
We propose a simple strategy to obtain an extractive answer span from the generative model by leveraging the decoder cross-attention patterns.
- Score: 17.476450946279037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel method for applying Transformer models to extractive
question answering (QA) tasks. Recently, pretrained generative
sequence-to-sequence (seq2seq) models have achieved great success in question
answering. Contributing to the success of these models are internal attention
mechanisms such as cross-attention. We propose a simple strategy to obtain an
extractive answer span from the generative model by leveraging the decoder
cross-attention patterns. Viewing cross-attention as an architectural prior, we
apply joint training to further improve QA performance. Empirical results show
that on open-domain question answering datasets like NaturalQuestions and
TriviaQA, our method approaches state-of-the-art performance on both generative
and extractive inference, all while using far fewer parameters. Furthermore,
this strategy allows us to perform hallucination-free inference while
conferring significant improvements to the model's ability to rerank relevant
passages.
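The abstract's core idea, aggregating decoder cross-attention mass over input tokens and reading off a contiguous span, can be sketched in a few lines. This is a minimal illustration over a toy attention matrix, not the paper's actual scoring procedure; the function name `extract_span` and the span-length cap are assumptions made for the example.

```python
def extract_span(cross_attn, max_len=10):
    """Return (start, end) of the contiguous input span that receives
    the most attention mass, summed over all decoder steps."""
    n = len(cross_attn[0])
    # Aggregate attention over decoder steps -> one score per input token.
    token_scores = [sum(step[i] for step in cross_attn) for i in range(n)]
    best, best_span = float("-inf"), (0, 0)
    for start in range(n):
        for end in range(start + 1, min(start + max_len, n) + 1):
            score = sum(token_scores[start:end])
            if score > best:
                best, best_span = score, (start, end)
    return best_span

# Toy cross-attention: 4 decoder steps over 8 input tokens; the decoder
# attends mostly to input tokens 3..5.
attn = [[0.3 if 3 <= i <= 5 else 0.02 for i in range(8)] for _ in range(4)]
print(extract_span(attn, max_len=3))  # -> (3, 6)
```

Because the predicted span is always a substring of the input, inference of this form cannot hallucinate tokens absent from the passage, which is the "hallucination-free" property the abstract refers to.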
Related papers
- Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology [0.34530027457862006]
Previous research has shown that existing models, when trained on EQA datasets that include unanswerable questions, demonstrate a significant lack of robustness.
Our proposed training method includes a novel loss function for the EQA problem and challenges an implicit assumption present in numerous EQA datasets.
Our models exhibit significantly enhanced robustness against two types of adversarial attacks, with a performance decrease only about a third of that seen in the default models.
arXiv Detail & Related papers (2024-09-29T20:35:57Z)
- Adapting Pre-trained Generative Models for Extractive Question Answering [4.993041970406846]
We introduce a novel approach that uses the power of pre-trained generative models to address extractive QA tasks.
We demonstrate the superior performance of our proposed approach compared to existing state-of-the-art models.
arXiv Detail & Related papers (2023-11-06T09:01:02Z)
- Jaeger: A Concatenation-Based Multi-Transformer VQA Model [0.13654846342364307]
Document-based Visual Question Answering poses a challenging task at the intersection of linguistic sense disambiguation and fine-grained multimodal retrieval.
We propose Jaeger, a concatenation-based multi-transformer VQA model.
Our approach has the potential to amplify the performance of these models through concatenation.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study on spurious correlations for open-domain response generation models, based on CGDIALOG, a corpus curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation models.
arXiv Detail & Related papers (2023-03-02T06:33:48Z)
- Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks [54.306234256074255]
We identify the issue of tokenization inconsistency that is commonly neglected in training generative models.
This issue damages the extractive nature of these tasks when the input and output are tokenized inconsistently.
We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets.
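The inconsistency this summary describes can be demonstrated with a toy byte-level-BPE-style tokenizer in which a leading space is folded into the following token (marked here with 'Ġ', as in GPT-2-style vocabularies). The `toy_tokenize` function below is purely illustrative, not the paper's implementation:

```python
def toy_tokenize(text):
    """Toy GPT-2-style tokenizer: a leading space is folded into the
    token that follows it, marked with a 'Ġ' prefix."""
    tokens = []
    first = True
    for word in text.split(" "):
        if word == "":
            first = False  # leading space is consumed by the next token
            continue
        tokens.append(word if first else "Ġ" + word)
        first = False
    return tokens

context = "The capital of France is Paris"
answer = "Paris"

ctx_tokens = toy_tokenize(context)
# Inconsistent: tokenizing the bare answer loses the leading space, so
# the target tokens never appear in the tokenized input.
print(toy_tokenize(answer), toy_tokenize(answer)[0] in ctx_tokens)
# Consistent: tokenize the answer as it appears inside the context.
print(toy_tokenize(" " + answer), toy_tokenize(" " + answer)[0] in ctx_tokens)
```

With the inconsistent scheme, a generative model trained to emit the answer tokens can never exactly copy a span of the input, which is the failure mode the paper targets.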
arXiv Detail & Related papers (2022-12-19T23:33:21Z)
- You Only Need One Model for Open-domain Question Answering [26.582284346491686]
Recent works for Open-domain Question Answering refer to an external knowledge base using a retriever model.
We propose casting the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture.
We evaluate our model on Natural Questions and TriviaQA open datasets and our model outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match scores.
arXiv Detail & Related papers (2021-12-14T13:21:11Z)
- UnitedQA: A Hybrid Approach for Open Domain Question Answering [70.54286377610953]
We apply novel techniques to enhance both extractive and generative readers built upon recent pretrained neural language models.
Our approach outperforms previous state-of-the-art models by 3.3 and 2.7 points in exact match on NaturalQuestions and TriviaQA respectively.
arXiv Detail & Related papers (2021-01-01T06:36:16Z)
- Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering [61.394478670089065]
Generative models for open domain question answering have proven to be competitive, without resorting to external knowledge.
We investigate how much these models can benefit from retrieving text passages, potentially containing evidence.
We observe that the performance of this method significantly improves when increasing the number of retrieved passages.
arXiv Detail & Related papers (2020-07-02T17:44:57Z)
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, iteratively refining the data in RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.