When to Fold'em: How to answer Unanswerable questions
- URL: http://arxiv.org/abs/2105.00328v1
- Date: Sat, 1 May 2021 19:08:40 GMT
- Title: When to Fold'em: How to answer Unanswerable questions
- Authors: Marshall Ho, Zhipeng Zhou, Judith He
- Abstract summary: We present 3 different question-answering models trained on the SQuAD2.0 dataset.
We developed a novel approach capable of achieving a 2 percentage point improvement in SQuAD2.0 F1 with reduced training time.
- Score: 5.586191108738563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present 3 different question-answering models trained on the SQuAD2.0
dataset -- BiDAF, DocumentQA and ALBERT Retro-Reader -- demonstrating the
improvement of language models in the past three years. Through our research in
fine-tuning pre-trained models for question-answering, we developed a novel
approach capable of achieving a 2 percentage point improvement in SQuAD2.0 F1 with
reduced training time. Our method of re-initializing select layers of a
parameter-shared language model is simple yet empirically powerful.
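The abstract's core trick, re-initializing select layers of a parameter-shared model before fine-tuning, can be sketched in plain Python. Everything below (the weight dictionary, the layer names, the Gaussian initialization) is an illustrative assumption, not the authors' exact recipe:

```python
import random

def reinit_selected_layers(weights: dict, layer_names) -> dict:
    """Return a copy of `weights` where the named layers are re-drawn
    from a fresh random initialization; other layers keep their values."""
    rng = random.Random(0)
    new = dict(weights)
    for name in layer_names:
        old = weights[name]
        new[name] = [[rng.gauss(0.0, 0.02) for _ in row] for row in old]
    return new

# Toy "parameter-shared" model: two layer names referencing the same matrix,
# loosely mimicking ALBERT-style cross-layer sharing (illustrative only).
shared = [[0.5, -0.5], [0.25, 0.75]]
weights = {"layer.0": shared, "layer.11": shared}
updated = reinit_selected_layers(weights, ["layer.11"])
print(updated["layer.0"] is shared)   # untouched layer keeps pre-trained weights
print(updated["layer.11"] != shared)  # selected layer got fresh values
```

The point of the sketch is that after re-initialization the selected layer no longer aliases the shared pre-trained weights, so fine-tuning can relearn it from scratch while the rest of the model starts from the pre-trained values.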
Related papers
- A Lightweight Method to Generate Unanswerable Questions in English [18.323248259867356]
We examine a simpler data augmentation method for unanswerable question generation in English.
We perform antonym and entity swaps on answerable questions.
Compared to the prior state-of-the-art, data generated with our training-free and lightweight strategy results in better models.
arXiv Detail & Related papers (2023-10-30T10:14:52Z)
- Zero-shot Visual Question Answering with Language Model Feedback [83.65140324876536]
We propose a language model guided captioning approach, LAMOC, for knowledge-based visual question answering (VQA).
Our approach employs the captions generated by a captioning model as the context for an answer prediction model, which is a pre-trained language model (PLM).
arXiv Detail & Related papers (2023-05-26T15:04:20Z)
- Learning Answer Generation using Supervision from Automatic Question Answering Evaluators [98.9267570170737]
We propose a novel training paradigm for GenQA using supervision from automatic QA evaluation models (GAVA).
We evaluate our proposed methods on two academic and one industrial dataset, obtaining a significant improvement in answering accuracy over the previous state of the art.
arXiv Detail & Related papers (2023-05-24T16:57:04Z)
- Knowledge Transfer from Answer Ranking to Answer Generation [97.38378660163414]
We propose to train a GenQA model by transferring knowledge from a trained AS2 model.
We also propose to use the AS2 model prediction scores for loss weighting and score-conditioned input/output shaping.
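A minimal sketch of the loss-weighting idea, assuming per-example AS2 confidence scores are already available. The scores and losses below are made-up numbers, and the weighting scheme is one plausible reading of the summary, not the paper's exact formulation:

```python
# Weight each training example's loss by its AS2 confidence score,
# so low-confidence (likely noisy) examples contribute less to the update.
def weighted_loss(losses, scores):
    """Score-weighted average of per-example losses."""
    assert len(losses) == len(scores)
    total = sum(s * l for s, l in zip(scores, losses))
    return total / sum(scores)

losses = [1.2, 0.5, 0.8]  # hypothetical per-example losses
scores = [0.9, 0.2, 0.7]  # hypothetical AS2 prediction scores
print(round(weighted_loss(losses, scores), 4))
```

Note how the second example, with the lowest AS2 score, is nearly ignored: the weighted average stays close to the losses of the two high-confidence examples.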
arXiv Detail & Related papers (2022-10-23T21:51:27Z)
- Recitation-Augmented Language Models [85.30591349383849]
We show that RECITE is a powerful paradigm for knowledge-intensive NLP tasks.
Specifically, we show that by utilizing recitation as the intermediate step, a recite-and-answer scheme can achieve new state-of-the-art performance.
arXiv Detail & Related papers (2022-10-04T00:49:20Z)
- Multilingual Answer Sentence Reranking via Automatically Translated Data [97.98885151955467]
We present a study on the design of multilingual Answer Sentence Selection (AS2) models, which are a core component of modern Question Answering (QA) systems.
The main idea is to transfer data created in one resource-rich language, e.g., English, to other languages that are less rich in resources.
arXiv Detail & Related papers (2021-02-20T03:52:08Z)
- UnitedQA: A Hybrid Approach for Open Domain Question Answering [70.54286377610953]
We apply novel techniques to enhance both extractive and generative readers built upon recent pretrained neural language models.
Our approach outperforms previous state-of-the-art models by 3.3 and 2.7 points in exact match on NaturalQuestions and TriviaQA respectively.
arXiv Detail & Related papers (2021-01-01T06:36:16Z)
- KgPLM: Knowledge-guided Language Model Pre-training via Generative and Discriminative Learning [45.067001062192844]
We present a language model pre-training framework guided by factual knowledge completion and verification.
Experimental results on LAMA, a set of zero-shot cloze-style question answering tasks, show that our model contains richer factual knowledge than the conventional pre-trained language models.
arXiv Detail & Related papers (2020-12-07T09:39:25Z)
- REALM: Retrieval-Augmented Language Model Pre-Training [37.3178586179607]
We augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia.
For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner.
We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA).
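The retrieve-and-attend idea can be caricatured with a toy retriever. The character-frequency "encoder" below is a deliberately crude stand-in for REALM's learned neural embeddings, and the documents and query are invented examples:

```python
import math

def embed(text):
    """Toy encoder: L2-normalized letter-frequency vector
    (a crude stand-in for a learned dense encoder)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query, docs):
    """Return the document whose embedding has the highest
    inner product with the query embedding."""
    q = embed(query)
    return max(docs, key=lambda d: sum(a * b for a, b in zip(q, embed(d))))

docs = ["Paris is the capital of France.", "The Nile is a river in Africa."]
best = retrieve("capital of France", docs)
print(best)  # Paris is the capital of France.
```

In REALM the encoder and retriever are neural networks trained jointly with the language model, and retrieval runs over a corpus the size of Wikipedia; the sketch only conveys the inner-product-over-document-embeddings shape of the retrieval step.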
arXiv Detail & Related papers (2020-02-10T18:40:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.