Transformer-Based Models for Question Answering on COVID19
- URL: http://arxiv.org/abs/2101.11432v1
- Date: Sat, 16 Jan 2021 23:06:30 GMT
- Title: Transformer-Based Models for Question Answering on COVID19
- Authors: Hillary Ngai, Yoona Park, John Chen and Mahboobeh Parsapoor (Mah Parsa)
- Abstract summary: We propose three transformer-based question-answering systems using BERT, ALBERT, and T5 models.
The BERT-based QA system achieved the highest F1 score (26.32), while the ALBERT-based QA system achieved the highest Exact Match (13.04).
- Score: 4.631723879329972
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In response to Kaggle's COVID-19 Open Research Dataset (CORD-19)
challenge, we have proposed three transformer-based question-answering systems
using BERT, ALBERT, and T5 models. Since the CORD-19 dataset is unlabeled, we
have evaluated the question-answering models' performance on two labeled
question-answering datasets: CovidQA and CovidGQA. The BERT-based QA
system achieved the highest F1 score (26.32), while the ALBERT-based QA system
achieved the highest Exact Match (13.04). However, numerous challenges are
associated with developing high-performance question-answering systems for the
ongoing COVID-19 pandemic and future pandemics. At the end of this paper, we
discuss these challenges and suggest potential solutions to address them.
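The abstract reports F1 and Exact Match but does not say how they were computed. The sketch below assumes the common SQuAD-style definitions (normalized string equality for Exact Match, token-overlap F1); the function names and the example strings are hypothetical and do not come from the paper.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up example pair, for illustration only.
prediction = "the incubation period is 5 days"
reference = "5 days"
em = exact_match(prediction, reference)
f1 = f1_score(prediction, reference)
```

Under these definitions a system can score a nonzero F1 while missing Exact Match entirely, which is one way F1 and Exact Match can rank systems differently. Corpus-level scores are typically the per-question averages multiplied by 100; whether the paper follows that convention is an assumption here.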
Related papers
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- QUADRo: Dataset and Models for QUestion-Answer Database Retrieval [97.84448420852854]
Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions.
We build a large scale DB of 6.3M q/a pairs, using public questions, and design a new system based on neural IR and a q/a pair reranker.
We show that our DB-based approach is competitive with Web-based methods, i.e., a QA system built on top of the Bing search engine.
arXiv Detail & Related papers (2023-03-30T00:42:07Z)
- Answer Generation for Questions With Multiple Information Sources in E-Commerce [0.0]
We propose a novel pipeline (MSQAP) that utilizes the rich information present in the aforementioned sources by separately performing relevancy and ambiguity prediction.
This is the first work in the e-commerce domain that automatically generates natural language answers combining the information present in diverse sources such as specifications, similar questions, and reviews data.
arXiv Detail & Related papers (2021-11-27T23:19:49Z)
- COVIDRead: A Large-scale Question Answering Dataset on COVID-19 [41.23094507923245]
We present a very important resource, COVIDRead, a dataset in the style of the Stanford Question Answering Dataset (SQuAD) with more than 100k question-answer pairs.
This is a precious resource that could serve many purposes, ranging from common people queries regarding this very uncommon disease to managing articles by editors/associate editors of a journal.
We establish several end-to-end neural network based baseline models that attain the lowest F1 of 32.03% and the highest F1 of 37.19%.
arXiv Detail & Related papers (2021-10-05T07:38:06Z)
- Improving Unsupervised Question Answering via Summarization-Informed Question Generation [47.96911338198302]
Question Generation (QG) is the task of generating a plausible question for a <passage, answer> pair.
We make use of freely available news summary data, transforming declarative sentences into appropriate questions using dependency parsing, named entity recognition and semantic role labeling.
The resulting questions are then combined with the original news articles to train an end-to-end neural QG model.
arXiv Detail & Related papers (2021-09-16T13:08:43Z)
- Will this Question be Answered? Question Filtering via Answer Model Distillation for Efficient Question Answering [99.66470885217623]
We propose a novel approach towards improving the efficiency of Question Answering (QA) systems by filtering out questions that will not be answered by them.
This is based on an interesting new finding: the answer confidence scores of state-of-the-art QA systems can be approximated well by models solely using the input question text.
arXiv Detail & Related papers (2021-09-14T23:07:49Z)
- A Clarifying Question Selection System from NTES_ALONG in Convai3 Challenge [8.656503175492375]
This paper presents the participation of NetEase Game AI Lab team for the ClariQ challenge at Search-oriented Conversational AI (SCAI) EMNLP workshop in 2020.
The challenge asks for a complete conversational information retrieval system that can understand and generate clarification questions.
We propose a clarifying question selection system which consists of response understanding, candidate question recalling and clarifying question ranking.
arXiv Detail & Related papers (2020-10-27T11:22:53Z)
- Summary-Oriented Question Generation for Informational Queries [23.72999724312676]
We aim to produce self-explanatory questions that focus on main document topics and are answerable with variable length passages as appropriate.
Our model shows SOTA performance of SQ generation on the NQ dataset (20.1 BLEU-4).
We further apply our model on out-of-domain news articles, evaluating with a QA system due to the lack of gold questions and demonstrate that our model produces better SQs for news articles -- with further confirmation via a human evaluation.
arXiv Detail & Related papers (2020-10-19T17:30:08Z)
- What Are People Asking About COVID-19? A Question Classification Dataset [56.609360198598914]
We present COVID-Q, a set of 1,690 questions about COVID-19 from 13 sources.
The most common questions in our dataset asked about transmission, prevention, and societal effects of COVID.
Many questions that appeared in multiple sources were not answered by any FAQ websites of reputable organizations such as the CDC and FDA.
arXiv Detail & Related papers (2020-05-26T05:41:58Z)
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, which iteratively refines data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)