FQuAD2.0: French Question Answering and knowing that you know nothing
- URL: http://arxiv.org/abs/2109.13209v1
- Date: Mon, 27 Sep 2021 17:30:46 GMT
- Title: FQuAD2.0: French Question Answering and knowing that you know nothing
- Authors: Quentin Heinrich, Gautier Viaud, Wacim Belblidia
- Abstract summary: We introduce FQuAD2.0, which extends FQuAD with 17,000+ unanswerable questions.
This dataset makes it possible to train French Question Answering models that can distinguish unanswerable questions from answerable ones.
- Score: 0.25782420501870296
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Question Answering, including Reading Comprehension, is one of the NLP research areas that has seen significant scientific breakthroughs over the past few years, thanks to concomitant advances in Language Modeling. Most of these breakthroughs, however, are centered on the English language. In 2020, as a first strong initiative to bridge the gap for the French language, Illuin Technology introduced FQuAD1.1, a French Native Reading Comprehension dataset composed of 60,000+ question-and-answer samples extracted from Wikipedia articles. Nonetheless, Question Answering models trained on this dataset have a major drawback: they cannot predict when a given question has no answer in the paragraph of interest, which makes their predictions unreliable in various industrial use cases. In the present work, we introduce FQuAD2.0, which extends FQuAD with 17,000+ unanswerable questions, annotated adversarially so that they resemble answerable ones. This new dataset, comprising a total of almost 80,000 questions, makes it possible to train French Question Answering models that can distinguish unanswerable questions from answerable ones. We benchmark several models on this dataset: our best model, a fine-tuned CamemBERT-large, achieves an F1 score of 82.3% on this classification task, and an F1 score of 83% on the Reading Comprehension task.
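
To illustrate what this no-answer capability looks like in practice, below is a minimal sketch of SQuAD2.0-style inference with the Hugging Face transformers question-answering pipeline. The checkpoint name is a placeholder for any CamemBERT model fine-tuned on FQuAD2.0 (not an official artifact from the paper), and passing handle_impossible_answer=True lets the pipeline return an empty answer when the model prefers the no-answer option.

```python
# Minimal sketch: extractive QA with no-answer handling, in the style of
# models trained on FQuAD2.0. The model name below is a placeholder, not
# an official checkpoint from the paper.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="illuin/camembert-large-fquad",  # placeholder checkpoint name
)

context = (
    "FQuAD2.0 étend FQuAD avec plus de 17 000 questions sans réponse, "
    "annotées de manière adversariale pour ressembler aux questions "
    "qui ont une réponse."
)

# handle_impossible_answer=True enables SQuAD2.0-style behavior: the
# pipeline may return an empty string when no answer is in the context.
answerable = qa(
    question="Combien de questions sans réponse FQuAD2.0 ajoute-t-il ?",
    context=context,
    handle_impossible_answer=True,
)
unanswerable = qa(
    question="Qui a écrit Les Misérables ?",
    context=context,
    handle_impossible_answer=True,
)
print(answerable["answer"])    # e.g. "plus de 17 000"
print(unanswerable["answer"])  # likely "" (no answer in this paragraph)
```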
Related papers
- TyDi QA-WANA: A Benchmark for Information-Seeking Question Answering in Languages of West Asia and North Africa [13.107551474252379]
We present TyDi QA-WANA, a question-answering dataset consisting of 28K examples divided among 10 language varieties of western Asia and northern Africa. The data collection process was designed to elicit information-seeking questions, where the asker is genuinely curious to know the answer.
arXiv Detail & Related papers (2025-07-23T17:20:28Z)
- Building a Rich Dataset to Empower the Persian Question Answering Systems [0.6138671548064356]
This dataset is called NextQuAD and has 7,515 contexts, including 23,918 questions and answers.
A BERT-based question answering model has been applied to this dataset using two pre-trained language models.
Evaluation on the development set shows 0.95 Exact Match (EM) and 0.97 F1-score (these metrics are sketched after this list).
arXiv Detail & Related papers (2024-12-28T16:53:25Z)
- Multilingual Non-Factoid Question Answering with Answer Paragraph Selection [36.31301773167754]
This work presents MuNfQuAD, a multilingual QuAD with non-factoid questions.
The dataset comprises over 578K QA pairs across 38 languages.
arXiv Detail & Related papers (2024-08-20T07:37:06Z)
- Can a Multichoice Dataset be Repurposed for Extractive Question Answering? [52.28197971066953]
We repurposed the Belebele dataset (Bandarkar et al., 2023), which was designed for multiple-choice question answering (MCQA).
We present annotation guidelines and a parallel EQA dataset for English and Modern Standard Arabic (MSA).
Our aim is to enable others to adapt our approach for the 120+ other language variants in Belebele, many of which are deemed under-resourced.
arXiv Detail & Related papers (2024-04-26T11:46:05Z)
- Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering [124.16250115608604]
We present Science Question Answering (SQA), a new benchmark that consists of 21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations.
We show that SQA improves the question answering performance by 1.20% in few-shot GPT-3 and 3.99% in fine-tuned UnifiedQA.
Our analysis further shows that language models, similar to humans, benefit from explanations to learn from fewer data and achieve the same performance with just 40% of the data.
arXiv Detail & Related papers (2022-09-20T07:04:24Z)
- Question Generation for Reading Comprehension Assessment by Modeling How and What to Ask [3.470121495099]
We study Question Generation (QG) for reading comprehension where inferential questions are critical.
We propose a two-step model (HTA-WTA) that takes advantage of previous datasets.
We show that the HTA-WTA model tests for strong SCRS by asking deep inferential questions.
arXiv Detail & Related papers (2022-04-06T15:52:24Z)
- Read before Generate! Faithful Long Form Question Answering with Machine Reading [77.17898499652306]
Long-form question answering (LFQA) aims to generate a paragraph-length answer for a given question.
We propose a new end-to-end framework that jointly models answer generation and machine reading.
arXiv Detail & Related papers (2022-03-01T10:41:17Z)
- PQuAD: A Persian Question Answering Dataset [0.0]
PQuAD is a crowdsourced reading comprehension dataset on Persian Wikipedia articles.
It includes 80,000 questions along with their answers, with 25% of the questions being adversarially unanswerable.
Our experiments on different state-of-the-art pre-trained contextualized language models show 74.8% Exact Match (EM) and 87.6% F1-score.
arXiv Detail & Related papers (2022-02-13T05:42:55Z)
- TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance [71.76018597965378]
We build a new large-scale Question Answering dataset containing both Tabular And Textual data, named TAT-QA.
We propose a novel QA model termed TAGOP, which is capable of reasoning over both tables and text.
arXiv Detail & Related papers (2021-05-17T06:12:06Z)
- MultiModalQA: Complex Question Answering over Text, Tables and Images [52.25399438133274]
We present MultiModalQA: a dataset that requires joint reasoning over text, tables and images.
We create MMQA using a new framework for generating complex multi-modal questions at scale.
We then define a formal language that allows us to take questions that can be answered from a single modality, and combine them to generate cross-modal questions.
arXiv Detail & Related papers (2021-04-13T09:14:28Z)
- IIRC: A Dataset of Incomplete Information Reading Comprehension Questions [53.3193258414806]
We present a dataset, IIRC, with more than 13K questions over paragraphs from English Wikipedia.
The questions were written by crowd workers who did not have access to any of the linked documents.
We follow recent modeling work on various reading comprehension datasets to construct a baseline model for this dataset.
arXiv Detail & Related papers (2020-11-13T20:59:21Z)
- Challenges in Information-Seeking QA: Unanswerable Questions and Paragraph Retrieval [46.3246135936476]
We analyze why answering information-seeking queries is more challenging and where their prevalent unanswerabilities arise.
Our controlled experiments suggest two sources of headroom: paragraph selection and answerability prediction.
We manually annotate 800 unanswerable examples across six languages on what makes them challenging to answer.
arXiv Detail & Related papers (2020-10-22T17:48:17Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
- The Inception Team at NSURL-2019 Task 8: Semantic Question Similarity in Arabic [0.76146285961466]
This paper describes our method for the task of Semantic Question Similarity in Arabic.
The aim is to build a model that is able to detect similar semantic questions in the Arabic language for the provided dataset.
arXiv Detail & Related papers (2020-04-24T19:52:40Z)
- FQuAD: French Question Answering Dataset [0.4759823735082845]
We introduce the French Question Answering dataset (FQuAD).
FQuAD is a French Native Reading Comprehension dataset of questions and answers on a set of Wikipedia articles.
We train a baseline model which achieves an F1 score of 92.2 and an exact match ratio of 82.1 on the test set.
arXiv Detail & Related papers (2020-02-14T15:23:38Z)
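
Several papers above, including FQuAD2.0 itself, report Exact Match (EM) and F1 on extractive answers. For reference, here is a minimal sketch of how these SQuAD-style metrics are commonly computed; the official evaluation scripts additionally normalize answers (lowercasing, stripping punctuation and, for English, articles), which this simplified version only partially reproduces.

```python
# Simplified SQuAD-style answer metrics: Exact Match and token-level F1.
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized answer strings are identical, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# A partially overlapping prediction scores 0 on EM but not on F1.
print(exact_match("plus de 17 000", "17 000"))  # 0.0
print(token_f1("plus de 17 000", "17 000"))     # ~0.67
```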