X-FACT: A New Benchmark Dataset for Multilingual Fact Checking
- URL: http://arxiv.org/abs/2106.09248v1
- Date: Thu, 17 Jun 2021 05:09:54 GMT
- Title: X-FACT: A New Benchmark Dataset for Multilingual Fact Checking
- Authors: Ashim Gupta and Vivek Srikumar
- Abstract summary: We introduce X-FACT: the largest publicly available multilingual dataset for factual verification of naturally existing real-world claims.
The dataset contains short statements in 25 languages and is labeled for veracity by expert fact-checkers.
- Score: 21.2633064526968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce X-FACT: the largest publicly available
multilingual dataset for factual verification of naturally existing real-world
claims. The dataset contains short statements in 25 languages and is labeled
for veracity by expert fact-checkers. The dataset includes a multilingual
evaluation benchmark that measures both the out-of-domain generalization and
the zero-shot capabilities of multilingual models. Using state-of-the-art
multilingual transformer-based models, we develop several automated
fact-checking models that, along with textual claims, make use of additional
metadata and evidence from news stories retrieved using a search engine.
Empirically, our best model attains an F-score of around 40%, suggesting that
our dataset is a challenging benchmark for evaluation of multilingual
fact-checking models.
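The abstract reports an F-score without saying how it is averaged. As a minimal sketch, assuming a macro-averaged F1 over veracity classes (the averaging scheme, the function name `macro_f1`, and the label strings are illustrative assumptions, not taken from the paper), such a score could be computed like this:

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1 over all labels seen in gold or pred.

    Assumption: every class counts equally, regardless of how many
    claims carry that label. The paper may use a different averaging.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # missed the true label g

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical veracity labels, for illustration only.
gold = ["true", "false", "true", "mixed"]
pred = ["true", "true", "true", "false"]
score = macro_f1(gold, pred)
```

Because each class contributes equally under macro-averaging, a model that does well only on the most frequent label still scores low, which is one reason a benchmark with several veracity classes can be hard to score well on.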
Related papers
- The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [80.4837840962273]
We present Belebele, a dataset spanning 122 language variants.
This dataset enables the evaluation of text models in high-, medium-, and low-resource languages.
(2023-08-31T17:43:08Z)
- An Open Dataset and Model for Language Identification [84.15194457400253]
We present a LID model which achieves a macro-average F1 score of 0.93 and a false positive rate of 0.033 across 201 languages.
We make both the model and the dataset available to the research community.
(2023-05-23T08:43:42Z)
- XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages [105.54207724678767]
Data scarcity is a crucial issue for the development of highly multilingual NLP systems.
We propose XTREME-UP, a benchmark defined by its focus on the scarce-data scenario rather than zero-shot.
XTREME-UP evaluates the capabilities of language models across 88 under-represented languages over 9 key user-centric technologies.
(2023-05-19T18:00:03Z)
- mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
(2022-12-20T19:52:41Z)
- Multi-lingual Evaluation of Code Generation Models [82.7357812992118]
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X.
These datasets cover over 10 programming languages.
We are able to assess the performance of code generation models in a multi-lingual fashion.
(2022-10-26T17:17:06Z)
- mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models [15.873069955407406]
We train a multilingual language model on 24 languages with entity representations.
We show the model consistently outperforms word-based pretrained models in various cross-lingual transfer tasks.
We also evaluate the model with a multilingual cloze prompt task with the mLAMA dataset.
(2021-10-15T15:28:38Z)
- A Multilingual Bag-of-Entities Model for Zero-Shot Cross-Lingual Text Classification [16.684856745734944]
We present a multilingual bag-of-entities model that boosts the performance of zero-shot cross-lingual text classification.
It leverages the multilingual nature of Wikidata: entities in multiple languages representing the same concept are defined with a unique identifier.
A model trained on entity features in a resource-rich language can thus be directly applied to other languages.
(2021-10-15T01:10:50Z)
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
(2020-10-13T15:32:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.