Overview of the CLEF--2021 CheckThat! Lab on Detecting Check-Worthy
Claims, Previously Fact-Checked Claims, and Fake News
- URL: http://arxiv.org/abs/2109.12987v1
- Date: Thu, 23 Sep 2021 06:10:36 GMT
- Title: Overview of the CLEF--2021 CheckThat! Lab on Detecting Check-Worthy
Claims, Previously Fact-Checked Claims, and Fake News
- Authors: Preslav Nakov, Giovanni Da San Martino, Tamer Elsayed, Alberto Barrón-Cedeño, Rubén Míguez, Shaden Shaar, Firoj Alam, Fatima Haouari, Maram Hasanain, Watheq Mansour, Bayan Hamdan, Zien Sheikh Ali, Nikolay Babulkov, Alex Nikolov, Gautam Kishore Shahi, Julia Maria Struß, Thomas Mandl, Mucahid Kutlu, Yavuz Selim Kartal
- Abstract summary: We describe the fourth edition of the CheckThat! Lab, part of the 2021 Conference and Labs of the Evaluation Forum (CLEF).
The lab evaluates technology supporting tasks related to factuality, and covers Arabic, Bulgarian, English, Spanish, and Turkish.
- Score: 21.574997165145486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We describe the fourth edition of the CheckThat! Lab, part of the 2021
Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates
technology supporting tasks related to factuality, and covers Arabic,
Bulgarian, English, Spanish, and Turkish. Task 1 asks systems to predict which posts in a Twitter stream are worth fact-checking, focusing on COVID-19 and politics (in all five languages). Task 2 asks systems to determine whether a claim in a tweet can be verified using a set of previously fact-checked claims (in Arabic and English). Task 3 asks systems to predict the veracity of a news article and its topical domain (in English). The evaluation is based on mean average precision or precision at
rank k for the ranking tasks, and macro-F1 for the classification tasks. This
was the most popular CLEF-2021 lab in terms of team registrations: 132 teams.
Nearly one-third of them participated: 15, 5, and 25 teams submitted official
runs for tasks 1, 2, and 3, respectively.
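The three measures named above are standard. Below is a minimal sketch in Python (not the lab's official scorer) of average precision for one ranked list, precision at rank k, and macro-averaged F1; the toy inputs at the bottom are invented for illustration.

```python
# Minimal reference implementations of the lab's evaluation measures.
# Pure Python; not the official CheckThat! scorer.

def average_precision(ranked_labels):
    """AP for one query. `ranked_labels` holds 0/1 relevance flags in
    the order the system ranked the items; divides by the number of
    relevant items in the list."""
    hits, total = 0, 0.0
    for i, rel in enumerate(ranked_labels, start=1):
        if rel:
            hits += 1
            total += hits / i  # precision at each relevant position
    return total / max(hits, 1)

def precision_at_k(ranked_labels, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(ranked_labels[:k]) / k

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores over the gold classes."""
    f1s = []
    for c in set(gold):
        tp = sum(g == p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy data: a ranked list of tweets (1 = check-worthy) and a
# three-class veracity prediction.
print(average_precision([1, 0, 1, 1, 0]))  # ~0.806
print(precision_at_k([1, 0, 1, 1, 0], 3))  # ~0.667
print(macro_f1(["true", "false", "other", "true"],
               ["true", "false", "true", "true"]))  # 0.6
```

Mean average precision is then the mean of `average_precision` over all queries.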
Related papers
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
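As a hedged illustration of that recipe (not the authors' code), the sketch below casts tweet classification as text-to-text generation with `t5-small` via Hugging Face `transformers`; the prompt prefix and the example tweet are invented.

```python
# Sketch: classification as text-to-text generation with a small
# pre-trained encoder-decoder. Illustrative only; the prompt format
# and label strings are assumptions, not the paper's actual setup.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def predict_label(tweet):
    # The label itself is generated as a short text string.
    inputs = tokenizer("classify tweet: " + tweet, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=4)
    return tokenizer.decode(out[0], skip_special_tokens=True)

# Before fine-tuning on (tweet, label-string) pairs the output is
# meaningless; after fine-tuning it should emit one of the task labels.
print(predict_label("Walking in the park eased my anxiety today."))
```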
arXiv Detail & Related papers (2024-04-30T17:06:20Z)
- UrduFake@FIRE2021: Shared Track on Fake News Identification in Urdu [55.41644538483948]
This study reports on the second shared task, UrduFake@FIRE2021, on fake news detection in the Urdu language.
The proposed systems were based on various count-based features and used different classifiers as well as neural network architectures.
The stochastic gradient descent (SGD) algorithm outperformed the other classifiers and achieved an F1 score of 0.679.
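A minimal sketch, assuming scikit-learn, of the kind of system the summary describes: word-count features feeding a linear classifier trained with stochastic gradient descent. The four toy documents and their labels are invented.

```python
# Count-based features + an SGD-trained linear classifier.
# Illustrative pipeline, not any participant's actual system.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

texts = ["official statement from ministry", "miracle cure goes viral",
         "parliament passes budget", "fabricated quote spreads online"]
labels = ["real", "fake", "real", "fake"]

clf = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),  # unigram/bigram counts
    SGDClassifier(max_iter=1000, random_state=0),
)
clf.fit(texts, labels)
print(clf.predict(["viral miracle quote"]))
```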
arXiv Detail & Related papers (2022-07-11T19:15:04Z)
- Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021 [55.41644538483948]
The goal of the shared task is to motivate the community to come up with efficient methods for solving this vital problem.
The training set contains 1300 annotated news articles (750 real, 550 fake), while the test set contains 300 articles (200 real, 100 fake).
The best-performing system obtained an F1-macro score of 0.679, lower than the previous year's best result of 0.907.
arXiv Detail & Related papers (2022-07-11T18:58:36Z)
- CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking [55.75590135151682]
CHEF is the first CHinese Evidence-based Fact-checking dataset of 10K real-world claims.
The dataset covers multiple domains, ranging from politics to public health, and provides annotated evidence retrieved from the Internet.
arXiv Detail & Related papers (2022-06-06T09:11:03Z)
- DialFact: A Benchmark for Fact-Checking in Dialogue [56.63709206232572]
We construct DialFact, a benchmark dataset of 22,245 annotated conversational claims, paired with pieces of evidence from Wikipedia.
We find that existing fact-checking models trained on non-dialogue data like FEVER fail to perform well on our task.
We propose a simple yet data-efficient solution to effectively improve fact-checking performance in dialogue.
arXiv Detail & Related papers (2021-10-15T17:34:35Z)
- Overview of the CLEF-2019 CheckThat!: Automatic Identification and Verification of Claims [26.96108180116284]
CheckThat! lab featured two tasks in two different languages: English and Arabic.
The most successful approaches to Task 1 used various neural networks and logistic regression.
Learning-to-rank was used by the highest-scoring runs for subtask A.
arXiv Detail & Related papers (2021-09-25T16:08:09Z)
- Findings of the NLP4IF-2021 Shared Tasks on Fighting the COVID-19 Infodemic and Censorship Detection [23.280506220186425]
We present the results of the NLP4IF-2021 shared tasks.
Ten teams submitted systems for task 1, and one team participated in task 2.
The best systems used pre-trained Transformers and ensembles.
arXiv Detail & Related papers (2021-09-23T06:38:03Z)
- Accenture at CheckThat! 2020: If you say so: Post-hoc fact-checking of claims using transformer-based models [0.0]
We introduce the strategies used by the Accenture Team for the CLEF 2020 CheckThat! Lab, Task 1, on English and Arabic.
This shared task evaluated whether a claim in social media text should be professionally fact-checked.
We utilized BERT and RoBERTa models to identify claims in social media text that a professional fact-checker should review.
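An illustrative sketch only (not the team's actual system): scoring a tweet for check-worthiness with a `roberta-base` sequence classifier via Hugging Face `transformers`. The model here is untuned; in practice it would first be fine-tuned on the task's labeled tweets.

```python
# Score a tweet for check-worthiness with a RoBERTa classifier head.
# The freshly initialized head is random until fine-tuned.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)  # 0 = skip, 1 = worth fact-checking

def checkworthiness(tweet):
    inputs = tokenizer(tweet, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

print(checkworthiness("The new vaccine alters human DNA."))
```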
arXiv Detail & Related papers (2020-09-05T01:44:11Z)
- Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media [26.60148306714383]
We present an overview of the third edition of the CheckThat! Lab at CLEF 2020.
The lab featured five tasks in two different languages: English and Arabic.
We describe the tasks setup, the evaluation results, and a summary of the approaches used by the participants.
arXiv Detail & Related papers (2020-07-15T21:19:32Z)
- TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions [91.85730323228833]
We introduce TORQUE, a new English reading comprehension benchmark built on 3.2k news snippets with 21k human-generated questions querying temporal relationships.
Results show that RoBERTa-large achieves an exact-match score of 51% on the test set of TORQUE, about 30% behind human performance.
arXiv Detail & Related papers (2020-05-01T06:29:56Z)
- CheckThat! at CLEF 2020: Enabling the Automatic Identification and Verification of Claims in Social Media [28.070608555714752]
CheckThat! proposes four complementary tasks and a related task from previous lab editions.
The evaluation is carried out using mean average precision or precision at rank k for ranking tasks, and F1 for classification tasks.
arXiv Detail & Related papers (2020-01-21T06:47:11Z)