Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021
- URL: http://arxiv.org/abs/2207.05133v1
- Date: Mon, 11 Jul 2022 18:58:36 GMT
- Title: Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021
- Authors: Maaz Amjad, Sabur Butt, Hamza Imam Amjad, Alisa Zhila, Grigori
Sidorov, Alexander Gelbukh
- Abstract summary: The goal of the shared task is to motivate the community to come up with efficient methods for solving this vital problem.
The training set contains 1300 annotated news articles -- 750 real news, 550 fake news, while the testing set contains 300 news articles -- 200 real, 100 fake news.
The best performing system obtained an F1-macro score of 0.679, which is lower than the past year's best result of 0.907 F1-macro.
- Score: 55.41644538483948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic detection of fake news is a highly important task in the
contemporary world. This study reports the 2nd shared task called
UrduFake@FIRE2021 on identifying fake news detection in Urdu. The goal of the
shared task is to motivate the community to come up with efficient methods for
solving this vital problem, particularly for the Urdu language. The task is
posed as a binary classification problem to label a given news article as a
real or a fake news article. The organizers provide a dataset comprising news
in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and
(v) Business, split into training and testing sets. The training set contains
1300 annotated news articles -- 750 real news, 550 fake news, while the testing
set contains 300 news articles -- 200 real, 100 fake news. 34 teams from 7
different countries (China, Egypt, Israel, India, Mexico, Pakistan, and UAE)
registered to participate in the UrduFake@FIRE2021 shared task. Out of those,
18 teams submitted their experimental results, and 11 of those submitted their
technical reports, which is substantially higher compared to the UrduFake
shared task in 2020 when only 6 teams submitted their technical reports. The
technical reports submitted by the participants demonstrated different data
representation techniques ranging from count-based BoW features to word vector
embeddings as well as the use of numerous machine learning algorithms ranging
from traditional SVM to various neural network architectures including
Transformers such as BERT and RoBERTa. In this year's competition, the best
performing system obtained an F1-macro score of 0.679, which is lower than the
past year's best result of 0.907 F1-macro. Admittedly, while training sets from
the past and the current years overlap to a large extent, the testing set
provided this year is completely different.
Related papers
- Findings of Factify 2: Multimodal Fake News Detection [36.34201719103715]
We present the outcome of the Factify 2 shared task, which provides a multi-modal fact verification and satire news dataset.
The data calls for a comparison based approach to the task by pairing social media claims with supporting documents, with both text and image, divided into 5 classes based on multi-modal relations.
The highest F1 score averaged for all five classes was 81.82%.
arXiv Detail & Related papers (2023-07-19T22:14:49Z) - Combination Of Convolution Neural Networks And Deep Neural Networks For
Fake News Detection [0.0]
We have described the Fake News Challenge stage #1 dataset and given an overview of the competitive attempts to build a fake news detection system.
The proposed system detects all the categories with high accuracy except the disagree category.
As a result, the system achieves up to 84.6 % accuracy, classifying it as the second ranking based on other competitive studies.
arXiv Detail & Related papers (2022-10-15T16:32:51Z) - UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu [62.6928395368204]
This paper gives the overview of the first shared task at FIRE 2020 on fake news detection in the Urdu language.
The goal is to identify fake news using a dataset composed of 900 annotated news articles for training and 400 news articles for testing.
The dataset contains news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business.
arXiv Detail & Related papers (2022-07-25T03:46:51Z) - Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2020 [62.6928395368204]
Task was posed as a binary classification task, in which the goal is to differentiate between real and fake news.
We provided a dataset divided into 900 annotated news articles for training and 400 news articles for testing.
42 teams from 6 different countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task.
arXiv Detail & Related papers (2022-07-25T03:41:32Z) - Overview of Abusive and Threatening Language Detection in Urdu at FIRE
2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, m-Bert based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z) - UrduFake@FIRE2021: Shared Track on Fake News Identification in Urdu [55.41644538483948]
This study reports the second shared task named as UrduFake@FIRE2021 on identifying fake news detection in Urdu language.
The proposed systems were based on various count-based features and used different classifiers as well as neural network architectures.
The gradient descent (SGD) algorithm outperformed other classifiers and achieved 0.679 F-score.
arXiv Detail & Related papers (2022-07-11T19:15:04Z) - Faking Fake News for Real Fake News Detection: Propaganda-loaded
Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.