A Dataset of Fact-Checked Images Shared on WhatsApp During the Brazilian
and Indian Elections
- URL: http://arxiv.org/abs/2005.02443v1
- Date: Tue, 5 May 2020 19:08:26 GMT
- Title: A Dataset of Fact-Checked Images Shared on WhatsApp During the Brazilian
and Indian Elections
- Authors: Julio C. S. Reis, Philipe de Freitas Melo, Kiran Garimella, Jussara M.
Almeida, Dean Eckles, Fabrício Benevenuto
- Abstract summary: A notable form of abuse in WhatsApp relies on several manipulated images and memes containing all kinds of fake stories.
This paper opens a novel dataset to the research community containing fact-checked fake images shared through WhatsApp.
- Score: 4.512596331783666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, messaging applications, such as WhatsApp, have been reportedly
abused by misinformation campaigns, especially in Brazil and India. A notable
form of abuse in WhatsApp relies on several manipulated images and memes
containing all kinds of fake stories. In this work, we performed an extensive
data collection from a large set of WhatsApp publicly accessible groups and
fact-checking agency websites. This paper opens a novel dataset to the research
community containing fact-checked fake images shared through WhatsApp for two
distinct scenarios known for the spread of fake news on the platform: the 2018
Brazilian elections and the 2019 Indian elections.
Related papers
- WhatsApp Explorer: A Data Donation Tool To Facilitate Research on WhatsApp [1.2507543279181124]
This paper introduces WhatsApp Explorer, a tool designed to enable WhatsApp data collection on a large scale.
We discuss protocols for data collection, including potential sampling approaches, and explain why our tool (and adjoining protocol) arguably allow researchers to collect WhatsApp data in an ethical and legal manner, at scale.
arXiv Detail & Related papers (2024-03-29T13:30:29Z)
- Analyzing Misinformation Claims During the 2022 Brazilian General Election on WhatsApp, Twitter, and Kwai [6.571720922953704]
This study analyzes misinformation from WhatsApp, Twitter, and Kwai during the 2022 Brazilian general election.
Given the democratic importance of accurate information during elections, multiple fact-checking organizations collaborated to identify and respond to misinformation via WhatsApp tiplines.
Our research highlights the limitations of current claim matching algorithms to match claims across platforms with such differences.
arXiv Detail & Related papers (2024-01-04T18:18:32Z)
- Helping Fact-Checkers Identify Fake News Stories Shared through Images on WhatsApp [1.5678677448474552]
We propose a "fakeness score" model as a means to help fact-checking agencies identify fake news stories shared through images on WhatsApp.
Our experimental evaluation shows that this tool can reduce by up to 40% the amount of effort required to identify 80% of the fake news in the data.
arXiv Detail & Related papers (2023-08-28T16:12:29Z)
- TGDataset: a Collection of Over One Hundred Thousand Telegram Channels [69.22187804798162]
This paper presents the TGDataset, a new dataset that includes 120,979 Telegram channels and over 400 million messages.
We analyze the languages spoken within our dataset and the topics covered by English channels.
In addition to the raw dataset, we released the scripts we used to analyze the dataset and the list of channels belonging to the network of a new conspiracy theory called Sabmyk.
arXiv Detail & Related papers (2023-03-09T15:42:38Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Uncovering the Dark Side of Telegram: Fakes, Clones, Scams, and Conspiracy Movements [67.39353554498636]
We perform a large-scale analysis of Telegram by collecting 35,382 different channels and over 130,000,000 messages.
We find some of the infamous activities also present on privacy-preserving services of the Dark Web, such as carding.
We propose a machine learning model that is able to identify fake channels with an accuracy of 86%.
arXiv Detail & Related papers (2021-11-26T14:53:31Z)
- User Preference-aware Fake News Detection [61.86175081368782]
Existing fake news detection algorithms focus on mining news content for deceptive signals.
We propose a new framework, UPFD, which simultaneously captures various signals from user preferences by joint content and graph modeling.
arXiv Detail & Related papers (2021-04-25T21:19:24Z)
- Factorization of Fact-Checks for Low Resource Indian Languages [44.94080515860928]
We introduce FactDRIL: the first large scale multilingual Fact-checking dataset for Regional Indian languages.
Our dataset consists of 9,058 samples in English, 5,155 samples in Hindi, and the remaining 8,222 samples distributed across various regional languages.
We expect this dataset will be a valuable resource and serve as a starting point to fight the proliferation of fake news in low-resource languages.
arXiv Detail & Related papers (2021-02-23T16:47:41Z)
- Can WhatsApp Benefit from Debunked Fact-Checked Stories to Reduce Misinformation? [3.116035935327534]
We observe that misinformation has been widely shared in WhatsApp public groups even after it had already been fact-checked by popular fact-checking agencies.
This represents a significant portion of misinformation spread in both Brazil and India in the groups analyzed.
We propose an architecture that could be implemented by WhatsApp to counter such misinformation.
arXiv Detail & Related papers (2020-06-03T18:28:57Z)
- Images and Misinformation in Political Groups: Evidence from WhatsApp in India [6.421670116083633]
We study a large collection of politically-oriented WhatsApp groups in India, focusing on the period leading up to the 2019 Indian national elections.
By labeling samples of random and popular images, we find that around 13% of shared images are known misinformation.
Machine learning methods can be used to predict whether a viral image is misinformation, but are brittle to shifts in content over time.
arXiv Detail & Related papers (2020-05-19T23:00:17Z)
- HoaxItaly: a collection of Italian disinformation and fact-checking stories shared on Twitter in 2019 [72.96986027203377]
The dataset also includes the title and body of approximately 37k news articles.
It is publicly available at https://doi.org/10.7910/DVN/PGVDHX.
arXiv Detail & Related papers (2020-01-29T16:14:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.