Predicting Misinformation and Engagement in COVID-19 Twitter Discourse
in the First Months of the Outbreak
- URL: http://arxiv.org/abs/2012.02164v2
- Date: Wed, 23 Dec 2020 15:13:31 GMT
- Title: Predicting Misinformation and Engagement in COVID-19 Twitter Discourse
in the First Months of the Outbreak
- Authors: Mirela Silva, Fabrício Ceschin, Prakash Shrestha, Christopher Brant,
Juliana Fernandes, Catia S. Silva, André Grégio, Daniela Oliveira, and
Luiz Giovanini
- Abstract summary: We examine nearly 505K COVID-19-related tweets from the initial months of the pandemic to understand misinformation as a function of bot-behavior and engagement.
We found that real users tweet both facts and misinformation, while bots tweet proportionally more misinformation.
- Score: 1.2059055685264957
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Disinformation entails the purposeful dissemination of falsehoods towards a
greater dubious agenda and the chaotic fracturing of a society. The general
public has grown aware of the misuse of social media towards these nefarious
ends, where even global public health crises have not been immune to
misinformation (deceptive content spread without intended malice). In this
paper, we examine nearly 505K COVID-19-related tweets from the initial months
of the pandemic to understand misinformation as a function of bot-behavior and
engagement. Using a correlation-based feature selection method, we selected the
11 most relevant feature subsets among over 170 features to distinguish
misinformation from facts, and to predict highly engaging misinformation tweets
about COVID-19. We achieved an average F-score of at least 72% with ten
popular multi-class classifiers, reinforcing the relevance of the selected
features. We found that (i) real users tweet both facts and misinformation,
while bots tweet proportionally more misinformation; (ii) misinformation tweets
were less engaging than facts; (iii) the textual content of a tweet was the
most important to distinguish fact from misinformation while (iv) user account
metadata and human-like activity were most important to predict high engagement
in factual and misinformation tweets; and (v) sentiment features were not
relevant.
Related papers
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- Machine Learning-based Automatic Annotation and Detection of COVID-19 Fake News [8.020736472947581]
COVID-19 impacted every part of the world, although the misinformation about the outbreak traveled faster than the virus.
Existing work neglects the presence of bots that act as a catalyst in the spread.
We propose an automated approach for labeling data using verified fact-checked statements on a Twitter dataset.
arXiv Detail & Related papers (2022-09-07T13:55:59Z)
- Adherence to Misinformation on Social Media Through Socio-Cognitive and Group-Based Processes [79.79659145328856]
We argue that when misinformation proliferates, this happens because the social media environment enables adherence to misinformation.
We make the case that polarization and misinformation adherence are closely tied.
arXiv Detail & Related papers (2022-06-30T12:34:24Z)
- News consumption and social media regulations policy [70.31753171707005]
We analyze two social media platforms that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab results in the tendency of the user to engage with both types of content, showing a slight preference for the questionable ones which may account for a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z)
- Misinfo Belief Frames: A Case Study on Covid & Climate News [49.979419711713795]
We propose a formalism for understanding how readers perceive the reliability of news and the impact of misinformation.
We introduce the Misinfo Belief Frames (MBF) corpus, a dataset of 66k inferences over 23.5k headlines.
Our results using large-scale language modeling to predict misinformation frames show that machine-generated inferences can influence readers' trust in news headlines.
arXiv Detail & Related papers (2021-04-18T09:50:11Z)
- The Role of the Crowd in Countering Misinformation: A Case Study of the COVID-19 Infodemic [15.885290526721544]
We focus on tweets related to the COVID-19 pandemic, analyzing the spread of misinformation, professional fact checks, and the crowd response to popular misleading claims about COVID-19.
We train a classifier to create a novel dataset of 155,468 COVID-19-related tweets, containing 33,237 false claims and 33,413 refuting arguments.
We observe that the surge in misinformation tweets results in a quick response and a corresponding increase in tweets that refute such misinformation.
arXiv Detail & Related papers (2020-11-11T13:48:44Z)
- ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection [6.688963029270579]
ArCOV19-Rumors is an Arabic COVID-19 Twitter dataset for misinformation detection composed of tweets containing claims from 27th January till the end of April 2020.
We collected 138 verified claims, mostly from popular fact-checking websites, and identified 9.4K tweets relevant to those claims.
Tweets were manually annotated by veracity to support research on misinformation detection, which is one of the major problems faced during a pandemic.
arXiv Detail & Related papers (2020-10-17T11:21:40Z)
- Understanding the Hoarding Behaviors during the COVID-19 Pandemic using Large Scale Social Media Data [77.34726150561087]
We analyze the hoarding and anti-hoarding patterns of over 42,000 unique Twitter users in the United States from March 1 to April 30, 2020.
We find the percentage of females in both hoarding and anti-hoarding groups is higher than that of the general Twitter user population.
The LIWC anxiety mean for the hoarding-related tweets is significantly higher than the baseline Twitter anxiety mean.
arXiv Detail & Related papers (2020-10-15T16:02:25Z)
- Misinformation Has High Perplexity [55.47422012881148]
We propose to leverage the perplexity to debunk false claims in an unsupervised manner.
First, we extract reliable evidence from scientific and news sources according to sentence similarity to the claims.
Second, we prime a language model with the extracted evidence and finally evaluate the correctness of given claims based on the perplexity scores at debunking time.
arXiv Detail & Related papers (2020-06-08T15:13:44Z)
- An Exploratory Study of COVID-19 Misinformation on Twitter [5.070542698701158]
During the COVID-19 pandemic, social media has become a home ground for misinformation.
We have conducted an exploratory study into the propagation, authors and content of misinformation on Twitter around the topic of COVID-19.
arXiv Detail & Related papers (2020-05-12T12:07:35Z)
- COVID-19 on Social Media: Analyzing Misinformation in Twitter Conversations [22.43295864610142]
We collected streaming data related to COVID-19 using the Twitter API, starting March 1, 2020.
We identified unreliable and misleading content based on fact-checking sources.
We examined the narratives promoted in misinformation tweets, along with the distribution of engagements with these tweets.
arXiv Detail & Related papers (2020-03-26T09:48:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.