Understanding Information Spreading Mechanisms During COVID-19 Pandemic
by Analyzing the Impact of Tweet Text and User Features for Retweet
Prediction
- URL: http://arxiv.org/abs/2106.07344v1
- Date: Wed, 26 May 2021 15:55:58 GMT
- Title: Understanding Information Spreading Mechanisms During COVID-19 Pandemic
by Analyzing the Impact of Tweet Text and User Features for Retweet
Prediction
- Authors: Pervaiz Iqbal Khan, Imran Razzak, Andreas Dengel, Sheraz Ahmed
- Abstract summary: COVID-19 has affected the world economy and the daily life routine of almost everyone.
Social media platforms enable users to share information with other users who can reshare this information.
We propose two CNN and RNN based models and evaluate the performance of these models on a publicly available TweetsCOV19 dataset.
- Score: 6.658785818853953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: COVID-19 has affected the world economy and the daily life routine of almost
everyone. It has been a hot topic on social media platforms such as Twitter,
Facebook, etc. These social media platforms enable users to share information
with other users who can reshare this information, thus causing this
information to spread. Twitter's retweet functionality allows users to share
the existing content with other users without altering the original content.
Analysis of social media platforms can help detect emergencies during
pandemics and so prompt preventive measures. One such type of analysis is
predicting the number of retweets for a given COVID-19 related tweet. Recently,
CIKM organized a retweet prediction challenge for COVID-19 tweets focusing on
using numeric features only. However, we hypothesize that tweet text may play a
vital role in accurate retweet prediction. In this paper, we combine numeric
and text features for COVID-19 related retweet predictions. For this purpose,
we propose two CNN and RNN based models and evaluate the performance of these
models on a publicly available TweetsCOV19 dataset using seven different
evaluation metrics. Our evaluation results show that combining tweet text with
numeric features improves the performance of retweet prediction significantly.
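
The idea of combining tweet text with numeric features can be illustrated with a minimal sketch. This is not the authors' model: the abstract only states that the models are CNN and RNN based, so a hashed bag-of-words vector stands in here for a learned text encoder. The feature names (`followers`, `friends`, `favorites`), the vector size, and the choice of mean squared logarithmic error as an example metric are all assumptions made for illustration.

```python
import math
import zlib


def text_features(text, dim=16):
    """Hashed bag-of-words vector (a stand-in for a learned CNN/RNN text encoder)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 gives a deterministic bucket index for each token
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vec


def numeric_features(followers, friends, favorites):
    """Log-scale count features, which are typically heavy-tailed (assumed feature set)."""
    return [math.log1p(followers), math.log1p(friends), math.log1p(favorites)]


def fuse(text, followers, friends, favorites, dim=16):
    """Concatenate the text vector and the numeric vector into one model input."""
    return text_features(text, dim) + numeric_features(followers, friends, favorites)


def msle(y_true, y_pred):
    """Mean squared logarithmic error between true and predicted retweet counts."""
    total = sum((math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred))
    return total / len(y_true)


example = fuse("Wear a mask and stay home", followers=1200, friends=300, favorites=15)
print(len(example))  # 16 text dimensions + 3 numeric dimensions
```

In a real model the fused vector would feed a regression head that predicts the retweet count; a log-scale error such as `msle` is one plausible way to score such predictions, since retweet counts span several orders of magnitude.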
Related papers
- Decoding the Silent Majority: Inducing Belief Augmented Social Graph
with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existent social network, along with graph-based propagation to capture social dynamics.
Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z) - Context-Based Tweet Engagement Prediction [0.0]
This thesis investigates how well context alone may be used to predict tweet engagement likelihood.
We employed the Spark engine on TU Wien's Little Big Data Cluster to create scalable data preprocessing, feature engineering, feature selection, and machine learning pipelines.
We also found that factors such as the prediction algorithm, training dataset size, training dataset sampling method, and feature selection significantly affect the results.
arXiv Detail & Related papers (2023-09-28T08:36:57Z) - ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z) - Retweet-BERT: Political Leaning Detection Using Language Features and
Information Diffusion on Social Networks [30.143148646797265]
We introduce Retweet-BERT, a simple and scalable model to estimate the political leanings of Twitter users.
Our assumptions stem from patterns of network and linguistic homophily among people who share similar ideologies.
arXiv Detail & Related papers (2022-07-18T02:18:20Z) - ViralBERT: A User Focused BERT-Based Approach to Virality Prediction [11.992815669875924]
We propose ViralBERT, which can be used to predict the virality of tweets using content- and user-based features.
We employ a method of concatenating numerical features such as hashtags and follower numbers to tweet text, and utilise two BERT modules.
We collect a dataset of 330k tweets to train ViralBERT and validate the efficacy of our model using baselines from current studies in this field.
arXiv Detail & Related papers (2022-05-17T21:40:24Z) - Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z) - Identification of Twitter Bots based on an Explainable ML Framework: the
US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z) - News consumption and social media regulations policy [70.31753171707005]
We analyze two social media platforms that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab results in the tendency of the user to engage with both types of content, showing a slight preference for the questionable ones which may account for a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z) - CML-COVID: A Large-Scale COVID-19 Twitter Dataset with Latent Topics,
Sentiment and Location Information [0.0]
CML-COVID is a COVID-19 Twitter data set of 19,298,967 tweets from 5,977,653 unique individuals.
These tweets were collected between March 2020 and July 2020 using the query terms coronavirus, covid and mask related to COVID-19.
arXiv Detail & Related papers (2021-01-28T18:59:10Z) - Exploratory Analysis of Covid-19 Tweets using Topic Modeling, UMAP, and
DiGraphs [36.33347149799959]
This paper illustrates five different techniques to assess the distinctiveness of topics, key terms and features, speed of information dissemination, and network behaviors for Covid19 tweets.
One topic specific to U.S. cases would start to uptick immediately after live White House Coronavirus Task Force briefings.
One of the simplest highlights of this analysis is that early-stage descriptive methods like regular expressions can successfully identify high-level themes.
arXiv Detail & Related papers (2020-05-06T19:16:38Z) - Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline [47.434392695347924]
The RecSys 2020 Challenge was organized by ACM RecSys in partnership with Twitter using this dataset.
This paper touches on the key challenges faced by researchers and professionals striving to predict user engagements.
arXiv Detail & Related papers (2020-04-28T23:54:33Z)