Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech
Diffusion on Twitter
- URL: http://arxiv.org/abs/2010.04377v1
- Date: Fri, 9 Oct 2020 05:43:08 GMT
- Title: Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech
Diffusion on Twitter
- Authors: Sarah Masud, Subhabrata Dutta, Sakshi Makkar, Chhavi Jain, Vikram
Goyal, Amitava Das, Tanmoy Chakraborty
- Abstract summary: We focus on exploring user behaviour, which triggers the genesis of hate speech on Twitter.
We crawl a large-scale dataset of tweets, retweets, user activity history, and follower networks.
We characterize different signals of information that govern these dynamics.
- Score: 24.94135874070525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online hate speech, particularly over microblogging platforms like Twitter,
has emerged as arguably the most severe issue of the past decade. Several
countries have reported a steep rise in hate crimes fueled by malicious
hate campaigns. While the detection of hate speech is one of the emerging
research areas, the generation and spread of topic-dependent hate in the
information network remain under-explored. In this work, we focus on exploring
user behaviour, which triggers the genesis of hate speech on Twitter and how it
diffuses via retweets. We crawl a large-scale dataset of tweets, retweets, user
activity history, and follower networks, comprising over 161 million tweets
from more than 41 million unique users. We also collect over 600k
contemporary news articles published online. We characterize different signals
of information that govern these dynamics. Our analyses differentiate the
diffusion dynamics in the presence of hate from usual information diffusion.
This motivates us to formulate the modelling problem in a topic-aware setting
with real-world knowledge. For predicting the initiation of hate speech for any
given hashtag, we propose multiple feature-rich models, with the best
performing one achieving a macro F1 score of 0.65. Meanwhile, to predict the
retweet dynamics on Twitter, we propose RETINA, a novel neural architecture
that incorporates exogenous influence using scaled dot-product attention.
RETINA achieves a macro F1-score of 0.85, outperforming multiple
state-of-the-art models. Our analysis reveals the superlative power of RETINA
to predict the retweet dynamics of hateful content compared to the existing
diffusion models.
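The abstract says RETINA incorporates exogenous influence through scaled dot-product attention but gives no further detail. As a rough illustration of that general mechanism only (not the authors' architecture), the sketch below computes softmax(QK^T / sqrt(d_k))V with NumPy. Treating tweet embeddings as queries and news-article embeddings as keys and values is an assumption made for this example, and the names `tweet_queries`, `news_keys`, and `news_values` are hypothetical.

```python
import numpy as np


def scaled_dot_product_attention(queries, keys, values):
    """Generic scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    queries: (n_q, d_k), keys: (n_k, d_k), values: (n_k, d_v).
    Returns the attended context (n_q, d_v) and the weights (n_q, n_k).
    """
    d_k = queries.shape[-1]
    # Similarity of every query to every key, scaled to keep the softmax stable.
    scores = queries @ keys.T / np.sqrt(d_k)
    # Subtract the row-wise maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights


# Hypothetical toy inputs: 2 tweet embeddings attending over 5 news embeddings.
rng = np.random.default_rng(0)
tweet_queries = rng.normal(size=(2, 8))
news_keys = rng.normal(size=(5, 8))
news_values = rng.normal(size=(5, 8))

context, weights = scaled_dot_product_attention(tweet_queries, news_keys, news_values)
```

Each row of `weights` sums to 1, so each row of `context` is a weighted average of the value vectors. In this toy setup that would be a per-tweet summary of the news items most similar to the tweet.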
Related papers
- ProvocationProbe: Instigating Hate Speech Dataset from Twitter [0.39052860539161904]
ProvocationProbe is a dataset designed to explore what distinguishes instigating hate speech from general hate speech.
For this study, we collected around twenty thousand tweets from Twitter, encompassing a total of nine global controversies.
arXiv Detail & Related papers (2024-10-25T16:57:59Z)
- Analyzing User Characteristics of Hate Speech Spreaders on Social Media [20.57872238271025]
We analyze the role of user characteristics in hate speech resharing across different types of hate speech.
We find that users with little social influence tend to share more hate speech.
Political anti-Trump and anti-right-wing hate is reshared by users with larger social influence.
arXiv Detail & Related papers (2023-10-24T12:17:48Z)
- Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment [26.504056750529124]
We present GOTHate, a large-scale code-mixed crowdsourced dataset of around 51k posts for hate speech detection from Twitter.
We benchmark it with 10 recent baselines and investigate how adding endogenous signals enhances the hate speech detection task.
Our solution HEN-mBERT is a modular, multilingual, mixture-of-experts model that enriches the linguistic subspace with latent endogenous signals.
arXiv Detail & Related papers (2023-06-01T19:36:52Z)
- Predicting Hate Intensity of Twitter Conversation Threads [26.190359413890537]
We propose DRAGNET++, which aims to predict the intensity of hatred that a tweet can bring in through its reply chain in the future.
It uses the semantic and propagation structure of tweet threads to maximize the contextual information behind the rise and fall of hate intensity at each subsequent tweet.
We show that DRAGNET++ outperforms all the state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-06-16T18:51:36Z)
- Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z)
- What goes on inside rumour and non-rumour tweets and their reactions: A Psycholinguistic Analyses [58.75684238003408]
Psycholinguistic analyses of social media text are vital for drawing meaningful conclusions to mitigate misinformation.
This research contributes by performing an in-depth psycholinguistic analysis of rumours related to various kinds of events.
arXiv Detail & Related papers (2021-11-09T07:45:11Z)
- Hate versus Politics: Detection of Hate against Policy makers in Italian tweets [0.6289422225292998]
This paper addresses the issue of classification of hate speech against policy makers from Twitter in Italian.
We collected and annotated 1264 tweets, examined the cases of disagreements between annotators, and performed in-domain and cross-domain hate speech classifications.
We achieved a performance of ROC AUC 0.83 and analyzed the most predictive attributes, also finding the different language features in the anti-policymakers and anti-immigration domains.
arXiv Detail & Related papers (2021-07-12T12:24:45Z)
- News consumption and social media regulations policy [70.31753171707005]
We analyze two social media platforms that enforce opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab leads users to engage with both types of content, with a slight preference for questionable content, which may reflect dissing or endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
- Detecting Perceived Emotions in Hurricane Disasters [62.760131661847986]
We introduce HurricaneEmo, an emotion dataset of 15,000 English tweets spanning three hurricanes: Harvey, Irma, and Maria.
We present a comprehensive study of fine-grained emotions and propose classification tasks to discriminate between coarse-grained emotion groups.
arXiv Detail & Related papers (2020-04-29T16:17:49Z)
- #MeToo on Campus: Studying College Sexual Assault at Scale Using Data Reported on Social Media [71.74529365205053]
We analyze the influence of the #MeToo trend on a pool of college followers.
The results show that the majority of topics embedded in those #MeToo tweets detail sexual harassment stories.
There exists a significant correlation between the prevalence of this trend and official reports on several major geographical regions.
arXiv Detail & Related papers (2020-01-16T18:05:46Z)