A deep dive into the consistently toxic 1% of Twitter
- URL: http://arxiv.org/abs/2202.07853v1
- Date: Wed, 16 Feb 2022 04:21:48 GMT
- Title: A deep dive into the consistently toxic 1% of Twitter
- Authors: Hina Qayyum, Benjamin Zi Hao Zhao, Ian D. Wood, Muhammad Ikram,
Mohamed Ali Kaafar, Nicolas Kourtellis
- Abstract summary: This study spans 14 years of tweets from 122K Twitter profiles and more than 293M tweets.
We selected the most extreme profiles in terms of consistency of toxic content and examined their tweet texts, and the domains, hashtags, and URLs they shared.
We found that these selected profiles keep to a narrow theme with lower diversity in hashtags, URLs, and domains, are thematically similar to each other, and have a high likelihood of bot-like behavior.
- Score: 9.669275987983447
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Misbehavior in online social networks (OSN) is an ever-growing phenomenon.
The research to date tends to focus on the deployment of machine learning to
identify and classify types of misbehavior such as bullying, aggression, and
racism to name a few. The main goal of identification is to curb natural and
mechanical misconduct and make OSNs a safer place for social discourse. Going
beyond past works, we perform a longitudinal study of a large selection of
Twitter profiles, which enables us to characterize profiles in terms of how
consistently they post highly toxic content. Our data spans 14 years of tweets
from 122K Twitter profiles and more than 293M tweets. From this data, we
selected the most extreme profiles in terms of consistency of toxic content and
examined their tweet texts, and the domains, hashtags, and URLs they shared. We
found that these selected profiles keep to a narrow theme with lower diversity
in hashtags, URLs, and domains, they are thematically similar to each other (in
a coordinated manner, if not through intent), and have a high likelihood of
bot-like behavior (likely to have progenitors with intentions to influence).
Our work contributes a substantial and longitudinal online misbehavior dataset
to the research community and establishes the consistency of a profile's toxic
behavior as a useful factor when exploring misbehavior as potential accessories
to influence operations on OSNs.
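The abstract's two central quantities, how consistently a profile posts toxic content and how diverse its hashtags, URLs, and domains are, can be illustrated with a small sketch. This is not the authors' method: Shannon entropy as the diversity measure, the 0.5 toxicity threshold, and the function names `shannon_diversity` and `toxic_consistency` are all assumptions made for illustration, and the per-tweet toxicity scores are assumed to come from some external classifier.

```python
import math
from collections import Counter


def shannon_diversity(items):
    """Shannon entropy (bits) of item frequencies; lower means a narrower theme.

    Assumption: entropy stands in for "diversity"; the paper may use another measure.
    """
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over distinct items
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def toxic_consistency(scores, threshold=0.5):
    """Fraction of a profile's tweets whose toxicity score exceeds the threshold.

    Assumption: scores lie in [0, 1] and 0.5 is a purely illustrative cut-off.
    """
    if not scores:
        return 0.0
    return sum(s > threshold for s in scores) / len(scores)


# Hypothetical profiles: one repeats a single hashtag, the other spreads across four.
narrow_profile = ["#a", "#a", "#a", "#a"]
broad_profile = ["#a", "#b", "#c", "#d"]

print(shannon_diversity(narrow_profile))
print(shannon_diversity(broad_profile))
print(toxic_consistency([0.9, 0.8, 0.1, 0.7]))
```

Under these assumptions, a profile that is both high in `toxic_consistency` and low in `shannon_diversity` would match the "consistently toxic, narrow theme" pattern the abstract describes.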
Related papers
- On mission Twitter Profiles: A Study of Selective Toxic Behavior [5.0157204307764625]
This study aims to characterize profiles potentially used for influence operations, termed 'on-mission profiles'.
Longitudinal data from 138K Twitter (or X) profiles and 293M tweets enables profiling based on theme diversity.
arXiv Detail & Related papers (2024-01-25T15:42:36Z) - Understanding writing style in social media with a supervised
contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z) - A longitudinal study of the top 1% toxic Twitter profiles [9.669275987983447]
We study 143K Twitter profiles and focus on the behavior of the top 1 percent producers of toxic content on Twitter.
With a total of 293M tweets, spanning 16 years of activity, the longitudinal data allow us to reconstruct the timelines of all profiles involved.
We find that the highly toxic profiles post coherent and well-articulated content, and their tweets keep to a narrow theme with lower diversity in hashtags, URLs, and domains.
arXiv Detail & Related papers (2023-03-26T01:55:28Z) - Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z) - Identification of Twitter Bots based on an Explainable ML Framework: the
US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z) - News consumption and social media regulations policy [70.31753171707005]
We analyze two social media that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab results in the tendency of the user to engage with both types of content, showing a slight preference for the questionable ones which may account for a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z) - High-level Approaches to Detect Malicious Political Activity on Twitter [0.0]
We investigate a data snapshot taken in May 2020, with around 5 million accounts and over 120 million tweets.
The analyzed time period stretches from August 2019 to May 2020, with a focus on the Portuguese elections of October 6th, 2019.
We learn that Twitter's suspension patterns are not adequate to the type of political trolling found in the Portuguese Twittersphere.
arXiv Detail & Related papers (2021-02-04T22:54:44Z) - Misleading Repurposing on Twitter [3.0254442724635173]
We present the first in-depth and large-scale study of misleading repurposing.
A malicious user changes the identity of their social media account via, among other things, changes to the profile attributes in order to use the account for a new purpose while retaining their followers.
We propose a definition for the behavior and a methodology that uses supervised learning on data mined from the Internet Archive's Twitter Stream Grab to flag repurposed accounts.
arXiv Detail & Related papers (2020-10-20T20:19:01Z) - ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter [5.723363140737726]
This paper provides a dataset of toxic social media interactions between confirmed high school students, called ALONE (AdoLescents ON twittEr).
Nearly 66% of internet users have observed online harassment, and 41% claim personal experience, with 18% facing severe forms of online harassment.
Our observations show that individual tweets do not provide sufficient evidence for toxic behavior, and meaningful use of context in interactions can enable highlighting or exonerating tweets with purported toxicity.
arXiv Detail & Related papers (2020-08-14T17:02:55Z) - Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media
during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z) - Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline [47.434392695347924]
The RecSys 2020 Challenge was organized by ACM RecSys in partnership with Twitter using this dataset.
This paper touches on the key challenges faced by researchers and professionals striving to predict user engagements.
arXiv Detail & Related papers (2020-04-28T23:54:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.