A longitudinal study of the top 1% toxic Twitter profiles
- URL: http://arxiv.org/abs/2303.14603v1
- Date: Sun, 26 Mar 2023 01:55:28 GMT
- Title: A longitudinal study of the top 1% toxic Twitter profiles
- Authors: Hina Qayyum, Benjamin Zi Hao Zhao, Ian D. Wood, Muhammad Ikram,
Mohamed Ali Kaafar, Nicolas Kourtellis
- Abstract summary: We study 143K Twitter profiles and focus on the behavior of the top 1 percent producers of toxic content on Twitter.
With a total of 293M tweets, spanning 16 years of activity, the longitudinal data allow us to reconstruct the timelines of all profiles involved.
We find that the highly toxic profiles post coherent and well-articulated content, and that their tweets keep to a narrow theme with lower diversity in hashtags, URLs, and domains.
- Score: 9.669275987983447
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Toxicity is endemic to online social networks, including Twitter. It follows a
Pareto-like distribution in which most of the toxicity is generated by a very
small number of profiles, so analyzing and characterizing these toxic
profiles is critical. Prior research has largely focused on sporadic,
event-centric toxic content to characterize toxicity on the platform. Instead, we
approach the problem of characterizing toxic content from a profile-centric
point of view. We study 143K Twitter profiles and focus on the behavior of the
top 1 percent of producers of toxic content on Twitter, based on toxicity scores
of their tweets provided by the Perspective API. With a total of 293M tweets,
spanning 16 years of activity, the longitudinal data allow us to reconstruct
the timelines of all profiles involved. We use these timelines to gauge the
behavior of the most toxic Twitter profiles compared to the rest of the Twitter
population. We study the pattern of tweet posting from highly toxic accounts,
based on the frequency and how prolific they are, the nature of hashtags and
URLs, profile metadata, and Botometer scores. We find that the highly toxic
profiles post coherent and well-articulated content; their tweets keep to a
narrow theme with lower diversity in hashtags, URLs, and domains; they are
thematically similar to each other; and they have a high likelihood of bot-like
behavior and, based on high fake-follower scores, are likely to have progenitors
with intentions to influence. Our work contributes insight into the top 1 percent
of toxic profiles on Twitter and establishes that the profile-centric approach to
investigating toxicity on Twitter is beneficial.
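The profile-centric selection described in the abstract can be illustrated with a minimal sketch. The abstract does not say how per-tweet scores are aggregated per profile, so this example assumes a simple mean; the function name `top_toxic_profiles` and the input layout are hypothetical, and the toxicity scores are assumed to have been obtained beforehand (for example, from the Perspective API).

```python
import math
from collections import defaultdict


def top_toxic_profiles(tweet_scores, fraction=0.01):
    """Return the profile IDs in the top `fraction` by mean tweet toxicity.

    tweet_scores: iterable of (profile_id, toxicity) pairs, toxicity in [0, 1].
    Assumption: profiles are ranked by the mean of their tweet scores; the
    paper may use a different aggregation.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for profile_id, score in tweet_scores:
        totals[profile_id] += score
        counts[profile_id] += 1

    # Mean toxicity per profile.
    means = {p: totals[p] / counts[p] for p in totals}

    # Keep at least one profile, rounding the cutoff up.
    k = max(1, math.ceil(len(means) * fraction))

    # Sort by descending mean toxicity; break ties by profile ID.
    ranked = sorted(means, key=lambda p: (-means[p], p))
    return ranked[:k]


# Toy data: four profiles with made-up scores.
sample = [("a", 0.9), ("a", 0.7), ("b", 0.1), ("c", 0.5), ("d", 0.2), ("d", 0.4)]
top_half = top_toxic_profiles(sample, fraction=0.5)
top_default = top_toxic_profiles(sample)
```

With the toy data, `fraction=0.5` keeps the two profiles with the highest mean score, while the default 1% cutoff keeps a single profile because the cutoff is rounded up to at least one.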
Related papers
- Twits, Toxic Tweets, and Tribal Tendencies: Trends in Politically Polarized Posts on Twitter [5.161088104035108]
We explore the role that partisanship and affective polarization play in contributing to toxicity on an individual level and a topic level on Twitter/X.
After collecting 89.6 million tweets from 43,151 Twitter/X users, we determine how several account-level characteristics, including partisanship, predict how often users post toxic content.
arXiv Detail & Related papers (2023-07-19T17:24:47Z)
- Understanding the Bystander Effect on Toxic Twitter Conversations [1.1339580074756188]
We examine whether the toxicity of the first direct reply to a toxic tweet in conversations establishes the group norms for subsequent replies.
We analyze a random sample of more than 156k tweets belonging to 9k conversations.
arXiv Detail & Related papers (2022-11-19T18:31:39Z)
- User Engagement and the Toxicity of Tweets [1.1339580074756188]
We analyze a random sample of more than 85,300 Twitter conversations to examine differences between toxic and non-toxic conversations.
We find that toxic conversations, those with at least one toxic tweet, are longer but have fewer individual users contributing to the dialogue compared to the non-toxic conversations.
We also find a relationship between the toxicity of the first reply to a toxic tweet and the toxicity of the conversation.
arXiv Detail & Related papers (2022-11-07T20:55:22Z)
- Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z)
- A deep dive into the consistently toxic 1% of Twitter [9.669275987983447]
This study spans 14 years of tweets from 122K Twitter profiles and more than 293M tweets.
We selected the most extreme profiles in terms of consistency of toxic content and examined their tweet texts, and the domains, hashtags, and URLs they shared.
We found that these selected profiles keep to a narrow theme with lower diversity in hashtags, URLs, and domains, they are thematically similar to each other, and have a high likelihood of bot-like behavior.
arXiv Detail & Related papers (2022-02-16T04:21:48Z)
- Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- News consumption and social media regulations policy [70.31753171707005]
We analyze two social media platforms that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab results in the tendency of the user to engage with both types of content, showing a slight preference for the questionable ones which may account for a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z)
- Understanding the Hoarding Behaviors during the COVID-19 Pandemic using Large Scale Social Media Data [77.34726150561087]
We analyze the hoarding and anti-hoarding patterns of over 42,000 unique Twitter users in the United States from March 1 to April 30, 2020.
We find the percentage of females in both hoarding and anti-hoarding groups is higher than that of the general Twitter user population.
The LIWC anxiety mean for the hoarding-related tweets is significantly higher than the baseline Twitter anxiety mean.
arXiv Detail & Related papers (2020-10-15T16:02:25Z)
- ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter [5.723363140737726]
This paper provides a dataset of toxic social media interactions between confirmed high school students, called ALONE (AdoLescents ON twittEr).
Nearly 66% of internet users have observed online harassment, and 41% claim personal experience, with 18% facing severe forms of online harassment.
Our observations show that individual tweets do not provide sufficient evidence for toxic behavior, and meaningful use of context in interactions can enable highlighting or exonerating tweets with purported toxicity.
arXiv Detail & Related papers (2020-08-14T17:02:55Z)
- Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline [47.434392695347924]
The RecSys 2020 Challenge was organized by ACM RecSys in partnership with Twitter using this dataset.
This paper touches on the key challenges faced by researchers and professionals striving to predict user engagements.
arXiv Detail & Related papers (2020-04-28T23:54:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.