Investigating the heterogeneous effects of a massive content moderation intervention via Difference-in-Differences
- URL: http://arxiv.org/abs/2411.04037v3
- Date: Mon, 02 Dec 2024 12:51:48 GMT
- Title: Investigating the heterogeneous effects of a massive content moderation intervention via Difference-in-Differences
- Authors: Lorenzo Cima, Benedetta Tessa, Stefano Cresci, Amaury Trujillo, Marco Avvenuti
- Abstract summary: We apply a causal inference approach to shed light on the effectiveness of The Great Ban. We analyze 53M comments shared by nearly 34K users.
- Score: 0.6918368994425961
- Abstract: In today's online environments, users encounter harm and abuse on a daily basis. Therefore, content moderation is crucial to ensure their safety and well-being. However, the effectiveness of many moderation interventions is still uncertain. Here, we apply a causal inference approach to shed light on the effectiveness of The Great Ban, a massive social media deplatforming intervention. We analyze 53M comments shared by nearly 34K users, providing in-depth results on both the intended and unintended consequences of the ban. Our causal analyses reveal that 15.6% of the moderated users abandoned the platform, while the remaining users decreased their overall toxicity by 4.1%. Nonetheless, a subset of those users increased their toxicity by 70% after the intervention. However, the increases in toxicity did not lead to marked increases in activity or engagement, meaning that the most toxic users had an overall limited impact. Our findings provide new insights into the effectiveness of deplatforming interventions and contribute to informing future content moderation strategies.
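As a rough illustration of the Difference-in-Differences (DiD) design named in the title and abstract, the sketch below estimates a treatment effect from toy panel data. The column names, toy values, and two-period setup are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal DiD sketch: two groups (banned vs. control users), two periods
# (before/after the intervention). All data here are toy values.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "toxicity": [0.30, 0.28, 0.32, 0.25, 0.31, 0.29, 0.33, 0.27],
    "treated":  [1, 1, 1, 1, 0, 0, 0, 0],  # 1 = user hit by the ban
    "post":     [0, 1, 0, 1, 0, 1, 0, 1],  # 1 = after the intervention
})

# The coefficient on treated:post is the DiD estimate of the ban's
# effect on the outcome, valid under the parallel-trends assumption.
model = smf.ols("toxicity ~ treated + post + treated:post", data=df).fit()
print(model.params["treated:post"])
```

Under the parallel-trends assumption, the interaction coefficient isolates the change in the treated group's outcome beyond the change that the control group also experienced.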
Related papers
- MisinfoEval: Generative AI in the Era of "Alternative Facts" [50.069577397751175]
We introduce a framework for generating and evaluating large language model (LLM) based misinformation interventions.
We present (1) an experiment with a simulated social media environment to measure the effectiveness of misinformation interventions, and (2) a second experiment with personalized explanations tailored to the demographics and beliefs of users.
Our findings confirm that LLM-based interventions are highly effective at correcting user behavior.
arXiv Detail & Related papers (2024-10-13T18:16:50Z)
- Beyond Trial-and-Error: Predicting User Abandonment After a Moderation Intervention [0.6918368994425961]
We propose and tackle the novel task of predicting the effect of a moderation intervention on Reddit.
We use a dataset of 13.8M posts to compute a set of 142 features, which convey information about the activity, toxicity, relations, and writing style of the users.
Our results demonstrate the feasibility of predicting the effects of a moderation intervention, paving the way for a new research direction in predictive content moderation.
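A hedged sketch of how such a prediction task might be set up: per-user features computed from pre-intervention activity feed a standard classifier. The four features and the random labels below are placeholders for illustration; the paper's 142 features and model choice are not reproduced here.

```python
# Illustrative setup for predicting post-intervention user abandonment.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_users = 500

# Hypothetical user-level features covering the four groups the paper
# mentions: activity, toxicity, relations, and writing style.
X = np.column_stack([
    rng.poisson(20, n_users),        # posts per month (activity)
    rng.beta(2, 8, n_users),         # mean toxicity score
    rng.poisson(5, n_users),         # distinct interlocutors (relations)
    rng.normal(80, 15, n_users),     # mean comment length (writing style)
])
y = rng.integers(0, 2, n_users)      # 1 = user abandoned the platform

clf = GradientBoostingClassifier()
print("AUC:", cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```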
arXiv Detail & Related papers (2024-04-23T08:52:41Z)
- The Great Ban: Efficacy and Unintended Consequences of a Massive Deplatforming Operation on Reddit [0.7422344184734279]
We assess the effectiveness of The Great Ban, a massive deplatforming operation that affected nearly 2,000 communities on Reddit.
By analyzing 16M comments posted by 17K users over 14 months, we provide nuanced results on the effects, both desired and otherwise.
arXiv Detail & Related papers (2024-01-20T15:21:37Z)
- Unveiling the Implicit Toxicity in Large Language Models [77.90933074675543]
The open-endedness of large language models (LLMs), combined with their impressive capabilities, may lead to new safety issues when they are exploited for malicious use.
We show that LLMs can generate diverse implicit toxic outputs that are exceptionally difficult to detect via simple zero-shot prompting.
We propose a reinforcement learning (RL) based attacking method to further induce implicit toxicity in LLMs.
arXiv Detail & Related papers (2023-11-29T06:42:36Z)
- Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existing social network, along with graph-based propagation to capture social dynamics.
Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
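As a loose illustration of graph-based belief propagation (not the SocialSense implementation, whose graph construction relies on LLM-induced beliefs), the sketch below diffuses a scalar belief over a toy social network:

```python
# Toy belief propagation over a social graph. The graph, seed belief, and
# lazy-averaging update rule are all illustrative assumptions.
import networkx as nx

G = nx.karate_club_graph()        # stand-in for a real social network
belief = {n: 0.0 for n in G}
belief[0] = 1.0                   # a single seed node holds a belief

# Each round, a node averages its own belief with its neighbors' beliefs.
for _ in range(5):
    belief = {
        n: (belief[n] + sum(belief[m] for m in G[n])) / (1 + G.degree(n))
        for n in G
    }

# Nodes closest to the seed end up with the strongest induced belief.
print(sorted(belief.items(), key=lambda kv: -kv[1])[:5])
```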
arXiv Detail & Related papers (2023-10-20T06:17:02Z)
- SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration [75.62448812759968]
SQuARe is a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses.
It was constructed by leveraging HyperCLOVA in a human-in-the-loop manner, based on real news headlines.
arXiv Detail & Related papers (2023-05-28T11:51:20Z)
- One of Many: Assessing User-level Effects of Moderation Interventions on r/The_Donald [1.1041211464412573]
We evaluate the user-level effects of the sequence of moderation interventions that targeted r/The_Donald on Reddit.
We find that interventions having strong community-level effects also cause extreme and diversified user-level reactions.
Our results highlight that platform- and community-level effects are not always representative of the underlying behavior of individuals or smaller user groups.
arXiv Detail & Related papers (2022-09-19T07:46:18Z)
- Make Reddit Great Again: Assessing Community Effects of Moderation Interventions on r/The_Donald [1.1041211464412573]
r/The_Donald was repeatedly denounced as a toxic and misbehaving online community.
It was quarantined in June 2019, restricted in February 2020, and finally banned in June 2020, but the effects of this sequence of interventions are still unclear.
We find that the interventions greatly reduced the activity of problematic users.
However, the interventions also caused an increase in toxicity and led users to share more polarized and less factual news.
arXiv Detail & Related papers (2022-01-17T15:09:51Z)
- News consumption and social media regulations policy [70.31753171707005]
We analyze two social media platforms that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab leads users to engage with both types of content, with a slight preference for questionable content that may reflect dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z)
- Do Platform Migrations Compromise Content Moderation? Evidence from r/The_Donald and r/Incels [20.41491269475746]
We report the results of a large-scale observational study of how problematic online communities progress following community-level moderation measures.
Our results suggest that, in both cases, moderation measures significantly decreased posting activity on the new platform.
In spite of that, users in one of the studied communities showed increases in signals associated with toxicity and radicalization.
arXiv Detail & Related papers (2020-10-20T16:03:06Z)
- Information Consumption and Social Response in a Segregated Environment: the Case of Gab [74.5095691235917]
This work provides a characterization of the interaction patterns within Gab around the COVID-19 topic.
We find that there are no strong statistical differences in the social response to questionable and reliable content.
Our results provide insights into coordinated inauthentic behavior and the early warning of information operations.
arXiv Detail & Related papers (2020-06-03T11:34:25Z)