Automated Content Moderation Increases Adherence to Community Guidelines
- URL: http://arxiv.org/abs/2210.10454v3
- Date: Thu, 16 Feb 2023 05:32:31 GMT
- Title: Automated Content Moderation Increases Adherence to Community Guidelines
- Authors: Manoel Horta Ribeiro, Justin Cheng, Robert West
- Abstract summary: We used a fuzzy regression discontinuity design to measure the impact of automated content moderation on subsequent rule-breaking behavior.
We found that comment deletion decreased subsequent rule-breaking behavior in shorter threads.
Our results suggest that automated content moderation increases adherence to community guidelines.
- Score: 16.69856781183336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online social media platforms use automated moderation systems to remove or
reduce the visibility of rule-breaking content. While previous work has
documented the importance of manual content moderation, the effects of
automated content moderation remain largely unknown. Here, in a large study of
Facebook comments (n=412M), we used a fuzzy regression discontinuity design to
measure the impact of automated content moderation on subsequent rule-breaking
behavior (number of comments hidden/deleted) and engagement (number of
additional comments posted). We found that comment deletion decreased
subsequent rule-breaking behavior in shorter threads (20 or fewer comments),
even among other participants, suggesting that the intervention prevented
conversations from derailing. Further, the effect of deletion on the affected
user's subsequent rule-breaking behavior was longer-lived than its effect on
reducing commenting in general, suggesting that users were deterred from
rule-breaking but not from commenting. In contrast, hiding (rather than
deleting) content had small and statistically insignificant effects. Our
results suggest that automated content moderation increases adherence to
community guidelines.
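The fuzzy regression discontinuity design described above can be illustrated with a two-stage least squares (2SLS) estimate on synthetic data. This is a minimal sketch, not the paper's actual pipeline: the classifier score, cutoff, deletion probabilities, and the simulated true effect of -0.3 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Running variable: a hypothetical moderation-classifier score per comment.
score = rng.uniform(0.0, 1.0, n)
cutoff = 0.5
z = (score >= cutoff).astype(float)  # instrument: crossing the threshold

# "Fuzzy" design: crossing the cutoff raises the deletion probability
# but does not fully determine it (compliance is imperfect).
p_delete = 0.1 + 0.6 * z
deleted = rng.binomial(1, p_delete).astype(float)

# Outcome: subsequent rule-breaking, with an assumed causal effect of
# deletion equal to -0.3 plus a smooth trend in the running variable.
y = 1.0 + 0.5 * score - 0.3 * deleted + rng.normal(0.0, 0.5, n)

# 2SLS by hand.
# First stage: regress treatment on the instrument and running variable.
X1 = np.column_stack([np.ones(n), z, score])
d_hat = X1 @ np.linalg.lstsq(X1, deleted, rcond=None)[0]

# Second stage: regress outcome on predicted treatment and running variable.
X2 = np.column_stack([np.ones(n), d_hat, score])
beta = np.linalg.lstsq(X2, y, rcond=None)[0]
print(f"estimated effect of deletion: {beta[1]:.3f}")  # should be near -0.3
```

Because deletion is correlated with the comment's score (and hence with unobserved rule-breaking propensity), a naive regression of the outcome on `deleted` would be biased; instrumenting with the threshold crossing recovers the local causal effect for comments near the cutoff.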
Related papers
- Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster [72.84926097773578]
We investigate the effect of explanations on the speed of real-world moderators.
Our experiments show that while generic explanations do not affect their speed and are often ignored, structured explanations lower moderators' decision making time by 7.4%.
arXiv Detail & Related papers (2024-06-06T14:23:10Z)
- Text Attribute Control via Closed-Loop Disentanglement [72.2786244367634]
We propose a novel approach to achieve a robust control of attributes while enhancing content preservation.
In this paper, we use a semi-supervised contrastive learning method to encourage the disentanglement of attributes in latent spaces.
We conducted experiments on three text datasets, including the Yelp Service review dataset, the Amazon Product review dataset, and the GoEmotions dataset.
arXiv Detail & Related papers (2023-12-01T01:26:38Z)
- Impact of Stricter Content Moderation on Parler's Users' Discourse [1.7863534204867277]
We studied the moderation changes performed by Parler and their effect on the toxicity of its content.
Our quasi-experimental time series analysis indicates that after the change in Parler's moderation, severe forms of toxicity decreased immediately, and the decrease was sustained.
We found an increase in the factuality of the news sites being shared, as well as a decrease in the number of conspiracy or pseudoscience sources being shared.
arXiv Detail & Related papers (2023-10-13T04:09:39Z)
- Why Should This Article Be Deleted? Transparent Stance Detection in Multilingual Wikipedia Editor Discussions [47.944081120226905]
We construct a novel dataset of Wikipedia editor discussions along with their reasoning in three languages.
The dataset contains the stances of the editors (keep, delete, merge, comment), along with the stated reason, and a content moderation policy, for each edit decision.
We demonstrate that stance and corresponding reason (policy) can be predicted jointly with a high degree of accuracy, adding transparency to the decision-making process.
arXiv Detail & Related papers (2023-10-09T15:11:02Z)
- Analyzing Norm Violations in Live-Stream Chat [49.120561596550395]
We present the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms.
We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch.
Our results show that appropriate contextual information can boost moderation performance by 35%.
arXiv Detail & Related papers (2023-05-18T05:58:27Z)
- One of Many: Assessing User-level Effects of Moderation Interventions on r/The_Donald [1.1041211464412573]
We evaluate the user-level effects of the sequence of moderation interventions that targeted r/The_Donald on Reddit.
We find that interventions with strong community-level effects also cause extreme and diversified user-level reactions.
Our results highlight that platform- and community-level effects are not always representative of the underlying behavior of individuals or smaller user groups.
arXiv Detail & Related papers (2022-09-19T07:46:18Z)
- Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily, and that these deletions enable two abusive behaviors.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z) - Make Reddit Great Again: Assessing Community Effects of Moderation
Interventions on r/The_Donald [1.1041211464412573]
r/The_Donald was repeatedly denounced as a toxic and misbehaving online community.
It was quarantined in June 2019, restricted in February 2020, and finally banned in June 2020, but the effects of this sequence of interventions are still unclear.
We find that the interventions greatly reduced the activity of problematic users.
However, the interventions also caused an increase in toxicity and led users to share more polarized and less factual news.
arXiv Detail & Related papers (2022-01-17T15:09:51Z) - News consumption and social media regulations policy [70.31753171707005]
We analyze two social media that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab leads users to engage with both types of content, with a slight preference for questionable sources, which may reflect a dissing/endorsement dynamic.
arXiv Detail & Related papers (2021-06-07T19:26:32Z) - Do Platform Migrations Compromise Content Moderation? Evidence from
r/The_Donald and r/Incels [20.41491269475746]
We report the results of a large-scale observational study of how problematic online communities progress following community-level moderation measures.
Our results suggest that, in both cases, moderation measures significantly decreased posting activity on the new platform.
In spite of that, users in one of the studied communities showed increases in signals associated with toxicity and radicalization.
arXiv Detail & Related papers (2020-10-20T16:03:06Z) - Effects of algorithmic flagging on fairness: quasi-experimental evidence
from Wikipedia [9.885409727425433]
We analyze moderator behavior in Wikipedia as mediated by RCFilters, a system which displays social signals and algorithmic flags.
We show that algorithmically flagged edits are reverted more often, especially those by established editors with positive social signals.
Our results suggest that algorithmic flagging systems can lead to increased fairness in some contexts but that the relationship is complex and contingent.
arXiv Detail & Related papers (2020-06-04T20:25:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.