Related papers: The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content

URL: http://arxiv.org/abs/2405.11030v1
Date: Fri, 17 May 2024 18:05:13 GMT
Title: The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content
Authors: Xinyu Wang, Sai Koneru, Pranav Narayanan Venkit, Brett Frischmann, Sarah Rajtmajer
Abstract summary: This paper examines the role of intent in content moderation systems. We review state of the art detection models and benchmark training datasets for online abuse to assess their awareness and ability to capture intent.
Score: 2.2618341648062477
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As social media has become a predominant mode of communication globally, the rise of abusive content threatens to undermine civil discourse. Recognizing the critical nature of this issue, a significant body of research has been dedicated to developing language models that can detect various types of online abuse, e.g., hate speech, cyberbullying. However, there exists a notable disconnect between platform policies, which often consider the author's intention as a criterion for content moderation, and the current capabilities of detection models, which typically lack efforts to capture intent. This paper examines the role of intent in content moderation systems. We review state of the art detection models and benchmark training datasets for online abuse to assess their awareness and ability to capture intent. We propose strategic changes to the design and development of automated detection and moderation systems to improve alignment with ethical and policy conceptualizations of abuse.

Related papers

Towards a comprehensive taxonomy of online abusive language informed by machine leaning [0.0]
This paper presents a taxonomy for distinguishing key characteristics of abusive language within online text. It classifies various facets of online abuse, including context, target, intensity, directness, and theme of abuse.
arXiv Detail & Related papers (2025-04-24T15:23:47Z)
Policy-as-Prompt: Rethinking Content Moderation in the Age of Large Language Models [10.549072684871478]
This paper formalises the emerging policy-as-prompt framework and identifies five key challenges across four domains. It lays the groundwork for future exploration of scalable and adaptive content moderation systems in digital ecosystems.
arXiv Detail & Related papers (2025-02-25T23:15:16Z)
A survey of textual cyber abuse detection using cutting-edge language models and large language models [0.0]
We present a comprehensive analysis of the different forms of abuse prevalent in social media. We focus on how emerging technologies, such as Language Models (LMs) and Large Language Models (LLMs) are reshaping both the detection and generation of abusive content.
arXiv Detail & Related papers (2025-01-09T18:55:50Z)
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice [186.055899073629]
Unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs. Both of these goals--the targeted removal of information from a model and the targeted suppression of information from a model's outputs--present various technical and substantive challenges.
arXiv Detail & Related papers (2024-12-09T20:18:43Z)
A Survey of Stance Detection on Social Media: New Directions and Perspectives [50.27382951812502]
Stance detection has emerged as a crucial subfield within affective computing. Recent years have seen a surge of research interest in developing effective stance detection methods. This paper provides a comprehensive survey of stance detection techniques on social media.
arXiv Detail & Related papers (2024-09-24T03:06:25Z)
Demarked: A Strategy for Enhanced Abusive Speech Moderation through Counterspeech, Detoxification, and Message Management [71.99446449877038]
We propose a more comprehensive approach called Demarcation, which scores abusive speech based on four aspects: (i) severity scale; (ii) presence of a target; (iii) context scale; (iv) legal scale. Our work aims to inform future strategies for effectively addressing abusive speech online.
arXiv Detail & Related papers (2024-06-27T21:45:33Z)
Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models [52.24001776263608]
This comprehensive survey delves into the recent strides in hate speech (HS) moderation. We highlight the burgeoning role of large language models (LLMs) and large multimodal models (LMMs). We identify existing gaps in research, particularly in the context of underrepresented languages and cultures.
arXiv Detail & Related papers (2024-01-30T03:51:44Z)
Can Language Model Moderators Improve the Health of Online Discourse? [26.191337231826246]
We establish a systematic definition of conversational moderation effectiveness grounded on moderation literature. We propose a comprehensive evaluation framework to assess models' moderation capabilities independently of human intervention.
arXiv Detail & Related papers (2023-11-16T11:14:22Z)
Qualitative Analysis of a Graph Transformer Approach to Addressing Hate Speech: Adapting to Dynamically Changing Content [8.393770595114763]
We offer a detailed qualitative analysis of this solution for hate speech detection in social networks. A key insight is that the focus on reasoning about the concept of context positions us well to be able to support multi-modal analysis of online posts. We conclude with a reflection on how the problem we are addressing relates especially well to the theme of dynamic change.
arXiv Detail & Related papers (2023-01-25T23:32:32Z)
Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems. This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
Explainable Abuse Detection as Intent Classification and Slot Filling [66.80201541759409]
We introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone. We show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.
arXiv Detail & Related papers (2022-10-06T03:33:30Z)
Aggression and "hate speech" in communication of media users: analysis of control capabilities [50.591267188664666]
The authors studied the possibilities of mutual influence among users in new media. They found a high level of aggression and hate speech in discussions of an urgent social problem: measures for fighting COVID-19. The results can be useful for developing media content in a modern digital environment.
arXiv Detail & Related papers (2022-08-25T15:53:32Z)
SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice [9.356143195807064]
We study the 14 most popular social media content moderation guidelines and practices in the US. We identify the differences between the content moderation employed in mainstream social media platforms compared to fringe platforms. We highlight why platforms should shift from a one-size-fits-all model to a more inclusive model.
arXiv Detail & Related papers (2022-06-29T18:48:04Z)
Towards Ethics by Design in Online Abusive Content Detection [7.163723138100273]
The research effort has spread out across several closely related sub-areas, such as detection of hate speech, toxicity, cyberbullying, etc. We bring ethical issues to the forefront and propose a unified framework as a two-step process. The novel framework is guided by the Ethics by Design principle and is a step towards building more accurate and trusted models.
arXiv Detail & Related papers (2020-10-28T13:10:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.