The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content
- URL: http://arxiv.org/abs/2405.11030v1
- Date: Fri, 17 May 2024 18:05:13 GMT
- Title: The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content
- Authors: Xinyu Wang, Sai Koneru, Pranav Narayanan Venkit, Brett Frischmann, Sarah Rajtmajer
- Abstract summary: This paper examines the role of intent in content moderation systems.
We review state-of-the-art detection models and benchmark training datasets for online abuse to assess their awareness of intent and their ability to capture it.
- Score: 2.2618341648062477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As social media has become a predominant mode of communication globally, the rise of abusive content threatens to undermine civil discourse. Recognizing the critical nature of this issue, a significant body of research has been dedicated to developing language models that can detect various types of online abuse, e.g., hate speech and cyberbullying. However, there exists a notable disconnect between platform policies, which often consider the author's intention as a criterion for content moderation, and the current capabilities of detection models, which typically lack efforts to capture intent. This paper examines the role of intent in content moderation systems. We review state-of-the-art detection models and benchmark training datasets for online abuse to assess their awareness of intent and their ability to capture it. We propose strategic changes to the design and development of automated detection and moderation systems to improve alignment with ethical and policy conceptualizations of abuse.
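As a rough illustration of the gap the paper identifies, the sketch below shows one way a moderation pipeline might weigh an estimated intent signal alongside an abusiveness score, as platform policies suggest. This is a minimal sketch under assumed placeholder scorers and thresholds, not the paper's method.

```python
# A minimal sketch (assumed placeholders, not the paper's method) of a
# moderation pipeline that weighs author intent alongside abusiveness.
from dataclasses import dataclass

@dataclass
class ModerationDecision:
    abuse_score: float   # probability the text is abusive (0..1)
    intent_score: float  # estimated probability of harmful intent (0..1)
    action: str          # "remove", "review", or "allow"

def score_abuse(text: str) -> float:
    """Placeholder for a trained abuse classifier (e.g., a fine-tuned LM)."""
    lexicon = {"idiot", "trash", "hate"}  # toy lexicon, illustration only
    hits = sum(tok.strip(".,!?").lower() in lexicon for tok in text.split())
    return min(1.0, hits / 3)

def score_intent(text: str) -> float:
    """Placeholder for an intent model; a real one would need context,
    conversation history, and pragmatic cues, as the paper argues."""
    targeting = {"you", "your"}  # second-person targeting as a crude proxy
    return 0.8 if any(tok.lower() in targeting for tok in text.split()) else 0.2

def moderate(text: str) -> ModerationDecision:
    abuse, intent = score_abuse(text), score_intent(text)
    if abuse > 0.6 and intent > 0.5:
        action = "remove"   # abusive content with apparent harmful intent
    elif abuse > 0.6:
        action = "review"   # abusive surface form, but intent is unclear
    else:
        action = "allow"
    return ModerationDecision(abuse, intent, action)

print(moderate("You are trash and everyone should hate you."))
```

The point of the structure is that the same abusive surface form can route to different actions depending on the intent estimate, mirroring policy language that treats intent as a moderation criterion.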
Related papers
- Demarked: A Strategy for Enhanced Abusive Speech Moderation through Counterspeech, Detoxification, and Message Management [71.99446449877038]
We propose a more comprehensive approach called Demarcation, which scores abusive speech on four aspects: (i) severity scale; (ii) presence of a target; (iii) context scale; (iv) legal scale.
Our work aims to inform future strategies for effectively addressing abusive speech online.
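As a rough sketch, the four aspects could be carried through a pipeline in a structure like the one below; the value ranges and the action rule are illustrative assumptions, not the paper's specification.

```python
# A minimal sketch of the four Demarcation aspects as a data structure;
# the value ranges and the routing rule below are assumptions.
from dataclasses import dataclass

@dataclass
class DemarcationScore:
    severity: int     # (i) severity scale, assumed 0 (none) .. 3 (extreme)
    has_target: bool  # (ii) presence of a target (person or group)
    context: int      # (iii) context scale, assumed 0 (benign) .. 3 (hostile)
    legal: int        # (iv) legal scale, assumed 0 (lawful) .. 3 (unlawful)

    def recommended_action(self) -> str:
        # Illustrative routing toward the paper's named interventions.
        if self.legal >= 2:
            return "escalate for legal review"
        if self.severity >= 2 and self.has_target:
            return "counterspeech or detoxification"
        return "message management (warn or annotate)"

print(DemarcationScore(severity=2, has_target=True, context=2, legal=0).recommended_action())
```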
arXiv Detail & Related papers (2024-06-27T21:45:33Z)
- Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models [52.24001776263608]
This comprehensive survey delves into recent strides in hate speech (HS) moderation.
We highlight the burgeoning role of large language models (LLMs) and large multimodal models (LMMs).
We identify existing gaps in research, particularly in the context of underrepresented languages and cultures.
arXiv Detail & Related papers (2024-01-30T03:51:44Z)
- Can Language Model Moderators Improve the Health of Online Discourse? [26.191337231826246]
We establish a systematic definition of conversational moderation effectiveness grounded in the moderation literature.
We propose a comprehensive evaluation framework to assess models' moderation capabilities independently of human intervention.
arXiv Detail & Related papers (2023-11-16T11:14:22Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases, insignificant changes to the input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
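The kind of fragility BODEGA benchmarks can be shown with a toy example: a single look-alike character substitution flips the decision of a naive keyword-based victim model. The attack and classifier below are illustrative assumptions in the same spirit, not BODEGA's actual components.

```python
# An illustrative attack in BODEGA's spirit (not one of its actual methods):
# one look-alike character substitution fools a naive victim classifier.

def naive_credibility_classifier(text: str) -> str:
    """Toy victim model: flags texts containing known misinformation cues."""
    cues = {"miracle", "cure", "hoax"}
    tokens = {tok.strip(".,!?").lower() for tok in text.split()}
    return "low-credibility" if tokens & cues else "credible"

def perturb(text: str) -> str:
    """Swap the first Latin 'o' for the visually identical Cyrillic 'о' (U+043E)."""
    return text.replace("o", "\u043e", 1)

original = "New hoax spreads online"
attacked = perturb(original)
print(naive_credibility_classifier(original))  # low-credibility
print(naive_credibility_classifier(attacked))  # credible: the model is fooled
```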
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Qualitative Analysis of a Graph Transformer Approach to Addressing Hate Speech: Adapting to Dynamically Changing Content [8.393770595114763]
We offer a detailed qualitative analysis of this solution for hate speech detection in social networks.
A key insight is that the focus on reasoning about context positions us well to support multi-modal analysis of online posts.
We conclude with a reflection on how the problem we are addressing relates especially well to the theme of dynamic change.
arXiv Detail & Related papers (2023-01-25T23:32:32Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
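As a toy illustration of the simulate-and-detect idea, the sketch below camouflages a blocked keyword with look-alike symbols and then normalizes text back before matching; the character map is an assumed, much simpler stand-in for the article's multilingual tools.

```python
# A toy simulate-and-detect pair for word camouflage; this character map is
# an assumption, far simpler than the article's multilingual tools.

CAMOUFLAGE = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "$"}
REVERSE = {v: k for k, v in CAMOUFLAGE.items()}

def camouflage(word: str) -> str:
    """Simulation side: replace letters with look-alike symbols."""
    return "".join(CAMOUFLAGE.get(ch, ch) for ch in word.lower())

def normalize(text: str) -> str:
    """Detection side: map camouflaged characters back before matching."""
    return "".join(REVERSE.get(ch, ch) for ch in text.lower())

blocked = {"scam"}
evasive = camouflage("scam")           # "$c4m"
print(evasive in blocked)              # False: evades a naive keyword filter
print(normalize(evasive) in blocked)   # True: caught after normalization
```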
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Explainable Abuse Detection as Intent Classification and Slot Filling [66.80201541759409]
We introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone.
We show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.
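A minimal sketch of what such policy-aware output might look like: an intent label plus filled slots, from which a human-readable rationale can be generated. The labels, lexicon, and rule-based stand-in below are illustrative assumptions, not the paper's learned architecture.

```python
# A minimal sketch of intent-classification + slot-filling output for
# policy-aware abuse detection; labels, lexicon, and the rule-based
# stand-in are illustrative assumptions, not the paper's architecture.
from dataclasses import dataclass, field

@dataclass
class AbuseAnalysis:
    intent: str                                 # e.g., "dehumanization", "none"
    slots: dict = field(default_factory=dict)   # e.g., {"target": ..., "comparison": ...}

    def rationale(self) -> str:
        """The filled slots double as a human-readable model rationale."""
        if self.intent == "none":
            return "No policy-relevant intent detected."
        filled = ", ".join(f"{k}='{v}'" for k, v in self.slots.items())
        return f"Flagged as '{self.intent}' because slots were filled: {filled}."

def analyze(text: str) -> AbuseAnalysis:
    """Rule-based stand-in; a real system would use a trained tagger."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    groups = {"immigrants", "refugees", "women"}  # toy protected-group lexicon
    target = next((t for t in tokens if t in groups), None)
    if target and "vermin" in tokens:
        return AbuseAnalysis("dehumanization", {"target": target, "comparison": "vermin"})
    return AbuseAnalysis("none")

print(analyze("Immigrants are vermin.").rationale())
```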
arXiv Detail & Related papers (2022-10-06T03:33:30Z)
- Aggression and "hate speech" in communication of media users: analysis of control capabilities [50.591267188664666]
The authors studied how users influence one another in new media.
They found high levels of aggression and hate speech in discussions of an urgent social problem: measures to fight COVID-19.
The results can inform the development of media content in the modern digital environment.
arXiv Detail & Related papers (2022-08-25T15:53:32Z)
- SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice [9.356143195807064]
We study the content moderation guidelines and practices of the 14 most popular social media platforms in the US.
We identify differences between the content moderation employed by mainstream social media platforms and that employed by fringe platforms.
We highlight why platforms should shift from a one-size-fits-all model to a more inclusive model.
arXiv Detail & Related papers (2022-06-29T18:48:04Z)
- Towards Ethics by Design in Online Abusive Content Detection [7.163723138100273]
The research effort has spread across several closely related sub-areas, such as the detection of hate speech, toxicity, and cyberbullying.
We bring ethical issues to the forefront and propose a unified framework as a two-step process.
The novel framework is guided by the Ethics by Design principle and is a step towards building more accurate and trusted models.
arXiv Detail & Related papers (2020-10-28T13:10:24Z)