The User behind the Abuse: A Position on Ethics and Explainability
- URL: http://arxiv.org/abs/2103.17191v1
- Date: Wed, 31 Mar 2021 16:20:37 GMT
- Title: The User behind the Abuse: A Position on Ethics and Explainability
- Authors: Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova
- Abstract summary: We discuss the role that modeling of users and online communities plays in abuse detection.
We then explore the ethical challenges of incorporating user and community information.
We propose properties that an explainable method should aim to exhibit.
- Score: 25.791014642037585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abuse on the Internet is an important societal problem of our time. Millions
of Internet users face harassment, racism, personal attacks, and other types of
abuse across various platforms. The psychological effects of abuse on
individuals can be profound and lasting. Consequently, over the past few years,
there has been a substantial research effort towards automated abusive language
detection in the field of NLP. In this position paper, we discuss the role that
modeling of users and online communities plays in abuse detection.
Specifically, we review and analyze state-of-the-art methods that leverage
user or community information to enhance the understanding and detection of
abusive language. We then explore the ethical challenges of incorporating user
and community information, laying out considerations to guide future research.
Finally, we address the topic of explainability in abusive language detection,
proposing properties that an explainable method should aim to exhibit. We
describe how user and community information can facilitate the realization of
these properties and discuss the effective operationalization of explainability
in view of the properties.
Related papers
- MisinfoEval: Generative AI in the Era of "Alternative Facts" [50.069577397751175]
We introduce a framework for generating and evaluating large language model (LLM) based misinformation interventions.
We present (1) an experiment with a simulated social media environment to measure effectiveness of misinformation interventions, and (2) a second experiment with personalized explanations tailored to the demographics and beliefs of users.
Our findings confirm that LLM-based interventions are highly effective at correcting user behavior.
arXiv Detail & Related papers (2024-10-13T18:16:50Z)
- The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content [2.2618341648062477]
This paper examines the role of intent in content moderation systems.
We review state-of-the-art detection models and benchmark training datasets for online abuse to assess their awareness of intent and their ability to capture it.
arXiv Detail & Related papers (2024-05-17T18:05:13Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Explainable Abuse Detection as Intent Classification and Slot Filling [66.80201541759409]
We introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone.
We show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.
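The intent-classification-plus-slot-filling framing described in this entry could be sketched as follows. This is a hypothetical toy illustration (a keyword matcher, not the paper's neural architecture); the policy names, keywords, and `Prediction` structure are all assumptions invented for the example. The point it shows is that slots can double as a human-readable rationale tied to an explicit policy rule.

```python
# Toy sketch of policy-aware abuse detection as intent classification
# plus slot filling. Hypothetical policy and keywords, not the paper's model.
from dataclasses import dataclass, field

# Each "intent" is tied to an explicit moderation rule and trigger keywords.
POLICY = {
    "insult": {"keywords": {"idiot", "loser"}, "rule": "no personal attacks"},
    "threat": {"keywords": {"hurt", "destroy"}, "rule": "no threats of harm"},
}

@dataclass
class Prediction:
    intent: str                                 # which policy intent fired, or "none"
    slots: dict = field(default_factory=dict)   # trigger word + matched rule

def detect(message: str) -> Prediction:
    """Return the first matching intent; slots serve as the rationale."""
    for intent, spec in POLICY.items():
        for tok in message.lower().split():
            word = tok.strip(".,!?")
            if word in spec["keywords"]:
                return Prediction(intent, {"trigger": word, "rule": spec["rule"]})
    return Prediction("none")
```

A real system would learn both intents and slots jointly from annotated data; the shared structure above only illustrates how the slot output makes the decision inspectable.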
arXiv Detail & Related papers (2022-10-06T03:33:30Z)
- Enriching Abusive Language Detection with Community Context [0.3708656266586145]
Use of pejorative expressions can be benign or actively empowering.
Models for abuse detection often misclassify these expressions as derogatory and inadvertently censor productive conversations held by marginalized groups.
Our paper highlights how community context can improve classification outcomes in abusive language detection.
arXiv Detail & Related papers (2022-06-16T20:54:02Z)
- Fragments of the Past: Curating Peer Support with Perpetrators of Domestic Violence [88.37416552778178]
We report on a ten-month study where we worked with six support workers and eighteen perpetrators in the design and deployment of Fragments of the Past.
We share how crafting digitally-augmented artefacts - 'fragments' - of experiences of desisting from violence can translate messages for motivation and rapport between peers.
These insights provide the basis for practical considerations for future network design with challenging populations.
arXiv Detail & Related papers (2021-07-09T22:57:43Z)
- Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective [4.916009028580767]
We review a large body of NLP research on automatic abuse detection with a new focus on ethical challenges.
We highlight the need to examine the broad social impacts of this technology.
We identify several opportunities for rights-respecting, socio-technical solutions to detect and confront online abuse.
arXiv Detail & Related papers (2020-12-22T19:27:11Z)
- AbuseAnalyzer: Abuse Detection, Severity and Target Prediction for Gab Posts [19.32095911241636]
We present a first-of-its-kind dataset of 7,601 posts from Gab that examines online abuse from three perspectives: presence of abuse, severity, and target of abusive behavior.
We also propose a system to address these tasks, obtaining an accuracy of 80% for abuse presence, 82% for abuse target prediction, and 65% for abuse severity prediction.
arXiv Detail & Related papers (2020-09-30T18:12:50Z)
- Assessing the Severity of Health States based on Social Media Posts [62.52087340582502]
We propose a multiview learning framework that models both textual content and contextual information to assess the severity of a user's health state.
The diverse NLU views demonstrate effectiveness on both tasks, as well as on individual diseases, in assessing a user's health.
arXiv Detail & Related papers (2020-09-21T03:45:14Z)
- Information Consumption and Social Response in a Segregated Environment: the Case of Gab [74.5095691235917]
This work provides a characterization of the interaction patterns within Gab around the COVID-19 topic.
We find that there are no strong statistical differences in the social response to questionable and reliable content.
Our results provide insights into coordinated inauthentic behavior and the early warning of information operations.
arXiv Detail & Related papers (2020-06-03T11:34:25Z)
- Joint Modelling of Emotion and Abusive Language Detection [26.18171134454037]
We present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework.
Our results demonstrate that incorporating affective features leads to significant improvements in abuse detection performance across datasets.
arXiv Detail & Related papers (2020-05-28T14:08:40Z)
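The joint-modelling entry above describes a multi-task learning setup. A minimal sketch of the core idea, a shared encoder feeding separate emotion and abuse heads, might look as follows; all layer sizes, class counts, and weights are illustrative assumptions, not the paper's architecture.

```python
# Toy multi-task model: one shared encoder, two task heads (emotion, abuse).
# Sizes and weights are illustrative; not the architecture from the paper.
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_HID = 16, 8           # toy input-feature and hidden sizes
N_EMOTIONS, N_ABUSE = 6, 2    # e.g. 6 emotion classes; abusive vs. not

W_shared = rng.standard_normal((D_IN, D_HID))        # shared representation
W_emotion = rng.standard_normal((D_HID, N_EMOTIONS)) # emotion head
W_abuse = rng.standard_normal((D_HID, N_ABUSE))      # abuse head

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    """The shared hidden layer lets affective signal inform abuse detection."""
    h = np.tanh(x @ W_shared)
    return softmax(h @ W_emotion), softmax(h @ W_abuse)

x = rng.standard_normal((4, D_IN))        # a toy batch of 4 encoded "texts"
p_emotion, p_abuse = forward(x)
```

In training, the two heads' losses would be summed (possibly weighted) and backpropagated through the shared encoder, which is what allows the affective features to improve abuse detection.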
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.