The User behind the Abuse: A Position on Ethics and Explainability
- URL: http://arxiv.org/abs/2103.17191v1
- Date: Wed, 31 Mar 2021 16:20:37 GMT
- Title: The User behind the Abuse: A Position on Ethics and Explainability
- Authors: Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova
- Abstract summary: We discuss the role that modeling of users and online communities plays in abuse detection.
We then explore the ethical challenges of incorporating user and community information.
We propose properties that an explainable method should aim to exhibit.
- Score: 25.791014642037585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abuse on the Internet is an important societal problem of our time. Millions
of Internet users face harassment, racism, personal attacks, and other types of
abuse across various platforms. The psychological effects of abuse on
individuals can be profound and lasting. Consequently, over the past few years,
there has been a substantial research effort towards automated abusive language
detection in the field of NLP. In this position paper, we discuss the role that
modeling of users and online communities plays in abuse detection.
Specifically, we review and analyze state-of-the-art methods that leverage
user or community information to enhance the understanding and detection of
abusive language. We then explore the ethical challenges of incorporating user
and community information, laying out considerations to guide future research.
Finally, we address the topic of explainability in abusive language detection,
proposing properties that an explainable method should aim to exhibit. We
describe how user and community information can facilitate the realization of
these properties and discuss the effective operationalization of explainability
in view of the properties.
Related papers
- MisinfoEval: Generative AI in the Era of "Alternative Facts" [50.069577397751175]
We introduce a framework for generating and evaluating large language model (LLM) based misinformation interventions.
We present (1) an experiment with a simulated social media environment to measure effectiveness of misinformation interventions, and (2) a second experiment with personalized explanations tailored to the demographics and beliefs of users.
Our findings confirm that LLM-based interventions are highly effective at correcting user behavior.
arXiv Detail & Related papers (2024-10-13T18:16:50Z)
- The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content [2.2618341648062477]
This paper examines the role of intent in content moderation systems.
We review state-of-the-art detection models and benchmark training datasets for online abuse to assess their awareness of intent and their ability to capture it.
arXiv Detail & Related papers (2024-05-17T18:05:13Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Explainable Abuse Detection as Intent Classification and Slot Filling [66.80201541759409]
We introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone.
We show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.
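The intent-classification-plus-slot-filling framing described in this entry could be sketched as follows. This is a hypothetical toy illustration (a keyword matcher, not the paper's neural architecture); the policy names, keywords, and `Prediction` structure are all assumptions invented for the example. The point it shows is that slots can double as a human-readable rationale tied to an explicit policy rule.

```python
# Toy sketch of policy-aware abuse detection as intent classification
# plus slot filling. Hypothetical policy and keywords, not the paper's model.
from dataclasses import dataclass, field

# Each "intent" is tied to an explicit moderation rule and trigger keywords.
POLICY = {
    "insult": {"keywords": {"idiot", "loser"}, "rule": "no personal attacks"},
    "threat": {"keywords": {"hurt", "destroy"}, "rule": "no threats of harm"},
}

@dataclass
class Prediction:
    intent: str                                 # which policy intent fired, or "none"
    slots: dict = field(default_factory=dict)   # trigger word + matched rule

def detect(message: str) -> Prediction:
    """Return the first matching intent; slots serve as the rationale."""
    for intent, spec in POLICY.items():
        for tok in message.lower().split():
            word = tok.strip(".,!?")
            if word in spec["keywords"]:
                return Prediction(intent, {"trigger": word, "rule": spec["rule"]})
    return Prediction("none")
```

A real system would learn both intents and slots jointly from annotated data; the shared structure above only illustrates how the slot output makes the decision inspectable.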
arXiv Detail & Related papers (2022-10-06T03:33:30Z)
- Enriching Abusive Language Detection with Community Context [0.3708656266586145]
Use of pejorative expressions can be benign or actively empowering.
Models for abuse detection often misclassify these expressions as derogatory and inadvertently censor productive conversations held by marginalized groups.
Our paper highlights how community context can improve classification outcomes in abusive language detection.
arXiv Detail & Related papers (2022-06-16T20:54:02Z)
- Fragments of the Past: Curating Peer Support with Perpetrators of Domestic Violence [88.37416552778178]
We report on a ten-month study where we worked with six support workers and eighteen perpetrators in the design and deployment of Fragments of the Past.
We share how crafting digitally-augmented artefacts - 'fragments' - of experiences of desisting from violence can translate messages for motivation and rapport between peers.
These insights provide the basis for practical considerations for future network design with challenging populations.
arXiv Detail & Related papers (2021-07-09T22:57:43Z)
- Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective [4.916009028580767]
We review a large body of NLP research on automatic abuse detection with a new focus on ethical challenges.
We highlight the need to examine the broad social impacts of this technology.
We identify several opportunities for rights-respecting, socio-technical solutions to detect and confront online abuse.
arXiv Detail & Related papers (2020-12-22T19:27:11Z)
- AbuseAnalyzer: Abuse Detection, Severity and Target Prediction for Gab Posts [19.32095911241636]
We present a first-of-its-kind dataset of 7,601 posts from Gab that examines online abuse from three perspectives: presence of abuse, severity, and target of abusive behavior.
We also propose a system to address these tasks, obtaining an accuracy of 80% for abuse presence, 82% for abuse target prediction, and 65% for abuse severity prediction.
arXiv Detail & Related papers (2020-09-30T18:12:50Z)
- Assessing the Severity of Health States based on Social Media Posts [62.52087340582502]
We propose a multiview learning framework that models both textual content and contextual information to assess the severity of a user's health state.
The diverse NLU views demonstrate effectiveness on both tasks, as well as on individual diseases, in assessing a user's health.
arXiv Detail & Related papers (2020-09-21T03:45:14Z)
- Information Consumption and Social Response in a Segregated Environment: the Case of Gab [74.5095691235917]
This work provides a characterization of the interaction patterns within Gab around the COVID-19 topic.
We find that there are no strong statistical differences in the social response to questionable and reliable content.
Our results provide insights into coordinated inauthentic behavior and the early warning of information operations.
arXiv Detail & Related papers (2020-06-03T11:34:25Z)
- Joint Modelling of Emotion and Abusive Language Detection [26.18171134454037]
We present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework.
Our results demonstrate that incorporating affective features leads to significant improvements in abuse detection performance across datasets.
arXiv Detail & Related papers (2020-05-28T14:08:40Z)
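The joint-modelling entry above describes a multi-task learning setup. A minimal sketch of the core idea, a shared encoder feeding separate emotion and abuse heads, might look as follows; all layer sizes, class counts, and weights are illustrative assumptions, not the paper's architecture.

```python
# Toy multi-task model: one shared encoder, two task heads (emotion, abuse).
# Sizes and weights are illustrative; not the architecture from the paper.
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_HID = 16, 8           # toy input-feature and hidden sizes
N_EMOTIONS, N_ABUSE = 6, 2    # e.g. 6 emotion classes; abusive vs. not

W_shared = rng.standard_normal((D_IN, D_HID))        # shared representation
W_emotion = rng.standard_normal((D_HID, N_EMOTIONS)) # emotion head
W_abuse = rng.standard_normal((D_HID, N_ABUSE))      # abuse head

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    """The shared hidden layer lets affective signal inform abuse detection."""
    h = np.tanh(x @ W_shared)
    return softmax(h @ W_emotion), softmax(h @ W_abuse)

x = rng.standard_normal((4, D_IN))        # a toy batch of 4 encoded "texts"
p_emotion, p_abuse = forward(x)
```

In training, the two heads' losses would be summed (possibly weighted) and backpropagated through the shared encoder, which is what allows the affective features to improve abuse detection.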
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.