Enriching Abusive Language Detection with Community Context
- URL: http://arxiv.org/abs/2206.08445v1
- Date: Thu, 16 Jun 2022 20:54:02 GMT
- Title: Enriching Abusive Language Detection with Community Context
- Authors: Jana Kurrek, Haji Mohammad Saleem, and Derek Ruths
- Abstract summary: Use of pejorative expressions can be benign or actively empowering.
When models for abuse detection misclassify these expressions as derogatory, they inadvertently censor productive conversations held by marginalized groups.
Our paper highlights how community context can improve classification outcomes in abusive language detection.
- Score: 0.3708656266586145
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Uses of pejorative expressions can be benign or actively empowering. When
models for abuse detection misclassify these expressions as derogatory, they
inadvertently censor productive conversations held by marginalized groups. One
way to engage with non-dominant perspectives is to add context around
conversations. Previous research has leveraged user- and thread-level features,
but it often neglects the spaces within which productive conversations take
place. Our paper highlights how community context can improve classification
outcomes in abusive language detection. We make two main contributions to this
end. First, we demonstrate that online communities cluster by the nature of
their support towards victims of abuse. Second, we establish how community
context improves accuracy and reduces the false positive rates of
state-of-the-art abusive language classifiers. These findings suggest a
promising direction for context-aware models in abusive language research.
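As a rough, hedged sketch of the paper's central idea (not the authors' actual pipeline), one simple way to expose a classifier to community context is to prepend a community identifier to each comment before vectorization. The data, community names, and model choice below are invented for illustration.
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data: the same reclaimed expression can be benign in a supportive
# community and abusive in a hostile one. Labels and community names are
# invented for illustration only.
comments = [
    "proud of every queer person in this thread",
    "get these queers out of here",
    "love my trans siblings",
    "you people do not belong anywhere",
]
communities = ["lgbt_support", "hostile_sub", "lgbt_support", "hostile_sub"]
labels = [0, 1, 0, 1]  # 0 = benign, 1 = abusive

# Prepend a community marker token so the classifier can condition on
# where a comment was posted, not only on its surface wording.
texts = [f"community_{c} {t}" for c, t in zip(communities, comments)]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# The same wording now receives community-dependent scores.
print(model.predict_proba(["community_lgbt_support proud of every queer person in this thread"]))
print(model.predict_proba(["community_hostile_sub get these queers out of here"]))
```
The community marker simply becomes another feature the model can weigh; richer variants of the same idea could swap in learned community embeddings derived from the clustering the paper describes.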
Related papers
- Analyzing Norm Violations in Live-Stream Chat [49.120561596550395]
We present the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms.
We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch.
Our results show that appropriate contextual information can boost moderation performance by 35%.
arXiv Detail & Related papers (2023-05-18T05:58:27Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations. A simplified sketch of this context-fusion idea appears after this list.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- A Keyword Based Approach to Understanding the Overpenalization of Marginalized Groups by English Marginal Abuse Models on Twitter [2.9604738405097333]
Harmful content detection models tend to have higher false positive rates for content from marginalized groups.
We propose a principled approach to detecting and measuring the severity of potential harms associated with a text-based model.
We apply our methodology to audit Twitter's English marginal abuse model, which is used for removing amplification eligibility of marginally abusive content.
arXiv Detail & Related papers (2022-10-07T20:28:00Z)
- Improved two-stage hate speech classification for twitter based on Deep Neural Networks [0.0]
Hate speech is a form of online harassment that involves the use of abusive language.
The model we propose in this work is an extension of an existing approach based on LSTM neural network architectures.
Our study includes a performance comparison of several alternative methods proposed for the second stage, evaluated on a public corpus of 16k tweets.
arXiv Detail & Related papers (2022-06-08T20:57:41Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topics.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Mitigating Biases in Toxic Language Detection through Invariant Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields lower false positive rates on both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z)
- Abusive Language Detection in Heterogeneous Contexts: Dataset Collection and the Role of Supervised Attention [9.597481034467915]
Abusive language is a massive problem in online social platforms.
We provide an annotated dataset of abusive language in over 11,000 comments from YouTube.
We propose an algorithm that uses a supervised attention mechanism to detect and categorize abusive content.
arXiv Detail & Related papers (2021-05-24T06:50:19Z)
- Abuse is Contextual, What about NLP? The Role of Context in Abusive Language Annotation and Detection [2.793095554369281]
We investigate what happens when the hateful content of a message is also judged based on its context.
We first re-annotate part of a widely used dataset for abusive language detection in English in two conditions, i.e. with and without context.
arXiv Detail & Related papers (2021-03-27T14:31:52Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical markers (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)
- On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment [59.995385574274785]
We show that, contrary to previous belief, negative interference also impacts low-resource languages.
We present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference.
arXiv Detail & Related papers (2020-10-06T20:48:58Z)
- Joint Modelling of Emotion and Abusive Language Detection [26.18171134454037]
We present the first joint model of emotion and abusive language detection, experimenting within a multi-task learning framework.
Our results demonstrate that incorporating affective features leads to significant improvements in abuse detection performance across datasets.
arXiv Detail & Related papers (2020-05-28T14:08:40Z)
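As a hedged illustration of the joint modelling idea in the entry directly above, here is a minimal multi-task sketch: one shared text encoder with separate emotion and abuse heads whose losses are summed. The architecture, sizes, and label sets are invented for illustration and are not the paper's actual model.
```python
import torch
import torch.nn as nn

class JointEmotionAbuseModel(nn.Module):
    """Minimal multi-task sketch: shared encoder, two task heads.
    Illustrative only; not the paper's exact architecture."""

    def __init__(self, vocab_size: int = 10_000, dim: int = 64, n_emotions: int = 6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.abuse_head = nn.Linear(dim, 2)          # abusive vs. not
        self.emotion_head = nn.Linear(dim, n_emotions)

    def forward(self, token_ids):
        _, h = self.encoder(self.embed(token_ids))   # h: (1, batch, dim)
        h = h.squeeze(0)
        return self.abuse_head(h), self.emotion_head(h)

model = JointEmotionAbuseModel()
ids = torch.randint(0, 10_000, (8, 20))              # batch of 8, length 20
abuse_logits, emotion_logits = model(ids)

# Joint training sums the two task losses, so affective supervision acts
# as an auxiliary signal: gradients from the emotion head shape the
# shared encoder the abuse head relies on.
loss = (nn.functional.cross_entropy(abuse_logits, torch.randint(0, 2, (8,)))
        + nn.functional.cross_entropy(emotion_logits, torch.randint(0, 6, (8,))))
loss.backward()
```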
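As flagged at the CoSyn entry above, here is a simplified context-fusion sketch. CoSyn itself operates in hyperbolic space with dedicated user- and conversational-context encoders; the following Euclidean sketch only illustrates the general idea of fusing an utterance representation with user- and conversation-level context vectors. All names and dimensions are assumptions.
```python
import torch
import torch.nn as nn

class ContextFusionClassifier(nn.Module):
    """Simplified Euclidean sketch: concatenate utterance, user-history,
    and conversation-context embeddings, fuse, then classify.
    Not CoSyn's actual hyperbolic formulation."""

    def __init__(self, dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU())
        self.head = nn.Linear(dim, n_classes)

    def forward(self, utterance, user_ctx, conv_ctx):
        # Each argument: (batch, dim) embeddings from upstream encoders.
        fused = self.fuse(torch.cat([utterance, user_ctx, conv_ctx], dim=-1))
        return self.head(fused)

model = ContextFusionClassifier()
utt, user_ctx, conv_ctx = (torch.randn(4, 128) for _ in range(3))
logits = model(utt, user_ctx, conv_ctx)
print(logits.shape)  # torch.Size([4, 2])
```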
This list is automatically generated from the titles and abstracts of the papers on this site.