Examining Temporal Bias in Abusive Language Detection
- URL: http://arxiv.org/abs/2309.14146v1
- Date: Mon, 25 Sep 2023 13:59:39 GMT
- Title: Examining Temporal Bias in Abusive Language Detection
- Authors: Mali Jin, Yida Mu, Diana Maynard, Kalina Bontcheva
- Abstract summary: Machine learning models have been developed to automatically detect abusive language.
These models can suffer from temporal bias, the phenomenon in which topics, language use or social norms change over time.
This study investigates the nature and impact of temporal bias in abusive language detection across various languages.
- Score: 3.465144840147315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of abusive language online has become an increasingly pervasive
problem that damages both individuals and society, with effects ranging from
psychological harm right through to escalation to real-life violence and even
death. Machine learning models have been developed to automatically detect
abusive language, but these models can suffer from temporal bias, the
phenomenon in which topics, language use or social norms change over time. This
study aims to investigate the nature and impact of temporal bias in abusive
language detection across various languages and explore mitigation methods. We
evaluate the performance of models on abusive data sets from different time
periods. Our results demonstrate that temporal bias is a significant challenge
for abusive language detection, with models trained on historical data showing
a significant drop in performance over time. We also present an extensive
linguistic analysis of these abusive data sets from a diachronic perspective,
aiming to explore the reasons for language evolution and performance decline.
This study sheds light on the pervasive issue of temporal bias in abusive
language detection across languages, offering crucial insights into language
evolution and temporal bias mitigation.
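The evaluation setup the abstract describes (training a classifier on data from one time period and testing it on later periods) can be illustrated with a minimal sketch. The file name, column names, year cutoff, and model choice below are hypothetical stand-ins, not the paper's actual data sets or models.

```python
# Minimal sketch of a temporal-bias evaluation: train an abusive-language
# classifier on older posts and test it on posts from later periods.
# The CSV file and its columns ("text", "label", "year") are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

df = pd.read_csv("abusive_posts.csv")  # hypothetical dataset

# Train on the earliest period only (cutoff year is an assumption).
train = df[df["year"] <= 2016]
vec = TfidfVectorizer(min_df=2)
X_train = vec.fit_transform(train["text"])
clf = LogisticRegression(max_iter=1000).fit(X_train, train["label"])

# Evaluate on each later period; a declining F1 over time is the
# performance drop the paper attributes to temporal bias.
for year, group in df[df["year"] > 2016].groupby("year"):
    preds = clf.predict(vec.transform(group["text"]))
    print(year, round(f1_score(group["label"], preds, average="macro"), 3))
```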
Related papers
- Misspellings in Natural Language Processing: A survey [52.419589623702336]
Misspellings have become ubiquitous in digital communication.
We reconstruct a history of misspellings as a scientific problem.
We discuss the latest advancements to address the challenge of misspellings in NLP.
arXiv Detail & Related papers (2025-01-28T10:26:04Z)
- Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation [6.781972039785424]
Recent generative large language models (LLMs) show remarkable performance in non-English languages.
When prompted in these languages, however, they tend to express more harmful social biases and higher toxicity levels.
We investigate the impact of different finetuning methods not only on the model's bias and toxicity, but also on its ability to produce fluent and diverse text.
arXiv Detail & Related papers (2024-12-18T17:05:08Z)
- Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages [47.45957604683302]
We study whether pre-trained language models (PLMs) are agnostic to linguistically grounded attacks.
Our findings reveal that although PLMs are susceptible to linguistic perturbations, they exhibit slightly lower susceptibility to linguistic attacks than to non-linguistic ones.
arXiv Detail & Related papers (2024-12-14T12:10:38Z)
- Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models [113.58052868898173]
We identify and characterize semantic leakage, a previously undiscussed phenomenon in which models leak irrelevant information from the prompt into the generation in unexpected ways.
We propose an evaluation setting to detect semantic leakage both by humans and automatically, curate a diverse test suite for diagnosing this behavior, and measure significant semantic leakage in 13 flagship models.
arXiv Detail & Related papers (2024-08-12T22:30:55Z)
- Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work lays the foundation for furthering the field of dialectal NLP by documenting evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
- No Time Like the Present: Effects of Language Change on Automated Comment Moderation [0.0]
The spread of online hate has become a significant problem for newspapers that host comment sections.
There is growing interest in using machine learning and natural language processing for automated abusive language detection.
We show, using a new German newspaper comments dataset, that classifiers trained with naive ML techniques will underperform on future data.
arXiv Detail & Related papers (2022-07-08T16:39:21Z)
- Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models [32.960462266615096]
Large language models produce human-like text that drives a growing number of applications.
Recent literature and, increasingly, real-world observations have demonstrated that these models can generate language that is toxic, biased, untruthful or otherwise harmful.
We outline six ways of characterizing harmful text which merit explicit consideration when designing new benchmarks.
arXiv Detail & Related papers (2022-06-16T17:28:01Z)
- Data Bootstrapping Approaches to Improve Low Resource Abusive Language Detection for Indic Languages [5.51252705016179]
We present a large-scale analysis of multilingual abusive speech in Indic languages.
We examine different interlingual transfer mechanisms and observe the performance of various multilingual models for abusive speech detection.
arXiv Detail & Related papers (2022-04-26T18:56:01Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical markers (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)
- Joint Modelling of Emotion and Abusive Language Detection [26.18171134454037]
We present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework.
Our results demonstrate that incorporating affective features leads to significant improvements in abuse detection performance across datasets.
arXiv Detail & Related papers (2020-05-28T14:08:40Z)
- Limits of Detecting Text Generated by Large-Scale Language Models [65.46403462928319]
Some consider large-scale language models capable of generating long and coherent text to be dangerous, since they may be used in misinformation campaigns.
Here we formulate large-scale language model output detection as a hypothesis testing problem to classify text as genuine or generated; a minimal sketch of this framing appears after this list.
arXiv Detail & Related papers (2020-02-09T19:53:23Z)
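The hypothesis-testing formulation in the last entry can be sketched as a simple threshold test on a text's average per-token log-probability under a scoring language model. This is an illustrative toy under stated assumptions, not the paper's method; the scoring model (GPT-2) and threshold value are hypothetical choices.

```python
# Toy sketch: classify text as "generated" vs. "genuine" by thresholding
# its average per-token log-probability under a scoring language model.
# Text a model would itself generate tends to score higher (less surprising).
# The model choice and threshold below are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def avg_log_prob(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # loss = mean negative log-likelihood
    return -out.loss.item()

THRESHOLD = -3.5  # hypothetical decision boundary, tuned on held-out data
text = "The quick brown fox jumps over the lazy dog."
verdict = "generated" if avg_log_prob(text) > THRESHOLD else "genuine"
print(verdict)
```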
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.