Handling Bias in Toxic Speech Detection: A Survey
- URL: http://arxiv.org/abs/2202.00126v3
- Date: Sun, 15 Jan 2023 14:51:55 GMT
- Title: Handling Bias in Toxic Speech Detection: A Survey
- Authors: Tanmay Garg, Sarah Masud, Tharun Suresh, Tanmoy Chakraborty
- Abstract summary: We look at proposed methods for evaluating and mitigating bias in toxic speech detection.
A case study introduces the concept of bias shift due to knowledge-based bias mitigation.
The survey concludes with an overview of the critical challenges, research gaps, and future directions.
- Score: 26.176340438312376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting online toxicity has always been a challenge due to its inherent
subjectivity. Factors such as the context, geography, socio-political climate,
and background of the producers and consumers of the posts play a crucial role
in determining if the content can be flagged as toxic. Adoption of automated
toxicity detection models in production can thus lead to a sidelining of the
various groups they aim to help in the first place. This risk has piqued
researchers' interest in examining unintended biases and their mitigation. Due
to the nascent and multi-faceted nature of the work, the literature as a whole
is chaotic in its terminologies, techniques, and findings. In this paper, we put together a
systematic study of the limitations and challenges of existing methods for
mitigating bias in toxicity detection.
We look closely at proposed methods for evaluating and mitigating bias in
toxic speech detection. To examine the limitations of existing methods, we also
conduct a case study to introduce the concept of bias shift due to
knowledge-based bias mitigation. The survey concludes with an overview of the
critical challenges, research gaps, and future directions. While reducing
toxicity on online platforms continues to be an active area of research, a
systematic study of various biases and their mitigation strategies will help
the research community produce robust and fair models.
Related papers
- Bias in Large Language Models: Origin, Evaluation, and Mitigation [4.606140332500086]
Large Language Models (LLMs) have revolutionized natural language processing, but their susceptibility to biases poses significant challenges.
This comprehensive review examines the landscape of bias in LLMs, from its origins to current mitigation strategies.
Ethical and legal implications of biased LLMs are discussed, emphasizing potential harms in real-world applications such as healthcare and criminal justice.
arXiv Detail & Related papers (2024-11-16T23:54:53Z)
- A Survey of Stance Detection on Social Media: New Directions and Perspectives [50.27382951812502]
Stance detection has emerged as a crucial subfield within affective computing.
Recent years have seen a surge of research interest in developing effective stance detection methods.
This paper provides a comprehensive survey of stance detection techniques on social media.
arXiv Detail & Related papers (2024-09-24T03:06:25Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- A Taxonomy of Rater Disagreements: Surveying Challenges & Opportunities from the Perspective of Annotating Online Toxicity [15.23055494327071]
Toxicity is an increasingly common and severe issue in online spaces.
A rich line of machine learning research has focused on computationally detecting and mitigating online toxicity.
Recent research has pointed out the importance of accounting for the subjective nature of this task.
arXiv Detail & Related papers (2023-11-07T21:00:51Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection [7.297345802761503]
Representation bias, selection bias, and overamplification bias are investigated.
We show that overamplification bias is the most impactful type of bias on the fairness of the task of toxicity detection.
We introduce a list of guidelines to ensure the fairness of the task of toxicity detection.
arXiv Detail & Related papers (2023-05-22T08:44:00Z)
- Toxicity Detection with Generative Prompt-based Inference [3.9741109244650823]
It is a long-known risk that language models (LMs), once trained on a corpus containing undesirable content, can manifest biases and toxicity.
In this work, we explore the generative variant of zero-shot prompt-based toxicity detection with comprehensive trials on prompt engineering.
arXiv Detail & Related papers (2022-05-24T22:44:43Z)
- Anatomizing Bias in Facial Analysis [86.79402670904338]
Existing facial analysis systems have been shown to yield biased results against certain demographic subgroups.
It has become imperative to ensure that these systems do not discriminate based on gender, identity, or skin tone of individuals.
This has led to research in the identification and mitigation of bias in AI systems.
arXiv Detail & Related papers (2021-12-13T09:51:13Z)
- Mitigating Biases in Toxic Language Detection through Invariant Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields a lower false positive rate on both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)