Stereotypical Bias Removal for Hate Speech Detection Task using
Knowledge-based Generalizations
- URL: http://arxiv.org/abs/2001.05495v1
- Date: Wed, 15 Jan 2020 18:17:36 GMT
- Title: Stereotypical Bias Removal for Hate Speech Detection Task using
Knowledge-based Generalizations
- Authors: Pinkesh Badjatiya, Manish Gupta, Vasudeva Varma
- Abstract summary: We study bias mitigation from unstructured text data for hate speech detection.
We propose novel methods leveraging knowledge-based generalizations for bias-free learning.
Our experiments with two real-world datasets, a Wikipedia Talk Pages dataset and a Twitter dataset, show that the use of knowledge-based generalizations results in better performance.
- Score: 16.304516254043865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the ever-increasing cases of hate spread on social media platforms, it
is critical to design abuse detection mechanisms to proactively avoid and
control such incidents. While there exist methods for hate speech detection,
they stereotype words and hence suffer from inherently biased training. Bias
removal has been traditionally studied for structured datasets, but we aim at
bias mitigation from unstructured text data. In this paper, we make two
important contributions. First, we systematically design methods to quantify
the bias for any model and propose algorithms for identifying the set of words
which the model stereotypes. Second, we propose novel methods leveraging
knowledge-based generalizations for bias-free learning. Knowledge-based
generalization provides an effective way to encode knowledge because the
abstraction it provides not only generalizes content but also facilitates
retraction of information from the hate speech detection classifier, thereby
reducing the imbalance. We experiment with multiple knowledge generalization
policies and analyze their effect on general performance and in mitigating
bias. Our experiments with two real-world datasets, a Wikipedia Talk Pages
dataset (WikiDetox) of size ~96k and a Twitter dataset of size ~24k, show that
the use of knowledge-based generalizations results in better performance by
forcing the classifier to learn from generalized content. Our methods utilize
existing knowledge-bases and can easily be extended to other tasks.
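
The idea of learning from generalized content can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the generalization policies or the knowledge base, so the toy dictionary `TOY_KNOWLEDGE_BASE`, the placeholder tokens, and the function `generalize` are all hypothetical names introduced here for illustration. The sketch assumes that a set of bias-sensitive words has already been identified and that each one is replaced by a more general concept before the text reaches the classifier.

```python
import re

# Hypothetical toy knowledge base mapping a word to a more general concept.
# Assumption: a real system would query an existing knowledge base instead.
TOY_KNOWLEDGE_BASE = {
    "muslim": "<religion>",
    "christian": "<religion>",
    "gay": "<orientation>",
    "lesbian": "<orientation>",
}


def generalize(text, bias_sensitive_words, knowledge_base=TOY_KNOWLEDGE_BASE):
    """Replace each bias-sensitive word with its generalized concept token.

    Words that are not flagged as bias-sensitive, or that have no entry in
    the knowledge base, are left unchanged.
    """
    def replace(match):
        word = match.group(0)
        key = word.lower()
        if key in bias_sensitive_words and key in knowledge_base:
            return knowledge_base[key]
        return word

    # Match alphabetic tokens (with apostrophes) and rewrite them one by one.
    return re.sub(r"[A-Za-z']+", replace, text)


if __name__ == "__main__":
    flagged = {"muslim", "gay"}  # assumed output of a bias-identification step
    print(generalize("My neighbour is Muslim and very kind.", flagged))
```

Under these assumptions, a classifier trained on the generalized text sees the shared concept token rather than the specific identity term, which is one way a single word could be prevented from dominating the prediction.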
Related papers
- HateDebias: On the Diversity and Variability of Hate Speech Debiasing [14.225997610785354]
We propose a benchmark, named HateDebias, to analyze the model ability of hate speech detection under continuous, changing environments.
Specifically, to meet the diversity of biases, we collect existing hate speech detection datasets with different types of biases.
We evaluate the detection accuracy of models trained on the datasets with a single type of bias with the performance on the HateDebias, where a significant performance drop is observed.
arXiv Detail & Related papers (2024-06-07T12:18:02Z)
- Language-guided Detection and Mitigation of Unknown Dataset Bias [23.299264313976213]
We propose a framework to identify potential biases as keywords without prior knowledge based on the partial occurrence in the captions.
Our framework not only outperforms existing methods without prior knowledge, but also is even comparable with a method that assumes prior knowledge.
arXiv Detail & Related papers (2024-06-05T03:11:33Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Look Beyond Bias with Entropic Adversarial Data Augmentation [4.893694715581673]
Deep neural networks do not discriminate between spurious and causal patterns, and will only learn the most predictive ones while ignoring the others.
Debiasing methods were developed to make networks robust to such spurious biases but require to know in advance if a dataset is biased.
In this paper, we argue that such samples should not be necessarily needed because the "hidden" causal information is often also contained in biased images.
arXiv Detail & Related papers (2023-01-10T08:25:24Z)
- Power of Explanations: Towards automatic debiasing in hate speech detection [19.26084350822197]
Hate speech detection is a common downstream application of natural language processing (NLP) in the real world.
We propose an automatic misuse detector (MiD) relying on an explanation method for detecting potential bias.
arXiv Detail & Related papers (2022-09-07T14:14:03Z)
- ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection [85.68684067031909]
We frame this problem as a few-shot learning task, and show significant gains with decomposing the task into its "constituent" parts.
In addition, we see that infusing knowledge from reasoning datasets (e.g. Atomic 2020) improves the performance even further.
arXiv Detail & Related papers (2022-05-25T05:10:08Z)
- Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without knowing exactly the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-18T11:02:18Z)
- Exploring Strategies for Generalizable Commonsense Reasoning with Pre-trained Models [62.28551903638434]
We measure the impact of three different adaptation methods on the generalization and accuracy of models.
Experiments with two models show that fine-tuning performs best, by learning both the content and the structure of the task, but suffers from overfitting and limited generalization to novel answers.
We observe that alternative adaptation methods like prefix-tuning have comparable accuracy, but generalize better to unseen answers and are more robust to adversarial splits.
arXiv Detail & Related papers (2021-09-07T03:13:06Z)
- Towards Measuring Bias in Image Classification [61.802949761385]
Convolutional Neural Networks (CNN) have become state-of-the-art for the main computer vision tasks.
However, due to their complex structure, their decisions are hard to understand, which limits their use in some contexts of the industrial world.
We present a systematic approach to uncover data bias by means of attribution maps.
arXiv Detail & Related papers (2021-07-01T10:50:39Z)
- Detecting and Understanding Generalization Barriers for Neural Machine Translation [53.23463279153577]
This paper attempts to identify and understand generalization barrier words within an unseen input sentence.
We propose a principled definition of generalization barrier words and a modified version which is tractable in computation.
We then conduct extensive analyses on those detected generalization barrier words on both Zh$\Leftrightarrow$En NIST benchmarks.
arXiv Detail & Related papers (2020-04-05T12:33:51Z)