Investigating Gender Bias in BERT
- URL: http://arxiv.org/abs/2009.05021v1
- Date: Thu, 10 Sep 2020 17:38:32 GMT
- Title: Investigating Gender Bias in BERT
- Authors: Rishabh Bhardwaj, Navonil Majumder, Soujanya Poria
- Abstract summary: We analyse the gender bias that BERT induces in five downstream tasks related to emotion and sentiment intensity prediction.
We propose an algorithm that finds fine-grained gender directions, i.e., one primary direction for each BERT layer.
Experiments show that removing embedding components along such directions substantially reduces BERT-induced bias in the downstream tasks.
- Score: 22.066477991442003
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Contextual language models (CLMs) have pushed NLP benchmarks to new heights. It has become the norm to use CLM-provided word embeddings in downstream tasks such as text classification. However, unless addressed, CLMs are prone to learning the intrinsic gender bias present in their training data. As a result, the predictions of downstream NLP models can vary noticeably when gendered words are changed, for instance replacing "he" with "she", or even when gender-neutral words are varied. In this paper, we focus our analysis on a popular CLM, BERT. We analyse the gender bias it induces in five downstream tasks related to emotion and sentiment intensity prediction. For each task, we train a simple regressor on BERT's word embeddings and then evaluate its gender bias using an equity evaluation corpus. Ideally, by design, the models should discard gender-informative features of the input. The results, however, show a significant dependence of the systems' predictions on gender-specific words and phrases. We argue that such biases can be reduced by removing gender-specific features from the word embeddings. Hence, for each layer of BERT, we identify directions that primarily encode gender information; the space spanned by these directions is referred to as the gender subspace of the word-embedding semantic space. We propose an algorithm that finds fine-grained gender directions, i.e., one primary direction per BERT layer. This obviates the need to model the gender subspace in multiple dimensions and prevents other crucial information from being removed along with it. Experiments show that removing embedding components along these directions substantially reduces BERT-induced bias in the downstream tasks.
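The core debiasing step described in the abstract can be illustrated with a short sketch. The code below is a minimal, hypothetical illustration rather than the authors' exact algorithm or code: it estimates one primary gender direction from the differences between gendered word-pair embeddings of a single BERT layer, then projects that direction out of an embedding. The embedding arrays here are random stand-ins; in practice they would come from BERT's per-layer hidden states.

```python
import numpy as np

def gender_direction(embedding_pairs):
    # Estimate a single primary gender direction from (female, male)
    # embedding pairs taken from one BERT layer, e.g. ("she", "he"),
    # ("woman", "man"). The first right-singular vector of the centred
    # difference matrix is taken as the dominant gender axis.
    diffs = np.stack([f - m for f, m in embedding_pairs])
    diffs -= diffs.mean(axis=0)
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    direction = vt[0]
    return direction / np.linalg.norm(direction)

def remove_gender_component(vec, direction):
    # Project out the component of `vec` along the gender direction.
    return vec - np.dot(vec, direction) * direction

# Toy usage with random stand-in vectors (hidden size 768, as in BERT-base).
rng = np.random.default_rng(0)
pairs = [(rng.normal(size=768), rng.normal(size=768)) for _ in range(10)]
d = gender_direction(pairs)
debiased = remove_gender_component(rng.normal(size=768), d)
print(float(np.dot(debiased, d)))  # ~0.0: no remaining component along the direction
```

Because only one direction is removed per layer, the rest of the embedding, and with it the non-gender information, is left untouched, which is the motivation the abstract gives for avoiding a multi-dimensional gender subspace.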
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents the AmbGIMT benchmark (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - Probing Explicit and Implicit Gender Bias through LLM Conditional Text
Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z) - The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide-range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - Gender Bias in Text: Labeled Datasets and Lexicons [0.30458514384586394]
There is a lack of gender bias datasets and lexicons for automating the detection of gender bias.
We provide labeled datasets and exhaustive lexicons by collecting, annotating, and augmenting relevant sentences.
The released datasets and lexicons span multiple bias subtypes including: Generic He, Generic She, Explicit Marking of Sex, and Gendered Neologisms.
arXiv Detail & Related papers (2022-01-21T12:44:51Z) - Gendered Language in Resumes and its Implications for Algorithmic Bias
in Hiring [0.0]
We train a series of models to classify the gender of the applicant.
We investigate whether it is possible to obfuscate gender from resumes.
We find that there is a significant amount of gendered information in resumes even after obfuscation.
arXiv Detail & Related papers (2021-12-16T14:26:36Z) - Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z) - Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased
Proximities in Word Embeddings [37.65897382453336]
Existing post-processing methods for debiasing word embeddings are unable to mitigate gender bias hidden in the spatial arrangement of word vectors.
We propose RAN-Debias, a novel gender debiasing methodology that not only eliminates the bias present in a word vector but also alters the spatial distribution of its neighbouring vectors.
We also propose a new bias evaluation metric, the Gender-based Illicit Proximity Estimate (GIPE).
arXiv Detail & Related papers (2020-06-02T20:50:43Z) - Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z) - Neutralizing Gender Bias in Word Embedding with Latent Disentanglement
and Counterfactual Generation [25.060917870666803]
We introduce a siamese auto-encoder structure with an adapted gradient reversal layer.
Our structure separates the semantic latent information and gender latent information of a given word into disjoint latent dimensions.
arXiv Detail & Related papers (2020-04-07T05:16:48Z)