Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased
Proximities in Word Embeddings
- URL: http://arxiv.org/abs/2006.01938v1
- Date: Tue, 2 Jun 2020 20:50:43 GMT
- Title: Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased
Proximities in Word Embeddings
- Authors: Vaibhav Kumar, Tenzin Singhay Bhotia, Vaibhav Kumar, Tanmoy
Chakraborty
- Abstract summary: Existing post-processing methods for debiasing word embeddings are unable to mitigate gender bias hidden in the spatial arrangement of word vectors.
We propose RAN-Debias, a novel gender debiasing methodology which not only eliminates the bias present in a word vector but also alters the spatial distribution of its neighbouring vectors.
We also propose a new bias evaluation metric - Gender-based Illicit Proximity Estimate (GIPE).
- Score: 37.65897382453336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Word embeddings are the standard model for semantic and syntactic
representations of words. Unfortunately, these models have been shown to
exhibit undesirable word associations resulting from gender, racial, and
religious biases. Existing post-processing methods for debiasing word
embeddings are unable to mitigate gender bias hidden in the spatial arrangement
of word vectors. In this paper, we propose RAN-Debias, a novel gender debiasing
methodology which not only eliminates the bias present in a word vector but
also alters the spatial distribution of its neighbouring vectors, achieving a
bias-free setting while maintaining minimal semantic offset. We also propose a
new bias evaluation metric - Gender-based Illicit Proximity Estimate (GIPE),
which measures the extent of undue proximity in word vectors resulting from the
presence of gender-based predilections. Experiments based on a suite of
evaluation metrics show that RAN-Debias significantly outperforms the
state-of-the-art in reducing proximity bias (GIPE) by at least 42.02%. It also
reduces direct bias, adding minimal semantic disturbance, and achieves the best
performance in a downstream application task (coreference resolution).
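The abstract does not reproduce GIPE's exact definition, but the underlying idea - scoring how many of a word's nearest neighbours sit close to it for gender-related rather than semantic reasons - can be illustrated with a rough sketch. The `direct_bias` proxy, the neighbourhood size `k`, and the `threshold` below are illustrative assumptions, not the paper's formulation.

```python
# Illustrative sketch of a GIPE-like "illicit proximity" estimate (not the
# paper's exact metric): for each word, count how many of its k nearest
# neighbours are themselves strongly gender-associated, then average that ratio.
import numpy as np

def direct_bias(vec, he, she):
    """Signed gender association: cosine similarity with the he-she direction."""
    g = he - she
    return float(vec @ g / (np.linalg.norm(vec) * np.linalg.norm(g)))

def gipe_like(emb, words, he, she, k=5, threshold=0.1):
    """Average fraction of k nearest neighbours whose |direct bias| exceeds a
    threshold. `emb` maps word -> np.ndarray; `words` are the words to audit."""
    vocab = list(emb)
    mat = np.stack([emb[w] / np.linalg.norm(emb[w]) for w in vocab])
    scores = []
    for w in words:
        v = emb[w] / np.linalg.norm(emb[w])
        sims = mat @ v
        order = np.argsort(-sims)
        neighbours = [vocab[i] for i in order if vocab[i] != w][:k]
        biased = sum(abs(direct_bias(emb[n], he, she)) > threshold for n in neighbours)
        scores.append(biased / k)
    return float(np.mean(scores))
```

A lower value would indicate fewer neighbours whose proximity appears to be driven by gender association rather than meaning.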
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
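The entry does not define how EAS is computed; purely as a hypothetical illustration of scoring attitude toward gendered variants of a translation, a lexicon-based score could look like the sketch below. The word lists and the `attitude_gap` helper are invented for illustration and are not the paper's method.

```python
# Hypothetical illustration only: score a translation's attitude from small
# positive/negative word lists and compare scores across gendered variants.
POSITIVE = {"confident", "gentle", "assertive"}
NEGATIVE = {"bossy", "pushy", "weak"}

def attitude_score(text: str) -> float:
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / max(pos + neg, 1)

def attitude_gap(translation_by_gender: dict) -> float:
    """Spread of attitude scores across gender variants of the same source."""
    scores = [attitude_score(t) for t in translation_by_gender.values()]
    return max(scores) - min(scores)
```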
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
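The exact pair-construction procedure is not given in this blurb; as a minimal sketch of the general idea, one can build a gender-swapped twin of a sentence and measure how much a model's score changes between the two. The `SWAP` table and `score_fn` below are illustrative assumptions, not Counter-GAP's actual pipeline.

```python
# Minimal sketch: build a minimally distant counterfactual by swapping gendered
# words, then measure how much a scoring model disagrees between the versions.
# A real implementation needs POS-aware handling (e.g. possessive "her").
SWAP = {"he": "she", "she": "he", "him": "her", "her": "him",
        "his": "her", "man": "woman", "woman": "man"}

def counterfactual(text: str) -> str:
    return " ".join(SWAP.get(tok, tok) for tok in text.split())

def inconsistency(score_fn, text: str) -> float:
    """Absolute score difference between a sentence and its gender-swapped twin."""
    return abs(score_fn(text) - score_fn(counterfactual(text)))
```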
arXiv Detail & Related papers (2023-02-11T12:11:03Z)
- MABEL: Attenuating Gender Bias using Textual Entailment Data [20.489427903240017]
We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
Key to our approach is the use of a contrastive learning objective on counterfactually augmented, gender-balanced entailment pairs.
We show that MABEL outperforms previous task-agnostic debiasing approaches in terms of fairness.
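The blurb credits a contrastive objective on counterfactually augmented, gender-balanced entailment pairs. The sketch below shows a generic InfoNCE-style loss over precomputed sentence embeddings; the encoder and the augmentation pipeline are assumed, and this is not MABEL's actual implementation.

```python
# Generic InfoNCE-style contrastive loss over sentence embeddings, as a sketch
# of the kind of objective the entry describes.
import numpy as np

def info_nce(anchors: np.ndarray, positives: np.ndarray, temperature: float = 0.05) -> float:
    """anchors[i] and positives[i] are embeddings of an entailment pair and its
    gender-swapped counterpart; other rows in the batch act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))   # correct pair on the diagonal
```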
arXiv Detail & Related papers (2022-10-26T18:36:58Z)
- Identifying and Mitigating Gender Bias in Hyperbolic Word Embeddings [34.378806636170616]
We extend the study of gender bias to the recently popularized hyperbolic word embeddings.
We propose gyrocosine bias, a novel measure for quantifying gender bias in hyperbolic word representations.
Experiments on a suite of evaluation tests show that Poincaré Gender Debias (PGD) effectively reduces bias while adding a minimal semantic offset.
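Gyrocosine bias is the paper's own measure and is not defined in this blurb; as a rough proxy in the same spirit, the sketch below compares a word's Poincaré-ball distances to gendered anchor vectors. The anchor words and the distance-difference formulation are illustrative assumptions.

```python
# Rough proxy for gender bias in hyperbolic embeddings: compare a word's
# Poincaré-ball distances to gendered anchors (not the paper's gyrocosine bias).
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance in the Poincaré ball model (all vector norms < 1)."""
    diff = np.linalg.norm(u - v) ** 2
    denom = (1 - np.linalg.norm(u) ** 2) * (1 - np.linalg.norm(v) ** 2)
    return float(np.arccosh(1 + 2 * diff / denom))

def hyperbolic_gender_bias(word_vec, he_vec, she_vec) -> float:
    """Positive when the word sits closer to 'he' than to 'she'."""
    return poincare_distance(word_vec, she_vec) - poincare_distance(word_vec, he_vec)
```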
arXiv Detail & Related papers (2021-09-28T14:43:37Z)
- The Gap on GAP: Tackling the Problem of Differing Data Distributions in Bias-Measuring Datasets [58.53269361115974]
Diagnostic datasets that can detect biased models are an important prerequisite for bias reduction within natural language processing.
However, undesired patterns in the collected data can make such tests incorrect.
We introduce a theoretically grounded method for weighting test samples to cope with such patterns in the test data.
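The paper's weighting scheme is theoretically grounded and more involved than shown here; as a much simpler illustration of the idea, the sketch below reweights test samples by the inverse frequency of a confounding attribute so that each value of the confound contributes equally to the evaluation.

```python
# Simplified illustration of reweighting test samples to neutralise an
# undesired pattern (confound) in the test data.
from collections import Counter

def balance_weights(confounds):
    """confounds[i] is a discrete confound value for test sample i; returns one
    weight per sample, scaled so the weights sum to the number of samples."""
    counts = Counter(confounds)
    raw = [1.0 / counts[c] for c in confounds]
    scale = len(confounds) / sum(raw)
    return [w * scale for w in raw]

def weighted_accuracy(correct, weights):
    return sum(w for c, w in zip(correct, weights) if c) / sum(weights)
```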
arXiv Detail & Related papers (2020-11-03T16:50:13Z)
- Investigating Gender Bias in BERT [22.066477991442003]
We analyse the gender bias BERT induces in five downstream tasks related to emotion and sentiment intensity prediction.
We propose an algorithm that finds fine-grained gender directions, i.e., one primary direction for each BERT layer.
Experiments show that removing embedding components in such directions achieves great success in reducing BERT-induced bias in the downstream tasks.
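The blurb describes finding one primary gender direction per BERT layer and removing embedding components along it. A minimal sketch, assuming per-layer embeddings of gendered word pairs are already available: estimate the direction as the first principal component of the pair difference vectors, then project it out.

```python
# Minimal sketch: estimate a gender direction per layer from difference vectors
# of gendered word pairs and project it out of embeddings. Extracting the
# per-layer activations from BERT is assumed and not shown here.
import numpy as np

def gender_direction(pair_embs):
    """pair_embs: list of (he_like, she_like) vectors for one layer.
    Returns the first principal component of the difference vectors."""
    diffs = np.stack([a - b for a, b in pair_embs])
    diffs -= diffs.mean(axis=0)
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[0]

def remove_direction(emb: np.ndarray, direction: np.ndarray) -> np.ndarray:
    d = direction / np.linalg.norm(direction)
    return emb - (emb @ d)[..., None] * d
```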
arXiv Detail & Related papers (2020-09-10T17:38:32Z)
- MDR Cluster-Debias: A Nonlinear Word Embedding Debiasing Pipeline [3.180013942295509]
Existing methods for debiasing word embeddings often do so only superficially, in that words that are stereotypically associated with a particular gender can still be clustered together in the debiased space.
This paper explores why this residual clustering exists, and how it might be addressed.
We identify two potential reasons why residual bias exists and develop a new pipeline, MDR Cluster-Debias, to mitigate this bias.
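The residual-clustering observation can be checked with a simple diagnostic; the sketch below (a generic check, not the MDR Cluster-Debias pipeline itself) clusters debiased vectors of stereotypically male- and female-associated words and reports how well the clusters recover the stereotype labels.

```python
# Diagnostic sketch: cluster debiased vectors of stereotypically gendered words
# and check how well the clusters align with the (hidden) gender labels.
import numpy as np
from sklearn.cluster import KMeans

def cluster_alignment(male_assoc, female_assoc):
    """Inputs are arrays of debiased vectors; ~0.5 means no residual clustering,
    values near 1.0 mean the stereotyped groups are still separable."""
    X = np.vstack([male_assoc, female_assoc])
    y = np.array([0] * len(male_assoc) + [1] * len(female_assoc))
    pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    acc = (pred == y).mean()
    return max(acc, 1 - acc)  # label permutation does not matter
```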
arXiv Detail & Related papers (2020-06-20T20:03:07Z)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
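A minimal sketch of the two-step idea described above, assuming for illustration that the corpus-frequency component is simply the top principal component of the centred embeddings (the paper identifies it more carefully): remove that component, then project out a he-she gender direction in hard-debias style.

```python
# Sketch of "purify then debias": drop an assumed frequency-related principal
# component, then remove the gender component along the he-she direction.
import numpy as np

def double_hard_like(emb_matrix, he, she, freq_component=0):
    centered = emb_matrix - emb_matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    u = vt[freq_component]                          # assumed frequency direction
    purified = centered - (centered @ u)[:, None] * u
    g = (he - she) / np.linalg.norm(he - she)
    return purified - (purified @ g)[:, None] * g   # remove gender component
```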
arXiv Detail & Related papers (2020-05-03T02:33:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.