Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
- URL: http://arxiv.org/abs/2005.00965v1
- Date: Sun, 3 May 2020 02:33:20 GMT
- Title: Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
- Authors: Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann,
Vicente Ordonez, Caiming Xiong
- Abstract summary: We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
- Score: 94.98656228690233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word embeddings derived from human-generated corpora inherit strong gender
bias which can be further amplified by downstream models. Some commonly adopted
debiasing approaches, including the seminal Hard Debias algorithm, apply
post-processing procedures that project pre-trained word embeddings into a
subspace orthogonal to an inferred gender subspace. We discover that
semantic-agnostic corpus regularities such as word frequency captured by the
word embeddings negatively impact the performance of these algorithms. We
propose a simple but effective technique, Double Hard Debias, which purifies
the word embeddings against such corpus regularities prior to inferring and
removing the gender subspace. Experiments on three bias mitigation benchmarks
show that our approach preserves the distributional semantics of the
pre-trained word embeddings while reducing gender bias to a significantly
larger degree than prior approaches.
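To make the two-stage idea concrete, below is a minimal NumPy sketch of the two projection steps: first remove a dominant frequency-related principal component, then infer and remove the gender direction (the Hard Debias step). This is not the authors' implementation; the paper selects which frequency component to discard via a clustering-based test, whereas this sketch simply drops a fixed top component, and the `project_out` helper and inputs are illustrative.

```python
import numpy as np

def project_out(vectors, direction):
    """Remove the component of each row vector along a unit direction."""
    direction = direction / np.linalg.norm(direction)
    return vectors - np.outer(vectors @ direction, direction)

def double_hard_debias(embeddings, gender_pairs, k=1):
    """Sketch: drop a frequency principal component, then hard-debias.

    embeddings   : dict mapping word -> np.ndarray
    gender_pairs : list of (female_word, male_word) definitional pairs
    k            : which top principal component to treat as the
                   frequency direction (the paper picks it with a
                   clustering-based test; here it is a fixed choice).
    """
    words = list(embeddings)
    E = np.stack([embeddings[w] for w in words])

    # Step 1: "purify" against frequency regularities by decentering
    # and projecting out a dominant principal component of the space.
    E_centered = E - E.mean(axis=0)
    _, _, Vt = np.linalg.svd(E_centered, full_matrices=False)
    E_purified = project_out(E_centered, Vt[k - 1])
    purified = dict(zip(words, E_purified))

    # Step 2: infer the gender direction as the top principal
    # component of definitional pair differences, then project every
    # vector off it (the Hard Debias step).
    diffs = np.stack([purified[f] - purified[m] for f, m in gender_pairs])
    _, _, Vt_g = np.linalg.svd(diffs - diffs.mean(axis=0), full_matrices=False)
    E_debiased = project_out(E_purified, Vt_g[0])
    return dict(zip(words, E_debiased))
```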
Related papers
- The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
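As a toy illustration of "minimally distant" counterfactual pairs, the sketch below swaps gendered pronouns in a sentence. Counter-GAP's actual pipeline is more sophisticated (it targets ambiguous pronouns and filters for naturalness); the swap table here is a hypothetical simplification.

```python
# Hypothetical minimal swap table; note that "her" is ambiguous
# (it can map to "him" or "his"), which a real pipeline must resolve.
GENDER_SWAP = {
    "he": "she", "she": "he",
    "him": "her", "her": "him",
    "his": "her",
    "himself": "herself", "herself": "himself",
}

def counterfactual(sentence: str) -> str:
    """Produce a minimally distant gender counterfactual by swapping
    gendered pronouns; all other tokens are left untouched."""
    tokens = sentence.split()
    return " ".join(GENDER_SWAP.get(t.lower(), t) for t in tokens)

original = "The surgeon said he was tired."
pair = (original, counterfactual(original))
# -> ("The surgeon said he was tired.", "The surgeon said she was tired.")
```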
arXiv Detail & Related papers (2023-02-11T12:11:03Z)
- MABEL: Attenuating Gender Bias using Textual Entailment Data [20.489427903240017]
We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
Key to our approach is the use of a contrastive learning objective on counterfactually augmented, gender-balanced entailment pairs.
We show that MABEL outperforms previous task-agnostic debiasing approaches in terms of fairness.
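The contrastive objective can be pictured as an InfoNCE-style loss that pulls each premise toward its entailed hypothesis and pushes it away from the other hypotheses in the batch. A hedged PyTorch sketch of that loss shape, not MABEL's actual training code:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(premise_emb, hypothesis_emb, temperature=0.05):
    """InfoNCE-style objective: each premise embedding should be
    closest to its own (entailed) hypothesis among all hypotheses in
    the batch. MABEL applies a contrastive objective of this kind to
    counterfactually augmented, gender-balanced entailment pairs."""
    premise_emb = F.normalize(premise_emb, dim=-1)
    hypothesis_emb = F.normalize(hypothesis_emb, dim=-1)
    logits = premise_emb @ hypothesis_emb.T / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)
```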
arXiv Detail & Related papers (2022-10-26T18:36:58Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
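One simple way to realize instance reweighting is to weight each training example by the inverse frequency of its (label, demographic) combination, so no combination dominates the loss. The sketch below illustrates only this general idea and is not the paper's exact scheme:

```python
from collections import Counter

def balance_weights(labels, demographics):
    """Weight each example by the inverse frequency of its
    (label, demographic) combination so that every combination
    contributes equally to the training loss."""
    counts = Counter(zip(labels, demographics))
    n, k = len(labels), len(counts)
    return [n / (k * counts[(y, d)]) for y, d in zip(labels, demographics)]

weights = balance_weights(
    labels=["pos", "pos", "neg", "neg"],
    demographics=["F", "M", "F", "F"],
)
# Rarer (label, demographic) pairs receive proportionally larger weights.
```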
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Dictionary-based Debiasing of Pre-trained Word Embeddings [28.378270372391498]
We propose a method for debiasing pre-trained word embeddings using dictionaries.
Our proposed method does not require the types of biases to be pre-defined in the form of word lists.
Experimental results on standard benchmark datasets show that the proposed method can accurately remove unfair biases encoded in pre-trained word embeddings.
arXiv Detail & Related papers (2021-01-23T15:44:23Z)
- Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation [57.292988892028134]
Bolukbasi et al. present one of the first gender bias mitigation techniques for word representations.
We generalize their method to a kernelized, nonlinear version.
We analyze empirically whether the bias subspace is actually linear.
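Under the linear hypothesis, the bias subspace is typically inferred as the top principal component(s) of difference vectors from definitional gender pairs; the kernelized generalization replaces PCA with kernel PCA. A rough scikit-learn sketch of that contrast, using random stand-in data and an illustrative kernel choice:

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

# Stand-in for real difference vectors between definitional gender
# pairs (e.g., she - he, woman - man) in a 300-d embedding space.
diffs = np.random.randn(10, 300)

# Linear hypothesis: the bias subspace is the top principal
# component of the pair differences (Bolukbasi et al.).
linear_dir = PCA(n_components=1).fit(diffs).components_[0]

# Kernelized generalization: find the dominant direction in a
# nonlinear feature space instead; with a linear kernel this
# reduces to the method above.
kpca = KernelPCA(n_components=1, kernel="rbf", gamma=0.1).fit(diffs)
scores = kpca.transform(diffs)  # projections onto the nonlinear component
```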
arXiv Detail & Related papers (2020-09-20T14:13:45Z)
- MDR Cluster-Debias: A Nonlinear Word Embedding Debiasing Pipeline [3.180013942295509]
Existing methods for debiasing word embeddings often do so only superficially, in that words that are stereotypically associated with a particular gender can still be clustered together in the debiased space.
This paper explores why this residual clustering exists, and how it might be addressed.
We identify two potential reasons for which residual bias exists and develop a new pipeline, MDR Cluster-Debias, to mitigate this bias.
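The residual clustering the paper targets can be probed with a simple diagnostic: cluster the most gender-associated words in the debiased space and measure how well the two clusters recover the original gender split. A sketch of such a check (not the paper's pipeline, which also includes the debiasing steps themselves):

```python
import numpy as np
from sklearn.cluster import KMeans

def residual_cluster_accuracy(embeddings, male_words, female_words):
    """Cluster stereotypically male- and female-associated words in a
    (debiased) space and check how well the clusters recover the
    original split. Accuracy near 0.5 means the clustering was
    removed; near 1.0 means the bias remains recoverable."""
    words = male_words + female_words
    X = np.stack([embeddings[w] for w in words])
    y = np.array([0] * len(male_words) + [1] * len(female_words))
    pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    acc = (pred == y).mean()
    return max(acc, 1 - acc)  # clusters carry no inherent label order
```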
arXiv Detail & Related papers (2020-06-20T20:03:07Z)
- Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings [37.65897382453336]
Existing post-processing methods for debiasing word embeddings are unable to mitigate gender bias hidden in the spatial arrangement of word vectors.
We propose RAN-Debias, a novel gender debiasing methodology which not only eliminates the bias present in a word vector but also alters the spatial distribution of its neighbouring vectors.
We also propose a new bias evaluation metric, Gender-based Illicit Proximity Estimate (GIPE).
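In spirit, GIPE asks how many of a word's nearest neighbours are themselves strongly gender-associated. The sketch below is a loose approximation with illustrative neighbourhood size and threshold, not the published formulation:

```python
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def illicit_proximity(word, embeddings, gender_dir, k=10, thresh=0.03):
    """Fraction of a word's k nearest neighbours whose projection on
    the gender direction exceeds a threshold. The published metric
    weights neighbours differently; k and thresh here are
    illustrative choices."""
    target = embeddings[word]
    neighbours = sorted(
        (w for w in embeddings if w != word),
        key=lambda w: -cos(embeddings[w], target),
    )[:k]
    biased = sum(abs(cos(embeddings[w], gender_dir)) > thresh
                 for w in neighbours)
    return biased / k
```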
arXiv Detail & Related papers (2020-06-02T20:50:43Z)
- Mitigating Gender Bias Amplification in Distribution by Posterior Regularization [75.3529537096899]
We investigate the gender bias amplification issue from the distribution perspective.
We propose a bias mitigation approach based on posterior regularization.
Our study sheds light on understanding bias amplification.
arXiv Detail & Related papers (2020-05-13T11:07:10Z)
- Neutralizing Gender Bias in Word Embedding with Latent Disentanglement and Counterfactual Generation [25.060917870666803]
We introduce a siamese auto-encoder structure with an adapted gradient reversal layer.
Our structure separates the semantic latent information and gender latent information of a given word into disjoint latent dimensions.
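A gradient reversal layer acts as the identity on the forward pass and negates (and scales) gradients on the backward pass, so an encoder feeding a gender classifier through it is trained to discard gender information while the classifier tries to recover it. A standard PyTorch sketch of such a layer; the paper's adaptation may differ:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated, scaled gradient on the
    backward pass."""

    @staticmethod
    def forward(ctx, x, lambd=1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Gradient for x is reversed; lambd receives no gradient.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Usage sketch: a semantic head sees normal gradients, while a gender
# head placed after grad_reverse pushes the encoder to remove gender
# information, encouraging disentangled latent dimensions.
z = torch.randn(8, 64, requires_grad=True)  # latent from an encoder
rev = grad_reverse(z, lambd=1.0)
```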
arXiv Detail & Related papers (2020-04-07T05:16:48Z)