Mitigating Gender Bias in Contextual Word Embeddings
- URL: http://arxiv.org/abs/2411.12074v1
- Date: Mon, 18 Nov 2024 21:36:44 GMT
- Title: Mitigating Gender Bias in Contextual Word Embeddings
- Authors: Navya Yarrabelly, Vinay Damodaran, Feng-Guang Su,
- Abstract summary: We propose a novel objective function for MLM (Masked-Language Modeling) which largely mitigates the gender bias in contextual embeddings.
We also propose new methods for debiasing static embeddings and provide empirical evidence via extensive analysis and experiments.
- Score: 1.208453901299241
- Abstract: Word embeddings have been shown to produce remarkable results in tackling a vast majority of NLP-related tasks. Unfortunately, word embeddings also capture the stereotypical biases that are prevalent in society, affecting the predictive performance of the embeddings when used in downstream tasks. While various techniques have been proposed (Bolukbasi et al., 2016; Zhao et al., 2018) and criticized (Gonen and Goldberg, 2019) for static embeddings, very little work has focused on mitigating bias in contextual embeddings. In this paper, we propose a novel objective function for MLM (Masked-Language Modeling) which largely mitigates the gender bias in contextual embeddings and also preserves performance on downstream tasks. Since previous work on measuring bias in contextual embeddings lacks normative reasoning, we also propose novel evaluation metrics that are straightforward and aligned with our motivations in debiasing. We further propose new methods for debiasing static embeddings and provide empirical evidence, via extensive analysis and experiments, that the main source of bias in static embeddings stems from the presence of stereotypical names rather than from gendered words themselves. All experiments and embeddings studied are in English, unless otherwise specified (Bender, 2011).
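The abstract does not spell out the objective function itself. As a rough illustration of the idea, the sketch below augments the standard MLM cross-entropy with a counterfactual-consistency penalty that pushes the model toward identical masked-token predictions for a sentence and its gender-swapped counterpart; the swap list, the HuggingFace-style `model(...).logits` interface, and the trade-off weight `lam` are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

# Illustrative swap list; the paper's actual gendered-word lists are not
# given in the abstract.
SWAPS = {"he": "she", "she": "he", "him": "her", "his": "her",
         "man": "woman", "woman": "man"}

def gender_swap(tokens):
    """Build the counterfactual token sequence by swapping gendered words."""
    return [SWAPS.get(t.lower(), t) for t in tokens]

def debiased_mlm_loss(model, input_ids, swapped_ids, labels, lam=1.0):
    """Standard MLM loss plus a symmetric-KL consistency penalty between the
    masked-token distributions of a sentence and its gender-swapped version.

    `swapped_ids` are the token ids of the counterfactual sentence, masked at
    the same positions; `labels` uses -100 for unmasked positions as usual.
    """
    logits = model(input_ids).logits          # (batch, seq, vocab)
    logits_cf = model(swapped_ids).logits
    mlm = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                          labels.reshape(-1), ignore_index=-100)
    mask = (labels != -100).unsqueeze(-1).float()   # masked positions only
    p = F.log_softmax(logits, dim=-1)
    q = F.log_softmax(logits_cf, dim=-1)
    kl = (F.kl_div(p, q, log_target=True, reduction="none") +
          F.kl_div(q, p, log_target=True, reduction="none"))
    consistency = (kl * mask).sum() / mask.sum().clamp(min=1.0)
    return mlm + lam * consistency
```

Minimizing the consistency term makes the contextual representations less sensitive to the gender of the context, while the MLM term preserves language-modeling performance.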
Related papers
- GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models [75.04426753720553]
We propose a framework to identify, quantify, and explain biases in an open-set setting.
This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions.
We show two variations of this framework: OpenBias and GradBias.
arXiv Detail & Related papers (2024-08-29T16:51:07Z) - The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - Detecting and Mitigating Indirect Stereotypes in Word Embeddings [6.428026202398116]
Societal biases in the usage of words, including harmful stereotypes, are frequently learned by common word embedding methods.
We propose a novel method called Biased Indirect Relationship Modification (BIRM) to mitigate indirect bias in distributional word embeddings.
arXiv Detail & Related papers (2023-05-23T23:23:49Z) - The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and identify potential causes for social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z) - Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
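The abstract does not give the weighting scheme; a minimal sketch of one common instantiation, weighting each instance inversely to the frequency of its (task label, author demographic) pair so that over-represented combinations stop dominating the loss, might look as follows. The helper name and the specific heuristic are assumptions for illustration.

```python
from collections import Counter

def balanced_instance_weights(labels, demographics):
    """Return one weight per training instance, inversely proportional to
    the frequency of its (label, demographic) pair. Weights average to 1
    when every pair is equally frequent."""
    pair_counts = Counter(zip(labels, demographics))
    n_pairs = len(pair_counts)
    n = len(labels)
    return [n / (n_pairs * pair_counts[(y, d)])
            for y, d in zip(labels, demographics)]
```

The resulting weights would then scale the per-example losses during training, decoupling the task label from the author's demographic attributes.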
arXiv Detail & Related papers (2021-09-16T23:40:28Z) - Evaluating Bias In Dutch Word Embeddings [2.5567566997688043]
We implement the Word Embedding Association Test (WEAT), clustering, and the Sentence Embedding Association Test (SEAT) to quantify gender bias in Dutch word embeddings.
We analyze the effect of the debiasing techniques on downstream tasks, finding a negligible impact on traditional embeddings and a 2% performance decrease for contextualized embeddings.
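WEAT itself is precisely defined (Caliskan et al., 2017), so a compact NumPy sketch of its effect size is easy to give; variable names are ours.

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def weat_effect_size(X, Y, A, B):
    """WEAT effect size: mean differential association of target word sets
    X and Y with attribute sets A and B, in pooled-standard-deviation units.
    Each argument is a list of embedding vectors (e.g., male vs. female
    names as targets, career vs. family words as attributes)."""
    def assoc(w):
        return (np.mean([cosine(w, a) for a in A]) -
                np.mean([cosine(w, b) for b in B]))
    sx = [assoc(x) for x in X]
    sy = [assoc(y) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```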
arXiv Detail & Related papers (2020-10-31T11:14:16Z) - Fair Embedding Engine: A Library for Analyzing and Mitigating Gender
Bias in Word Embeddings [16.49645205111334]
Non-contextual word embedding models have been shown to inherit human-like stereotypical biases of gender, race and religion from the training corpora.
This paper describes Fair Embedding Engine (FEE), a library for analysing and mitigating gender bias in word embeddings.
arXiv Detail & Related papers (2020-10-25T17:31:12Z) - Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
arXiv Detail & Related papers (2020-07-16T04:22:30Z) - Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
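The "purification" step against corpus regularities (dropping frequency-related principal components) is this paper's contribution and is not reproduced here; the sketch below shows only the shared final step, the hard-debias projection of Bolukbasi et al. (2016), with an illustrative way to estimate the gender direction from definitional pairs.

```python
import numpy as np

def gender_direction(emb, pairs=(("he", "she"), ("man", "woman"))):
    """Leading principal direction of the differences between definitional
    gendered pairs; `emb` maps words to vectors. The pair list is
    illustrative, not the paper's."""
    diffs = np.stack([emb[a] - emb[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs - diffs.mean(0), full_matrices=False)
    return vt[0]

def hard_debias(W, g):
    """Remove the component along the unit gender direction g from every
    row of the embedding matrix W (shape: n_words x dim)."""
    g = g / np.linalg.norm(g)
    return W - np.outer(W @ g, g)
```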
arXiv Detail & Related papers (2020-05-03T02:33:20Z) - Neutralizing Gender Bias in Word Embedding with Latent Disentanglement
and Counterfactual Generation [25.060917870666803]
We introduce a siamese auto-encoder structure with an adapted gradient reversal layer.
Our structure enables the separation of the semantic latent information and the gender latent information of a given word into disjoint latent dimensions.
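The abstract does not detail the adaptation; for reference, here is a minimal PyTorch sketch of the standard gradient reversal layer (Ganin and Lempitsky, 2015) that such a structure builds on.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lam in the
    backward pass, so an encoder placed before it learns *against* an
    attached gender classifier while the classifier trains normally."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # no gradient for lam

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```

Placed between an encoder and a gender classifier, the reversed gradients push the encoded semantic dimensions toward containing no predictable gender information, which is what allows gender to be isolated in separate latent dimensions.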
arXiv Detail & Related papers (2020-04-07T05:16:48Z) - Joint Multiclass Debiasing of Word Embeddings [5.1135133995376085]
We present a joint multiclass debiasing approach capable of debiasing multiple bias dimensions simultaneously.
We show that our approach can reduce or even completely eliminate bias while maintaining meaningful relationships between vectors in word embeddings.
arXiv Detail & Related papers (2020-03-09T22:06:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.