Debiasing Sentence Embedders through Contrastive Word Pairs
- URL: http://arxiv.org/abs/2403.18555v1
- Date: Wed, 27 Mar 2024 13:34:59 GMT
- Title: Debiasing Sentence Embedders through Contrastive Word Pairs
- Authors: Philip Kenneweg, Sarah Schröder, Alexander Schulz, Barbara Hammer
- Abstract summary: We explore an approach to remove linear and nonlinear bias information for NLP solutions.
We compare our approach to common debiasing methods on classical bias metrics and on bias metrics which take nonlinear information into account.
- Score: 46.9044612783003
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Over recent years, sentence embedders have been integral to the success of machine learning approaches to Natural Language Processing (NLP). Unfortunately, multiple studies have shown that these embedding methods learn the biases inherent in the datasets on which they are trained. A variety of approaches to remove biases from embeddings exist in the literature. Most of these apply to word embeddings and, in fewer cases, to sentence embeddings. It is problematic that most debiasing approaches are transferred directly from word embeddings: they fail to account for the nonlinear nature of sentence embedders and the embeddings they produce. It has been shown in the literature that bias information remains present when sentence embeddings are debiased with such methods. In this contribution, we explore an approach to remove linear and nonlinear bias information for NLP solutions without impacting downstream performance. We compare our approach to common debiasing methods on classical bias metrics and on bias metrics that take nonlinear information into account.
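The core idea behind contrastive word-pair debiasing can be illustrated with a minimal numpy sketch (our own illustration, not the paper's actual training code): sentences that differ only in a contrastive word pair, such as "he"/"she", should receive near-identical embeddings, and a contrastive objective penalizes any remaining difference.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def contrastive_debias_loss(pairs):
    """Mean (1 - cosine similarity) over pairs of sentence embeddings
    that differ only in a protected-attribute word. Minimizing this
    pulls paired embeddings together, so the encoder cannot retain
    (linear or nonlinear) bias information that separates them."""
    return float(np.mean([1.0 - cosine(a, b) for a, b in pairs]))

# Toy example: two pairs of 3-d "sentence embeddings" (made up).
pairs = [
    (np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.1, 0.0])),
    (np.array([0.0, 1.0, 0.0]), np.array([0.0, 1.0, 0.0])),
]
loss = contrastive_debias_loss(pairs)
```

In an actual fine-tuning setup this term would be added to the encoder's training objective alongside the downstream task loss.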
Related papers
- Language-guided Detection and Mitigation of Unknown Dataset Bias [23.299264313976213]
We propose a framework that identifies potential biases as keywords, without prior knowledge, based on their partial occurrence in image captions.
Our framework not only outperforms existing methods that use no prior knowledge, but is also comparable to a method that assumes prior knowledge.
arXiv Detail & Related papers (2024-06-05T03:11:33Z) - Projective Methods for Mitigating Gender Bias in Pre-trained Language Models [10.418595661963062]
Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters.
We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated.
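As a rough sketch of what a projective method does (a generic illustration, not this paper's exact procedure), the component along an estimated bias direction is subtracted from every embedding, leaving the rest of the model untouched:

```python
import numpy as np

def project_out(embeddings, bias_direction):
    """Remove the component along a bias direction from each embedding:
    e' = e - (e . v) v, with v the unit bias direction. This is the
    standard linear projection step used by projective debiasing."""
    v = bias_direction / np.linalg.norm(bias_direction)
    return embeddings - np.outer(embeddings @ v, v)

# Bias direction estimated from a definitional pair (toy vectors).
he = np.array([0.9, 0.2, 0.1])
she = np.array([0.1, 0.2, 0.9])
v = he - she

E = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.7, 0.1]])
E_debiased = project_out(E, v)
```

Because no model parameters change, the method is fast and cheap to store, matching the summary above; the caveat is that it removes only linearly encoded bias.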
arXiv Detail & Related papers (2024-03-27T17:49:31Z) - Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias by either weighting the objective of each sample n by 1/p(u_n|b_n) or sampling that sample with a weight proportional to 1/p(u_n|b_n).
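The proposed weighting can be sketched as follows, with p(u_n|b_n) estimated from simple co-occurrence counts (the toy attributes and the counting estimator are illustrative assumptions, not the paper's exact estimator):

```python
import numpy as np

def inverse_propensity_weights(u, b):
    """Weight each sample n by 1 / p(u_n | b_n), where p(u | b) is
    estimated from co-occurrence counts of the class attribute u and
    the non-class attribute b. Samples whose class is rare given the
    bias attribute are up-weighted, breaking the u-b correlation."""
    u, b = np.asarray(u), np.asarray(b)
    weights = np.empty(len(u), dtype=float)
    for i in range(len(u)):
        mask = b == b[i]
        p = np.mean(u[mask] == u[i])  # empirical p(u_n | b_n)
        weights[i] = 1.0 / p
    return weights

# Toy example: class u is correlated with background attribute b.
u = [0, 0, 0, 1, 1, 1]
b = [0, 0, 1, 1, 1, 0]
w = inverse_propensity_weights(u, b)
```

The minority combinations (u=0 with b=1, u=1 with b=0) receive weight 3, while the majority combinations receive weight 1.5, equalizing their total contribution to the objective.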
arXiv Detail & Related papers (2024-02-05T22:58:06Z) - A Bayesian approach to uncertainty in word embedding bias estimation [0.0]
Multiple measures, such as WEAT or MAC, attempt to quantify the magnitude of bias present in word embeddings in terms of a single-number metric.
We show that similar results can be easily obtained using such methods even if the data are generated by a null model lacking the intended bias.
We propose a Bayesian alternative: hierarchical Bayesian modeling, which enables a more uncertainty-sensitive inspection of bias in word embeddings.
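For reference, the single-number WEAT effect size mentioned above can be computed as follows (this is the standard formulation from the WEAT literature; the toy vectors are made up for illustration):

```python
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def weat_effect_size(X, Y, A, B):
    """WEAT effect size d = (mean_x s(x) - mean_y s(y)) / std_{w in X∪Y} s(w),
    where s(w) = mean_{a in A} cos(w, a) - mean_{b in B} cos(w, b).
    X, Y are target word sets; A, B are attribute word sets."""
    s = lambda w: np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])
    sX = [s(x) for x in X]
    sY = [s(y) for y in Y]
    return (np.mean(sX) - np.mean(sY)) / np.std(sX + sY)

# Toy 2-d embeddings: X aligns with attribute A, Y with attribute B.
A = [np.array([1.0, 0.0])]
B = [np.array([0.0, 1.0])]
X = [np.array([1.0, 0.1]), np.array([0.9, 0.0])]
Y = [np.array([0.1, 1.0]), np.array([0.0, 0.9])]
d = weat_effect_size(X, Y, A, B)
```

The Bayesian critique summarized above is precisely that such a point estimate carries no uncertainty information: small word sets can yield large |d| even under a null model.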
arXiv Detail & Related papers (2023-06-15T11:48:50Z) - Marked Attribute Bias in Natural Language Inference [0.0]
We present a new observation of gender bias in a downstream NLP application: marked attribute bias in natural language inference.
Bias in downstream applications can stem from training data, word embeddings, or be amplified by the model in use.
Here we seek to understand how the intrinsic properties of word embeddings contribute to this observed marked attribute effect.
arXiv Detail & Related papers (2021-09-28T20:45:02Z) - Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z) - OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings [47.721931801603105]
We propose OSCaR, a bias-mitigating method that focuses on disentangling biased associations between concepts instead of removing concepts wholesale.
Our experiments on gender biases show that OSCaR is a well-balanced approach that ensures that semantic information is retained in the embeddings and bias is also effectively mitigated.
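A much-simplified sketch of the subspace-correction idea follows (OSCaR's actual method applies a graded rotation to all embeddings in the span of the two concept directions; here we only show the step that decouples two hypothetical concept directions rather than deleting either one):

```python
import numpy as np

def orthogonalize_concepts(v1, v2):
    """Make concept direction v2 (e.g. occupation) orthogonal to
    v1 (e.g. gender) via Gram-Schmidt. Disentangling the directions,
    instead of projecting one of them out entirely, is what lets
    semantic information survive in the corrected space."""
    v1 = v1 / np.linalg.norm(v1)
    v2p = v2 - (v2 @ v1) * v1
    return v1, v2p / np.linalg.norm(v2p)

g = np.array([1.0, 0.2, 0.0])  # hypothetical gender direction
o = np.array([0.6, 1.0, 0.3])  # hypothetical occupation direction
g_hat, o_hat = orthogonalize_concepts(g, o)
```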
arXiv Detail & Related papers (2020-06-30T18:18:13Z) - MDR Cluster-Debias: A Nonlinear Word Embedding Debiasing Pipeline [3.180013942295509]
Existing methods for debiasing word embeddings often do so only superficially, in that words that are stereotypically associated with a particular gender can still be clustered together in the debiased space.
This paper explores why this residual clustering exists, and how it might be addressed.
We identify two potential reasons for which residual bias exists and develop a new pipeline, MDR Cluster-Debias, to mitigate this bias.
arXiv Detail & Related papers (2020-06-20T20:03:07Z) - Towards Robustifying NLI Models Against Lexical Dataset Biases [94.79704960296108]
This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
arXiv Detail & Related papers (2020-05-10T17:56:10Z) - Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.