Joint Multiclass Debiasing of Word Embeddings
- URL: http://arxiv.org/abs/2003.11520v1
- Date: Mon, 9 Mar 2020 22:06:37 GMT
- Title: Joint Multiclass Debiasing of Word Embeddings
- Authors: Radomir Popović, Florian Lemmerich and Markus Strohmaier
- Abstract summary: We present a joint multiclass debiasing approach capable of debiasing multiple bias dimensions simultaneously.
We show that our concepts can reduce or even completely eliminate bias while maintaining meaningful relationships between vectors in word embeddings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bias in Word Embeddings has been a subject of recent interest, along with
efforts for its reduction. Current approaches show promising progress towards
debiasing single bias dimensions such as gender or race. In this paper, we
present a joint multiclass debiasing approach that is capable of debiasing
multiple bias dimensions simultaneously. In that direction, we present two
approaches, HardWEAT and SoftWEAT, that aim to reduce biases by minimizing the
scores of the Word Embeddings Association Test (WEAT). We demonstrate the
viability of our methods by debiasing Word Embeddings on three classes of
biases (religion, gender and race) in three different publicly available word
embeddings and show that our concepts can reduce or even completely
eliminate bias, while maintaining meaningful relationships between vectors in
word embeddings. Our work strengthens the foundation for more unbiased neural
representations of textual data.
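Since HardWEAT and SoftWEAT both work by driving WEAT scores down, a minimal sketch of the standard WEAT effect size (as defined by Caliskan et al., 2017) may help make the objective concrete. The random vectors below are illustrative stand-ins, not the embeddings or word sets used in the paper:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus to set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size: difference of the mean associations of the two
    target sets, normalized by the std. dev. over all target words."""
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

# Toy example: random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
X, Y, A, B = (list(rng.normal(size=(4, 50))) for _ in range(4))
print(weat_effect_size(X, Y, A, B))  # debiasing drives this toward 0
```

Debiasing in the sense of the paper means transforming the embedding so that this effect size approaches zero simultaneously for the target/attribute sets of every bias class (religion, gender, race).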
Related papers
- Debiasing Sentence Embedders through Contrastive Word Pairs
We explore an approach to remove linear and nonlinear bias information for NLP solutions.
We compare our approach to common debiasing methods on classical bias metrics and on bias metrics which take nonlinear information into account.
arXiv Detail & Related papers (2024-03-27T13:34:59Z)
- What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations
Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond?
We study this research question by probing contextualized embeddings and exploring whether this bias is encoded in their latent representations.
We propose a logistic Bradley-Terry probe which predicts word pair preferences of LLMs from the words' hidden vectors.
arXiv Detail & Related papers (2023-11-30T18:53:13Z)
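The logistic Bradley-Terry probe mentioned above can be pictured as fitting P(word i preferred over word j) = sigmoid(w · (h_i - h_j)) on the words' hidden vectors. A minimal sketch with synthetic stand-in data; the dimensions, pair labels, and vectors are assumptions for illustration, not the paper's setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hidden vectors h_i, h_j for word pairs, plus a label
# saying whether the language model "prefers" word i over word j.
rng = np.random.default_rng(0)
n_pairs, dim = 200, 64
h_i = rng.normal(size=(n_pairs, dim))
h_j = rng.normal(size=(n_pairs, dim))
prefers_i = rng.integers(0, 2, size=n_pairs)

# Bradley-Terry with a logistic link: P(i > j) = sigmoid(w . (h_i - h_j)).
# Fitting a logistic regression on the vector difference (no intercept)
# implements exactly this pairwise-preference model.
probe = LogisticRegression(fit_intercept=False)
probe.fit(h_i - h_j, prefers_i)
print(probe.predict_proba((h_i - h_j)[:3])[:, 1])  # estimated P(i > j)
```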
- Detecting and Mitigating Indirect Stereotypes in Word Embeddings
Societal biases in the usage of words, including harmful stereotypes, are frequently learned by common word embedding methods.
We propose a novel method called Biased Indirect Relationship Modification (BIRM) to mitigate indirect bias in distributional word embeddings.
arXiv Detail & Related papers (2023-05-23T23:23:49Z)
- The SAME score: Improved cosine based bias score for word embeddings
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and of identifying potential causes of social bias in downstream tasks.
arXiv Detail & Related papers (2022-03-28T09:28:13Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- "Thy algorithm shalt not bear false witness": An Evaluation of Multiclass Debiasing Methods on Word Embeddings
The paper investigates the state-of-the-art multiclass debiasing techniques: Hard debiasing, SoftWEAT debiasing and Conceptor debiasing.
It evaluates their performance in removing religious bias on a common basis, quantifying bias removal via the Word Embedding Association Test (WEAT), Mean Average Cosine Similarity (MAC), and the Relative Negative Sentiment Bias (RNSB).
arXiv Detail & Related papers (2020-10-30T12:49:39Z)
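Of the three metrics above, Mean Average Cosine Similarity (MAC) is simple enough to sketch: it averages, over every target word and every attribute set, the mean cosine distance between the target and the set's words (following Manzini et al., 2019). The inputs below are hypothetical stand-ins:

```python
import numpy as np

def cosine_distance(u, v):
    """Cosine distance: 1 minus cosine similarity."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def mac(targets, attribute_sets):
    """MAC: mean over all (target word, attribute set) pairs of the
    average cosine distance between the target and the set's words."""
    scores = [np.mean([cosine_distance(t, a) for a in A])
              for t in targets for A in attribute_sets]
    return float(np.mean(scores))

# Hypothetical usage with random stand-in vectors.
rng = np.random.default_rng(0)
targets = list(rng.normal(size=(3, 50)))
attribute_sets = [list(rng.normal(size=(4, 50))) for _ in range(2)]
print(mac(targets, attribute_sets))
```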
- Towards Debiasing Sentence Representations
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
arXiv Detail & Related papers (2020-07-16T04:22:30Z)
- Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases
State-of-the-art neural language models generate dynamic word embeddings dependent on the context in which the word appears.
We introduce the Contextualized Embedding Association Test (CEAT), which can summarize the magnitude of overall bias in neural language models.
We develop two methods, Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD), to automatically identify the intersectional biases and emergent intersectional biases from static word embeddings.
arXiv Detail & Related papers (2020-06-06T19:49:50Z)
- Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings
Existing post-processing methods for debiasing word embeddings are unable to mitigate gender bias hidden in the spatial arrangement of word vectors.
We propose RAN-Debias, a novel gender debiasing methodology which not only eliminates the bias present in a word vector but also alters the spatial distribution of its neighbouring vectors.
We also propose a new bias evaluation metric, the Gender-based Illicit Proximity Estimate (GIPE).
arXiv Detail & Related papers (2020-06-02T20:50:43Z)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z)
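Double-Hard Debias extends the classic hard-debias step, in which a gender direction is inferred and projected out of each word vector. That projection is easy to sketch; the seed pair and tiny three-dimensional vectors below are toy assumptions, and the paper's actual contribution (purifying embeddings against frequency-related corpus regularities before this step) is not shown:

```python
import numpy as np

def remove_direction(w, g):
    """Project embedding w onto the complement of unit direction g,
    i.e. w' = w - (w . g) g: the core hard-debias neutralization step."""
    g = g / np.linalg.norm(g)
    return w - np.dot(w, g) * g

# Hypothetical: infer a gender direction from a seed pair and neutralize.
emb = {"he": np.array([0.8, 0.1, 0.3]),
       "she": np.array([-0.7, 0.2, 0.25]),
       "surgeon": np.array([0.4, 0.5, 0.1])}
g = emb["he"] - emb["she"]
debiased = remove_direction(emb["surgeon"], g)
print(np.dot(debiased, g / np.linalg.norm(g)))  # ~0: gender component removed
```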
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.