OSCaR: Orthogonal Subspace Correction and Rectification of Biases in
Word Embeddings
- URL: http://arxiv.org/abs/2007.00049v2
- Date: Fri, 10 Sep 2021 22:17:00 GMT
- Title: OSCaR: Orthogonal Subspace Correction and Rectification of Biases in
Word Embeddings
- Authors: Sunipa Dev, Tao Li, Jeff M Phillips, Vivek Srikumar
- Abstract summary: We propose OSCaR, a bias-mitigating method that focuses on disentangling biased associations between concepts instead of removing concepts wholesale.
Our experiments on gender biases show that OSCaR is a well-balanced approach that ensures that semantic information is retained in the embeddings and bias is also effectively mitigated.
- Score: 47.721931801603105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language representations are known to carry stereotypical biases and, as a
result, lead to biased predictions in downstream tasks. While existing methods
are effective at mitigating biases by linear projection, such methods are too
aggressive: they not only remove bias, but also erase valuable information from
word embeddings. We develop new measures for evaluating specific information
retention that demonstrate the tradeoff between bias removal and information
retention. To address this challenge, we propose OSCaR (Orthogonal Subspace
Correction and Rectification), a bias-mitigating method that focuses on
disentangling biased associations between concepts instead of removing concepts
wholesale. Our experiments on gender biases show that OSCaR is a well-balanced
approach that ensures that semantic information is retained in the embeddings
and bias is also effectively mitigated.
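To make the contrast in the abstract concrete, below is a minimal NumPy sketch of the two ideas it mentions: standard linear-projection debiasing (which removes a concept direction entirely) and a rough orthogonal "correction" that decouples two concept directions instead of deleting one. The function names, the toy random vectors, and the Gram-Schmidt correction step are illustrative assumptions for this summary; they are not the paper's released implementation, and OSCaR's actual graded rotation applied to all embeddings is not reproduced here.

```python
import numpy as np

def debias_by_projection(vectors, bias_direction):
    """Remove the component along a (unit) bias direction from every vector.

    This is the linear-projection debiasing the abstract calls too aggressive:
    it erases the whole concept from the embeddings.
    """
    b = bias_direction / np.linalg.norm(bias_direction)
    return vectors - np.outer(vectors @ b, b)

def orthogonalize_concepts(v1, v2):
    """Make a second concept direction orthogonal to the first (Gram-Schmidt).

    A rough stand-in for the "correction" idea: instead of deleting the gender
    direction, decouple another concept (e.g. occupation) from it so the two
    no longer share a component. Illustrative only, not the OSCaR rotation.
    """
    v1 = v1 / np.linalg.norm(v1)
    v2_orth = v2 - (v2 @ v1) * v1
    return v2_orth / np.linalg.norm(v2_orth)

# Toy usage with random stand-in "embeddings" (illustrative only).
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 50))        # 5 word vectors, 50 dimensions
gender_dir = rng.normal(size=50)      # e.g. a he - she difference vector
occupation_dir = rng.normal(size=50)  # e.g. a doctor - nurse difference vector

emb_projected = debias_by_projection(emb, gender_dir)
occupation_orth = orthogonalize_concepts(gender_dir, occupation_dir)

unit_gender = gender_dir / np.linalg.norm(gender_dir)
print(np.allclose(emb_projected @ unit_gender, 0))      # concept fully removed
print(abs(occupation_orth @ unit_gender) < 1e-9)         # concepts decoupled
```

The sketch illustrates the tradeoff the paper measures: projection zeroes out every component along the bias direction, whereas the correction step keeps both concepts in the space but removes their shared component.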
Related papers
- Language-guided Detection and Mitigation of Unknown Dataset Bias [23.299264313976213]
We propose a framework that identifies potential biases as keywords, without prior knowledge, based on their partial occurrence in the captions.
Our framework not only outperforms existing methods that use no prior knowledge, but is also comparable to a method that assumes prior knowledge.
arXiv Detail & Related papers (2024-06-05T03:11:33Z)
- Projective Methods for Mitigating Gender Bias in Pre-trained Language Models [10.418595661963062]
Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters.
We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated.
arXiv Detail & Related papers (2024-03-27T17:49:31Z)
- Is There a One-Model-Fits-All Approach to Information Extraction? Revisiting Task Definition Biases [62.806300074459116]
Definition bias is a negative phenomenon that can mislead models.
We identify two types of definition bias in IE: bias among information extraction datasets and bias between information extraction datasets and instruction tuning datasets.
We propose a multi-stage framework consisting of definition bias measurement, bias-aware fine-tuning, and task-specific bias mitigation.
arXiv Detail & Related papers (2024-03-25T03:19:20Z)
- Causal Disentanglement for Semantics-Aware Intent Learning in Recommendation [30.85573846018658]
We propose an unbiased and semantics-aware disentanglement learning called CaDSI.
CaDSI explicitly models the causal relations underlying recommendation task.
It produces semantics-aware representations by disentangling users' true intents with respect to specific item contexts.
arXiv Detail & Related papers (2022-02-05T15:17:03Z)
- Linear Adversarial Concept Erasure [108.37226654006153]
We formulate the problem of identifying and erasing a linear subspace that corresponds to a given concept.
We show that the method is highly expressive, effectively mitigating bias in deep nonlinear classifiers while maintaining tractability and interpretability.
arXiv Detail & Related papers (2022-01-28T13:00:17Z)
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against the algorithmic bias, which incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
- Marked Attribute Bias in Natural Language Inference [0.0]
We present a new observation of gender bias in a downstream NLP application: marked attribute bias in natural language inference.
Bias in downstream applications can stem from training data, word embeddings, or be amplified by the model in use.
Here we seek to understand how the intrinsic properties of word embeddings contribute to this observed marked attribute effect.
arXiv Detail & Related papers (2021-09-28T20:45:02Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Understanding and Mitigating Annotation Bias in Facial Expression Recognition [3.325054486984015]
Most existing works assume that human-generated annotations can be considered gold-standard and unbiased.
We focus on facial expression recognition and compare the label biases between lab-controlled and in-the-wild datasets.
We propose an AU-Calibrated Facial Expression Recognition framework that utilizes facial action units (AUs) and incorporates the triplet loss into the objective function.
arXiv Detail & Related papers (2021-08-19T05:28:07Z)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z)