Marked Attribute Bias in Natural Language Inference
- URL: http://arxiv.org/abs/2109.14039v1
- Date: Tue, 28 Sep 2021 20:45:02 GMT
- Title: Marked Attribute Bias in Natural Language Inference
- Authors: Hillary Dawkins
- Abstract summary: We present a new observation of gender bias in a downstream NLP application: marked attribute bias in natural language inference.
Bias in downstream applications can stem from training data or word embeddings, or be amplified by the model in use.
Here we seek to understand how the intrinsic properties of word embeddings contribute to this observed marked attribute effect.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reporting and providing test sets for harmful bias in NLP applications is
essential for building a robust understanding of the current problem. We
present a new observation of gender bias in a downstream NLP application:
marked attribute bias in natural language inference. Bias in downstream
applications can stem from training data or word embeddings, or be amplified by
the model in use. However, focusing on biased word embeddings is potentially
the most impactful first step due to their universal nature. Here we seek to
understand how the intrinsic properties of word embeddings contribute to this
observed marked attribute effect, and whether current post-processing methods
address the bias successfully. An investigation of the current debiasing
landscape reveals two open problems: none of the current debiased embeddings
mitigate the marked attribute error, and none of the intrinsic bias measures
are predictive of the marked attribute effect. By noticing that a new type of
intrinsic bias measure correlates meaningfully with the marked attribute
effect, we propose a new post-processing debiasing scheme for static word
embeddings. The proposed method applied to existing embeddings achieves new
best results on the marked attribute bias test set. See
https://github.com/hillary-dawkins/MAB.
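The abstract does not spell out the proposed post-processing scheme (see the linked repository for that). For orientation only, below is a minimal sketch of the standard projection-based ("hard debiasing") post-processing of static word embeddings in the style of Bolukbasi et al.; the definitional word pairs, the `load_glove` loader, and the function names are illustrative assumptions, not the paper's method.

```python
import numpy as np

# Hypothetical definitional pairs; the lists used in the literature are longer.
DEFINITIONAL_PAIRS = [("he", "she"), ("man", "woman"), ("him", "her")]

def gender_direction(emb, pairs=DEFINITIONAL_PAIRS):
    """Estimate a gender direction from definitional word pairs.

    `emb` is assumed to be a dict mapping words to 1-D numpy arrays,
    e.g. loaded from GloVe or word2vec.
    """
    diffs = np.stack([emb[a] - emb[b] for a, b in pairs if a in emb and b in emb])
    # The top right singular vector of the difference matrix approximates the bias direction.
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[0]

def neutralize(vec, direction):
    """Project out the component of `vec` lying along the bias direction."""
    d = direction / np.linalg.norm(direction)
    return vec - np.dot(vec, d) * d

# Usage sketch: debias all non-definitional words in place.
# emb = load_glove("glove.6B.300d.txt")   # hypothetical loader
# g = gender_direction(emb)
# gendered = {w for pair in DEFINITIONAL_PAIRS for w in pair}
# for w in emb:
#     if w not in gendered:
#         emb[w] = neutralize(emb[w], g)
```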
Related papers
- Unlabeled Debiasing in Downstream Tasks via Class-wise Low Variance Regularization [13.773597081543185]
We introduce a novel debiasing regularization technique based on the class-wise variance of embeddings.
Our method does not require attribute labels and targets any attribute, thus addressing the shortcomings of existing debiasing methods.
arXiv Detail & Related papers (2024-09-29T03:56:50Z)
- Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair [36.221761997349795]
Deep neural networks rely on bias attributes that are spuriously correlated with a target class in the presence of dataset bias.
This paper proposes a method that provides the model with explicit spatial guidance that indicates the region of intrinsic features.
Experiments demonstrate that our method achieves state-of-the-art performance on synthetic and real-world datasets with various levels of bias severity.
arXiv Detail & Related papers (2024-04-30T04:13:14Z)
- Projective Methods for Mitigating Gender Bias in Pre-trained Language Models [10.418595661963062]
Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters.
We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated.
arXiv Detail & Related papers (2024-03-27T17:49:31Z)
- Debiasing Sentence Embedders through Contrastive Word Pairs [46.9044612783003]
We explore an approach to remove linear and nonlinear bias information for NLP solutions.
We compare our approach to common debiasing methods on classical bias metrics and on bias metrics which take nonlinear information into account.
arXiv Detail & Related papers (2024-03-27T13:34:59Z)
- Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction [56.17020601803071]
Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction.
This paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias.
arXiv Detail & Related papers (2024-03-15T02:04:35Z)
- SABAF: Removing Strong Attribute Bias from Neural Networks with Adversarial Filtering [20.7209867191915]
We propose a new method for removing attribute bias in neural networks.
The proposed method achieves state-of-the-art performance in both strong and moderate bias settings.
arXiv Detail & Related papers (2023-11-13T08:13:55Z)
- Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework applicable to various graph neural networks (GNNs).
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference [20.112129592923246]
We focus on an overlooked aspect of the overlap bias in NLI models: the reverse word-overlap bias.
Current NLI models are highly biased towards the non-entailment label on instances with low overlap.
We investigate the reasons for the emergence of the overlap bias and the role of minority examples in its mitigation.
arXiv Detail & Related papers (2022-11-07T21:02:23Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings [47.721931801603105]
We propose OSCaR, a bias-mitigating method that focuses on disentangling biased associations between concepts instead of removing concepts wholesale.
Our experiments on gender biases show that OSCaR is a well-balanced approach that ensures that semantic information is retained in the embeddings and bias is also effectively mitigated.
arXiv Detail & Related papers (2020-06-30T18:18:13Z)