Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic
Information Preserving
- URL: http://arxiv.org/abs/2112.05194v1
- Date: Thu, 9 Dec 2021 19:57:22 GMT
- Title: Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic
Information Preserving
- Authors: Lei Ding, Dengdeng Yu, Jinhan Xie, Wenxing Guo, Shenggang Hu, Meichen
Liu, Linglong Kong, Hongsheng Dai, Yanchun Bao, Bei Jiang
- Abstract summary: We propose a novel methodology that leverages a causal inference framework to effectively remove gender bias.
Our comprehensive experiments show that the proposed method achieves state-of-the-art results in gender-debiasing tasks.
- Score: 3.114945725130788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With widening deployments of natural language processing (NLP) in daily life,
the social biases inherited by NLP models have become more severe and
problematic. Previous studies have shown that word embeddings trained on
human-generated corpora have strong gender biases that can produce
discriminatory results in downstream tasks. Previous debiasing methods focus
mainly on modeling bias and only implicitly consider semantic information, while
completely overlooking the complex underlying causal structure between the bias and
semantic components. To address these issues, we propose a novel methodology
that leverages a causal inference framework to effectively remove gender bias.
The proposed method allows us to construct and analyze the complex causal
mechanisms facilitating gender information flow while retaining oracle semantic
information within word embeddings. Our comprehensive experiments show that the
proposed method achieves state-of-the-art results in gender-debiasing tasks. In
addition, our methods yield better performance in word similarity evaluation
and various extrinsic downstream NLP tasks.
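The abstract describes the causal framework only at a high level. The sketch below illustrates the general flavor of the idea (predict the gender-driven component of each embedding from gender-definitional words, then subtract it); the function name, ridge penalty, and index choices are illustrative assumptions, not the authors' exact algorithm.
```python
# Illustrative only: ridge-regress every embedding on the embeddings of
# gender-definitional words (e.g. "he", "she") and remove the predicted
# component, keeping the residual as the "semantic" part.
import numpy as np

def regress_out_gender(E, gender_idx, alpha=1.0):
    """E: (V, d) embedding matrix; gender_idx: rows of gender-definitional
    words; alpha: ridge penalty (hypothetical default)."""
    G = E[gender_idx]                                  # (g, d) gender block
    gram = G @ G.T + alpha * np.eye(len(gender_idx))   # (g, g) ridge Gram
    W = np.linalg.solve(gram, G @ E.T)                 # (g, V) coefficients
    return E - (G.T @ W).T                             # subtract predicted part

# Toy usage with random vectors standing in for trained embeddings.
rng = np.random.default_rng(0)
E = rng.normal(size=(1000, 50))
E_debiased = regress_out_gender(E, gender_idx=[0, 1])  # rows for "he", "she"
```
Subtracting the ridge-predicted component removes a regularized version of the span of the gender-definitional vectors; the residual is what the regression treats as gender-independent semantic information.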
Related papers
- The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [58.130894823145205]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias.
Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning.
We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z)
- Locating and Mitigating Gender Bias in Large Language Models [40.78150878350479]
Large language models (LLMs) are pre-trained on extensive corpora to learn facts and human cognition, both of which carry human preferences.
This process can inadvertently lead to these models acquiring biases and prevalent stereotypes in society.
We propose the LSDM (Least Square Debias Method), a knowledge-editing based method for mitigating gender bias in occupational pronouns.
arXiv Detail & Related papers (2024-03-21T13:57:43Z)
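The LSDM entry above names only a least-squares, knowledge-editing style method. As a loose sketch of what such an edit can look like (a generic ridge-damped least-squares weight update, not necessarily LSDM's actual formulation), with `K`, `V`, and `lam` as assumed names:
```python
import numpy as np

def least_squares_edit(W, K, V, lam=1e-2):
    """Update W so that W_new @ K is close to V in the least-squares sense
    while staying near the original W. Shapes: W (d_out, d_in),
    K (d_in, n) key vectors for the edited contexts, V (d_out, n)
    debiased target activations. All names are illustrative."""
    R = V - W @ K                            # residual the edit must explain
    delta = R @ K.T @ np.linalg.inv(K @ K.T + lam * np.eye(K.shape[0]))
    return W + delta                         # ridge-damped closed-form update
```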
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
arXiv Detail & Related papers (2023-02-11T12:11:03Z)
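Counter-GAP's pairs are built by counterfactual generation. A toy, word-level version of the gender-swap step is sketched below; the pronoun table and helper are simplifications, and a real pipeline must also handle names and genuinely ambiguous pronouns:
```python
# Toy counterfactual generator in the spirit of minimally distant pairs.
SWAP = {"he": "she", "she": "he", "his": "her", "him": "her",
        "her": "his",  # ambiguous (possessive vs. objective); real systems disambiguate
        "himself": "herself", "herself": "himself"}

def gender_counterfactual(sentence: str) -> str:
    out = []
    for tok in sentence.split():
        core = tok.strip(".,!?").lower()
        swapped = SWAP.get(core)
        if swapped is None:
            out.append(tok)
            continue
        if tok[0].isupper():                   # preserve capitalization
            swapped = swapped.capitalize()
        out.append(swapped + tok[len(core):])  # preserve trailing punctuation
    return " ".join(out)

print(gender_counterfactual("He said the nurse gave him her notes."))
# -> "She said the nurse gave her his notes."
```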
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
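The reweighting idea above can be sketched as inverse-frequency weights over (label, group) cells so that, after reweighting, the target label is decorrelated from author demographics; the helper below is an assumed version of the general recipe, not the paper's exact scheme:
```python
from collections import Counter

def balancing_weights(labels, groups):
    """Weights that give every observed (label, group) cell equal total mass,
    breaking the correlation between target labels and author groups."""
    n = len(labels)
    cell = Counter(zip(labels, groups))
    n_cells = len(set(labels)) * len(set(groups))
    return [n / (n_cells * cell[(y, g)]) for y, g in zip(labels, groups)]

# "neg" examples written by group "f" are over-represented, so they are
# down-weighted relative to the rarer cells.
print(balancing_weights(["pos", "pos", "neg", "neg"], ["f", "m", "f", "f"]))
# -> [1.0, 1.0, 0.5, 0.5]
```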
- Interpretable bias mitigation for textual data: Reducing gender bias in patient notes while maintaining classification performance [0.11545092788508224]
We identify and remove gendered language from two clinical-note datasets.
We show minimal degradation in health condition classification tasks for low to medium levels of bias removal via data augmentation.
This work outlines an interpretable approach for using data augmentation to identify and reduce the potential for bias in natural language processing pipelines.
arXiv Detail & Related papers (2021-03-10T03:09:30Z)
- Fair Embedding Engine: A Library for Analyzing and Mitigating Gender Bias in Word Embeddings [16.49645205111334]
Non-contextual word embedding models have been shown to inherit human-like stereotypical biases of gender, race and religion from the training corpora.
This paper describes Fair Embedding Engine (FEE), a library for analysing and mitigating gender bias in word embeddings.
arXiv Detail & Related papers (2020-10-25T17:31:12Z)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z)
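For context, the classic one-direction "hard debias" step that Double-Hard Debias refines can be sketched as below: estimate a gender direction from definitional pairs and project it out of every vector. The "double" part (removing dominant corpus-frequency components before inferring the direction) is omitted, and the SVD shortcut is a simplification of the usual pair-centered PCA:
```python
import numpy as np

def gender_direction(E, pairs):
    """Dominant direction of differences between gender-definitional pairs,
    e.g. pairs = [(i_he, i_she), (i_man, i_woman)]; rows of E are vectors."""
    diffs = np.stack([E[a] - E[b] for a, b in pairs])
    return np.linalg.svd(diffs, full_matrices=False)[2][0]  # top right-singular vector

def hard_debias(E, g):
    """Remove the component along unit vector g from every embedding."""
    g = g / np.linalg.norm(g)
    return E - np.outer(E @ g, g)
```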
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender-biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large-scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
- Neutralizing Gender Bias in Word Embedding with Latent Disentanglement and Counterfactual Generation [25.060917870666803]
We introduce a siamese auto-encoder structure with an adapted gradient reversal layer.
Our structure enables the separation of the semantic latent information and gender latent information of a given word into disjoint latent dimensions.
arXiv Detail & Related papers (2020-04-07T05:16:48Z)
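The "adapted gradient reversal layer" above builds on a standard construction: identity on the forward pass, negated (scaled) gradient on the backward pass, so the encoder learns to strip the information an auxiliary gender classifier could exploit. A minimal PyTorch sketch follows, with `lam` as an assumed scaling parameter rather than the paper's exact configuration:
```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity forward; multiplies the gradient by -lam on the way back."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # no gradient for lam

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```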
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.