Does Debiasing Inevitably Degrade the Model Performance
- URL: http://arxiv.org/abs/2211.07350v2
- Date: Mon, 12 Jun 2023 13:26:12 GMT
- Title: Does Debiasing Inevitably Degrade the Model Performance
- Authors: Yiran Liu, Xiao Liu, Haotian Chen and Yang Yu
- Abstract summary: We propose a theoretical framework explaining the three candidate mechanisms of the language model's gender bias.
We also discover a pathway through which debiasing will not degrade the model performance.
- Score: 8.20550078248207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gender bias in language models has attracted considerable attention
because it threatens social justice. However, most current debiasing methods
degrade the model's performance on other tasks, and the degradation mechanism
remains poorly understood. We propose a theoretical framework explaining the
three candidate mechanisms of the language model's gender bias. We use our
theoretical framework to explain why the current debiasing methods cause
performance degradation. We also discover a pathway through which debiasing
will not degrade the model performance. We further develop a
causality-detection fine-tuning approach to correct gender bias. Numerical
experiments demonstrate that our method yields a double dividend:
partially mitigating gender bias while avoiding performance degradation.
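The abstract does not spell out the method's details, but the bias it targets is easy to demonstrate. Below is a minimal probe (not the authors' causality-detection approach) that surfaces occupational gender bias in an off-the-shelf masked language model, assuming the `transformers` package and the `bert-base-uncased` checkpoint:

```python
# Illustrative probe of occupational gender bias in a masked language model.
# This is NOT the paper's method; it only shows the kind of bias discussed.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for occupation in ["nurse", "engineer"]:
    preds = fill(f"[MASK] works as a {occupation}.", targets=["he", "she"])
    scores = {p["token_str"]: p["score"] for p in preds}
    print(occupation, scores)
# A biased model typically scores "she" higher for "nurse" and
# "he" higher for "engineer".
```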
Related papers
- DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and
Bias [13.928591341824248]
Debiasing techniques have been proposed to mitigate the gender bias that is prevalent in pretrained language models.
These are often evaluated on datasets that check the extent to which the model is gender-neutral in its predictions.
This evaluation protocol overlooks the possible adverse impact of bias mitigation on useful gender knowledge.
arXiv Detail & Related papers (2023-10-22T15:27:16Z) - Detecting and Mitigating Algorithmic Bias in Binary Classification using
Causal Modeling [0.0]
We show that gender bias in the prediction model is statistically significant at the 0.05 level.
We demonstrate the effectiveness of the causal model in mitigating gender bias by cross-validation.
Our novel approach is intuitive, easy-to-use, and can be implemented using existing statistical software tools such as "lavaan" in R.
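The paper points to "lavaan" in R; as a rough Python analogue, the sketch below tests whether a gender coefficient in a logistic regression is significant at the 0.05 level. The data here are synthetic and the model is ordinary logistic regression rather than the paper's causal model:

```python
# Hedged Python analogue of a gender-bias significance test on synthetic data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
gender = rng.integers(0, 2, n)   # 0 = female, 1 = male (synthetic attribute)
skill = rng.normal(size=n)       # legitimate predictor
# Synthetic labels with a small direct effect of gender (the injected bias).
logits = 1.2 * skill + 0.4 * gender - 0.2
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X = sm.add_constant(np.column_stack([skill, gender]))
fit = sm.Logit(y, X).fit(disp=0)
print("p-value for gender coefficient:", fit.pvalues[2])
# A p-value below 0.05 indicates statistically significant gender bias.
```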
arXiv Detail & Related papers (2023-10-19T02:21:04Z) - The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - Language Models Get a Gender Makeover: Mitigating Gender Bias with
Few-Shot Data Interventions [50.67412723291881]
Societal biases present in pre-trained large language models are a critical issue.
We propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models.
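One simple data intervention of this kind is counterfactual augmentation: duplicate training sentences with gendered terms swapped. The paper's exact interventions may differ; the word list below is illustrative and intentionally tiny:

```python
# A minimal sketch of counterfactual gender-swap augmentation.
# Simplified: ignores capitalization, punctuation on swapped tokens,
# and ambiguous mappings such as "her" -> "him" vs "his".
SWAPS = {"he": "she", "she": "he", "him": "her", "her": "him",
         "man": "woman", "woman": "man"}

def gender_swap(sentence: str) -> str:
    """Return a counterfactual copy with gendered words flipped."""
    out = []
    for tok in sentence.split():
        bare = tok.strip(".,!?").lower()
        out.append(SWAPS.get(bare, tok))
    return " ".join(out)

data = ["He is a doctor.", "She stayed home."]
augmented = data + [gender_swap(s) for s in data]
print(augmented)  # fine-tune on the balanced, augmented set
```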
arXiv Detail & Related papers (2023-06-07T16:50:03Z) - The Birth of Bias: A case study on the evolution of gender bias in an
English language model [1.6344851071810076]
We use a relatively small language model with an LSTM architecture, trained on an English Wikipedia corpus.
We find that the representation of gender is dynamic and identify different phases during training.
We show that gender information is represented increasingly locally in the input embeddings of the model.
arXiv Detail & Related papers (2022-07-21T00:59:04Z) - Gender Biases and Where to Find Them: Exploring Gender Bias in
Pre-Trained Transformer-based Language Models Using Movement Pruning [32.62430731115707]
We present a novel framework for inspecting bias in transformer-based language models via movement pruning.
We implement our framework by pruning the model while fine-tuning it on the debiasing objective.
We re-discover a bias-performance trade-off: the better the model performs, the more bias it contains.
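As a concrete illustration, the sketch below applies movement pruning to a single linear layer: importance scores accumulate how strongly fine-tuning pushes each weight away from zero, and low-scoring weights are masked. The loss here is a stand-in, not the paper's debiasing objective:

```python
# Minimal movement-pruning sketch on one linear layer (the paper prunes a
# full transformer while fine-tuning on a debiasing objective).
import torch

layer = torch.nn.Linear(8, 4)
scores = torch.zeros_like(layer.weight)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)

for _ in range(50):                        # dummy fine-tuning loop
    x = torch.randn(16, 8)
    loss = (layer(x) - x[:, :4]).pow(2).mean()  # stand-in objective
    opt.zero_grad()
    loss.backward()
    # Movement score: -grad * weight accumulates how much each weight
    # is being pushed away from zero by the objective.
    scores += -layer.weight.grad * layer.weight.detach()
    opt.step()

# Keep the top 50% of weights by movement score, zero out the rest.
k = scores.numel() // 2
threshold = scores.flatten().kthvalue(k).values
mask = (scores > threshold).float()
layer.weight.data *= mask
print("remaining weights:", int(mask.sum().item()), "/", mask.numel())
```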
arXiv Detail & Related papers (2022-07-06T06:20:35Z) - Mitigating Gender Bias in Distilled Language Models via Counterfactual
Role Reversal [74.52580517012832]
Language models can be biased in multiple ways, including the unfounded association of male and female genders with gender-neutral professions.
We present a novel approach to mitigate gender disparity based on counterfactual role reversal.
We observe that reducing gender polarity in language generation does not necessarily improve embedding fairness or downstream classification fairness.
arXiv Detail & Related papers (2022-03-23T17:34:35Z) - Improving Gender Fairness of Pre-Trained Language Models without
Catastrophic Forgetting [88.83117372793737]
Forgetting information in the original training data may damage the model's downstream performance by a large margin.
We propose GEnder Equality Prompt (GEEP) to improve gender fairness of pre-trained models with less forgetting.
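The core idea, keeping the pretrained weights frozen so nothing is forgotten while training only a few new prompt parameters, can be sketched as follows. This is a simplified stand-in for GEEP, not the paper's exact setup:

```python
# Hedged sketch of prompt-tuning with a frozen backbone: only the new
# prompt embeddings receive gradients, so pretrained knowledge is preserved.
import torch

class PromptTunedLM(torch.nn.Module):
    def __init__(self, pretrained: torch.nn.Module, hidden: int, n_prompts: int = 8):
        super().__init__()
        self.backbone = pretrained
        for p in self.backbone.parameters():
            p.requires_grad = False        # freeze: nothing is forgotten
        # Only these new embeddings are trained on the debiasing data.
        self.prompts = torch.nn.Parameter(torch.randn(n_prompts, hidden) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        batch = input_embeds.size(0)
        prefix = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        return self.backbone(torch.cat([prefix, input_embeds], dim=1))

backbone = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2)
model = PromptTunedLM(backbone, hidden=32)
out = model(torch.randn(2, 10, 32))        # (2, 18, 32): prompts + tokens
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print("trainable params:", trainable)      # only the prompt matrix
```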
arXiv Detail & Related papers (2021-10-11T15:52:16Z) - Stereotype and Skew: Quantifying Gender Bias in Pre-trained and
Fine-tuned Language Models [5.378664454650768]
This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models.
We find evidence that gender stereotype is approximately negatively correlated with gender skew in out-of-the-box models, suggesting a trade-off between these two forms of bias.
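The exact metric definitions are in the paper; the sketch below uses assumed simplified versions, where `pref` is a hypothetical per-occupation log-odds of a male over a female pronoun and `stereo_dir` marks the stereotypical direction:

```python
# Assumed, simplified skew/stereotype metrics; see the paper for the
# actual definitions. All numbers here are hypothetical.
import numpy as np

pref = np.array([1.2, -0.8, 0.5, -1.5, 0.9])   # model's male-vs-female log-odds
stereo_dir = np.array([1, -1, 1, -1, 1])       # +1 stereotypically male, -1 female

skew = float(np.mean(pref))                    # overall preference for one gender
stereotype = float(np.mean(pref * stereo_dir)) # alignment with stereotypes
print(f"skew={skew:.2f}, stereotype={stereotype:.2f}")
```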
arXiv Detail & Related papers (2021-01-24T10:57:59Z) - Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z) - Mitigating Gender Bias Amplification in Distribution by Posterior
Regularization [75.3529537096899]
We investigate the gender bias amplification issue from the distribution perspective.
We propose a bias mitigation approach based on posterior regularization.
Our study sheds light on understanding bias amplification.
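Posterior regularization typically means projecting the model's predicted distribution onto a constraint set via a KL projection, which has the closed form q ∝ p·exp(−λf). The sketch below, with illustrative variable names and a constraint chosen for this example (matching a target gender ratio), solves for the multiplier by bisection:

```python
# Minimal posterior-regularization sketch: project posterior p onto the set
# of distributions whose expected "male" probability equals a target b.
import numpy as np

def project(p: np.ndarray, f: np.ndarray, b: float) -> np.ndarray:
    """argmin_q KL(q||p) subject to E_q[f] = b, via q ~ p * exp(-lam * f)."""
    def expectation(lam: float) -> float:
        q = p * np.exp(-lam * f)
        q /= q.sum()
        return float(q @ f)
    lo, hi = -50.0, 50.0
    for _ in range(100):               # bisection on the Lagrange multiplier
        mid = (lo + hi) / 2
        if expectation(mid) > b:       # expectation decreases in lam
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    q = p * np.exp(-lam * f)
    return q / q.sum()

p = np.array([0.8, 0.2])   # amplified posterior: P(male), P(female)
f = np.array([1.0, 0.0])   # f = 1 for the "male" label
print(project(p, f, b=0.6))  # regularized toward 60%: ~[0.6, 0.4]
```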
arXiv Detail & Related papers (2020-05-13T11:07:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.