How Far Can It Go?: On Intrinsic Gender Bias Mitigation for Text Classification
- URL: http://arxiv.org/abs/2301.12855v1
- Date: Mon, 30 Jan 2023 13:05:48 GMT
- Title: How Far Can It Go?: On Intrinsic Gender Bias Mitigation for Text Classification
- Authors: Ewoenam Tokpo, Pieter Delobelle, Bettina Berendt and Toon Calders
- Abstract summary: We investigate the effects that some of the major intrinsic gender bias mitigation strategies have on downstream text classification tasks.
We show that each mitigation technique is able to hide the bias from some of the intrinsic bias measures but not all.
We recommend that intrinsic bias mitigation techniques should be combined with other fairness interventions for downstream tasks.
- Score: 12.165921897192902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To mitigate gender bias in contextualized language models, different
intrinsic mitigation strategies have been proposed, alongside many bias
metrics. Considering that the end use of these language models is for
downstream tasks like text classification, it is important to understand how
these intrinsic bias mitigation strategies actually translate to fairness in
downstream tasks, and to what extent. In this work, we design a probe to
investigate the effects that some of the major intrinsic gender bias mitigation
strategies have on downstream text classification tasks. We discover that
instead of resolving gender bias, intrinsic mitigation techniques and metrics
are able to hide it in such a way that significant gender information is
retained in the embeddings. Furthermore, we show that each mitigation technique
is able to hide the bias from some of the intrinsic bias measures but not all,
and each intrinsic bias measure can be fooled by some mitigation techniques,
but not all. We confirm experimentally that none of the intrinsic mitigation
techniques, when used without any other fairness intervention, is able to
consistently impact extrinsic bias. We recommend that intrinsic bias mitigation
should be combined with other fairness interventions for downstream tasks.
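The probing idea behind the paper's finding can be sketched simply: train a small classifier on frozen embeddings to predict gender; if it stays well above chance after mitigation, gender information was hidden rather than removed. Below is a minimal, hedged sketch on synthetic data (the embedding dimensions, sample counts, and the injected "gender direction" are illustrative assumptions, not the paper's actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for sentence embeddings: 200 samples, 16 dims,
# with gender information injected along dimension 0.
n, d = 200, 16
labels = rng.integers(0, 2, size=n).astype(float)   # 0/1 gender label
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * (labels - 0.5)                     # class means at -1 / +1

def train_probe(X, y, lr=0.1, steps=500):
    """Logistic-regression probe trained with plain gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))      # sigmoid predictions
        w -= lr * (X.T @ (p - y)) / len(y)          # gradient of log-loss
        b -= lr * np.mean(p - y)
    return w, b

w, b = train_probe(X, labels)
acc = np.mean(((X @ w + b) > 0) == labels)
print(f"probe accuracy: {acc:.2f}")  # well above 0.5 -> gender info retained
```

In a real experiment, `X` would be the (debiased) contextual embeddings and the probe's above-chance accuracy would be the evidence that mitigation hid, rather than erased, the gender signal.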
Related papers
- Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation [19.06428714669272]
We systematically test how methods for intrinsic debiasing affect neural machine translation models.
We highlight three challenges and mismatches between the debiasing techniques and their end-goal usage.
arXiv Detail & Related papers (2024-06-02T15:57:29Z)
- How to be fair? A study of label and selection bias [3.018638214344819]
It is widely accepted that biased data leads to biased and potentially unfair models.
Several measures for bias in data and model predictions have been proposed, as well as bias mitigation techniques.
Despite the myriad of mitigation techniques developed in the past decade, it is still poorly understood under what circumstances which methods work.
arXiv Detail & Related papers (2024-03-21T10:43:55Z)
- Explaining Knock-on Effects of Bias Mitigation [13.46387356280467]
In machine learning systems, bias mitigation approaches aim to make outcomes fairer across privileged and unprivileged groups.
In this paper, we aim to characterise impacted cohorts when mitigation interventions are applied.
We examine a range of bias mitigation strategies that work at various stages of the model life cycle.
We show that all tested mitigation strategies negatively impact a non-trivial fraction of cases, i.e., people who receive unfavourable outcomes solely on account of mitigation efforts.
arXiv Detail & Related papers (2023-12-01T18:40:37Z)
- Do Not Harm Protected Groups in Debiasing Language Representation Models [2.9057513016551244]
Language Representation Models (LRMs) trained with real-world data may capture and exacerbate undesired bias.
We examine four debiasing techniques on a real-world text classification task and show that reducing bias comes at the cost of degraded performance for all demographic groups.
arXiv Detail & Related papers (2023-10-27T20:11:38Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide-range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
arXiv Detail & Related papers (2023-02-11T12:11:03Z)
- Testing Occupational Gender Bias in Language Models: Towards Robust Measurement and Zero-Shot Debiasing [98.07536837448293]
Large language models (LLMs) have been shown to exhibit a variety of harmful, human-like biases against various demographics.
We introduce a list of desiderata for robustly measuring biases in generative language models.
We then use this benchmark to test several state-of-the-art open-source LLMs, including Llama, Mistral, and their instruction-tuned versions.
arXiv Detail & Related papers (2022-12-20T22:41:24Z)
- MABEL: Attenuating Gender Bias using Textual Entailment Data [20.489427903240017]
We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
Key to our approach is the use of a contrastive learning objective on counterfactually augmented, gender-balanced entailment pairs.
We show that MABEL outperforms previous task-agnostic debiasing approaches in terms of fairness.
arXiv Detail & Related papers (2022-10-26T18:36:58Z)
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against the algorithmic bias, which incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
- Towards causal benchmarking of bias in face analysis algorithms [54.19499274513654]
We develop an experimental method for measuring algorithmic bias of face analysis algorithms.
Our proposed method is based on generating synthetic "transects" of matched sample images.
We validate our method by comparing it to a study that employs the traditional observational method for analyzing bias in gender classification algorithms.
arXiv Detail & Related papers (2020-07-13T17:10:34Z)
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [94.98656228690233]
We propose a technique that purifies the word embeddings against corpus regularities prior to inferring and removing the gender subspace.
Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
arXiv Detail & Related papers (2020-05-03T02:33:20Z)
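The hard-debias step that approaches like Double-Hard Debias build on can be sketched as a projection: estimate a gender direction from definitional word pairs, then subtract each embedding's component along that direction. The vectors and pairs below are toy values for illustration only, and this sketch omits the frequency-purification step that Double-Hard Debias adds:

```python
import numpy as np

def gender_direction(pairs):
    """Dominant direction of the (male_vec - female_vec) difference vectors."""
    diffs = np.array([a - b for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs)   # rows of vt are unit right-singular vectors
    return vt[0]

def debias(emb, g):
    """Remove the component of emb along the unit gender direction g."""
    return emb - (emb @ g) * g

# Toy 4-d vectors standing in for word embeddings (illustrative only).
he,  she   = np.array([1.0, 0.0, 0.2, 0.0]), np.array([-1.0, 0.0, 0.2, 0.0])
man, woman = np.array([0.9, 0.1, 0.3, 0.0]), np.array([-0.9, 0.1, 0.3, 0.0])
word = np.array([0.7, 0.3, 0.1, 0.5])

g = gender_direction([(he, she), (man, woman)])
word_db = debias(word, g)
print(abs(word_db @ g))   # ~0: no remaining component along g
```

The projection guarantees zero correlation with the estimated direction, which is exactly why, as the main paper argues, gender information can survive in the remaining subspace even when such a measure reports no bias.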
This list is automatically generated from the titles and abstracts of the papers in this site.