Model-based Counterfactual Generator for Gender Bias Mitigation
- URL: http://arxiv.org/abs/2311.03186v1
- Date: Mon, 6 Nov 2023 15:25:30 GMT
- Title: Model-based Counterfactual Generator for Gender Bias Mitigation
- Authors: Ewoenam Kwaku Tokpo, Toon Calders
- Abstract summary: Counterfactual Data Augmentation has been one of the preferred techniques for mitigating gender bias in natural language models.
We highlight some limitations of dictionary-based counterfactual data augmentation techniques.
We propose a model-based solution for generating counterfactuals to mitigate gender bias.
- Score: 8.75682288556859
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual Data Augmentation (CDA) has been one of the preferred
techniques for mitigating gender bias in natural language models. CDA
techniques have mostly employed word substitution based on dictionaries.
Although such dictionary-based CDA techniques have been shown to significantly
improve the mitigation of gender bias, in this paper, we highlight some
limitations of such dictionary-based counterfactual data augmentation
techniques, such as susceptibility to ungrammatical compositions, and lack of
generalization outside the set of predefined dictionary words. Model-based
solutions can alleviate these problems, yet the lack of qualitative parallel
training data hinders development in this direction. Therefore, we propose a
combination of data processing techniques and a bi-objective training regime to
develop a model-based solution for generating counterfactuals to mitigate
gender bias. We implemented our proposed solution and performed an empirical
evaluation which shows how our model alleviates the shortcomings of
dictionary-based solutions.
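As an illustration of the dictionary-based CDA that the paper critiques, here is a minimal sketch of word-pair substitution. The word pairs, the `dictionary_cda` helper, and the failure cases are our own illustrative assumptions, not code or examples from the paper:

```python
# Minimal sketch of dictionary-based CDA (word-pair substitution).
# The pairs below and the helper are illustrative, not the paper's code.
GENDER_PAIRS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",  # "her" is ambiguous: him/his
    "actor": "actress", "actress": "actor",
    "king": "queen", "queen": "king",
}

def dictionary_cda(sentence: str) -> str:
    """Flip gendered words found in the dictionary; leave the rest untouched."""
    out = []
    for tok in sentence.lower().split():
        core = tok.strip(".,!?")
        out.append(tok.replace(core, GENDER_PAIRS[core]) if core in GENDER_PAIRS else tok)
    return " ".join(out)

print(dictionary_cda("He gave his speech."))   # -> "she gave her speech."  (fine)
print(dictionary_cda("I saw her yesterday."))  # -> "i saw his yesterday."  (ungrammatical)
print(dictionary_cda("The surgeon nodded."))   # unchanged: outside the dictionary
```

The second example shows the kind of ungrammatical composition (object pronoun "her" flipped to the possessive "his") and the third the lack of generalization beyond predefined dictionary words that motivate a model-based generator.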
Related papers
- Gender Bias Mitigation for Bangla Classification Tasks [2.6285986998314783]
We investigate gender bias in Bangla pretrained language models.
By altering names and gender-specific terms in existing datasets, we ensured they were suitable for detecting and mitigating gender bias.
arXiv Detail & Related papers (2024-11-16T00:04:45Z)
- FairFlow: An Automated Approach to Model-based Counterfactual Data Augmentation For NLP [7.41244589428771]
This paper proposes FairFlow, an automated approach to generating parallel data for training counterfactual text generator models.
We show that FairFlow significantly overcomes the limitations of dictionary-based word-substitution approaches whilst maintaining good performance.
arXiv Detail & Related papers (2024-07-23T12:29:37Z)
- GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations [1.0000511213628438]
We create a gender-controlled text dataset, GECO, in which otherwise identical sentences appear in male and female forms.
This gives rise to ground-truth 'world explanations' for gender classification tasks.
We also provide GECOBench, a rigorous quantitative evaluation framework benchmarking popular XAI methods.
arXiv Detail & Related papers (2024-06-17T13:44:37Z)
- Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions [50.67412723291881]
Societal biases present in pre-trained large language models are a critical issue.
We propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models.
arXiv Detail & Related papers (2023-06-07T16:50:03Z)
- Enhancing Text Generation with Cooperative Training [23.971227375706327]
Most prevailing methods trained generative and discriminative models in isolation, which left them unable to adapt to changes in each other.
We introduce a self-consistent learning framework in the text field that involves training a discriminator and generator cooperatively in a closed-loop manner.
Our framework is able to mitigate training instabilities such as mode collapse and non-convergence.
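To make the closed loop concrete, here is a deliberately toy sketch of one possible reading of self-consistent generator/discriminator training; the unigram generator, token-ratio discriminator, and acceptance threshold are our stand-ins, not the paper's algorithm:

```python
import random
from collections import Counter

class UnigramGenerator:
    """Toy generator: samples tokens from a fitted unigram distribution."""
    def fit(self, texts):
        self.counts = Counter(tok for t in texts for tok in t)
        self.vocab = list(self.counts)
        self.weights = [self.counts[w] for w in self.vocab]

    def sample(self, n, length=4):
        return [random.choices(self.vocab, self.weights, k=length) for _ in range(n)]

class TokenRatioDiscriminator:
    """Toy discriminator: scores texts by how 'real-leaning' their tokens are."""
    def fit(self, texts, labels):
        self.real = Counter(tok for t, y in zip(texts, labels) if y for tok in t)
        self.fake = Counter(tok for t, y in zip(texts, labels) if not y for tok in t)

    def score(self, text):
        return sum(self.real[tok] >= self.fake[tok] for tok in text) / len(text)

def cooperative_loop(real_texts, rounds=5):
    gen, disc = UnigramGenerator(), TokenRatioDiscriminator()
    gen.fit(real_texts)
    for _ in range(rounds):
        fakes = gen.sample(len(real_texts))
        # Discriminator is refreshed on real vs. generated samples ...
        disc.fit(real_texts + fakes, [1] * len(real_texts) + [0] * len(fakes))
        # ... and the generator re-trains on real data plus the fakes the
        # discriminator accepts, closing the loop instead of training in isolation.
        accepted = [t for t in fakes if disc.score(t) > 0.5]
        gen.fit(real_texts + accepted)
    return gen, disc

real = [["the", "cat", "sat", "down"], ["a", "dog", "ran", "home"]]
gen, disc = cooperative_loop(real)
print(gen.sample(2))
```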
arXiv Detail & Related papers (2023-03-16T04:21:19Z)
- Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
arXiv Detail & Related papers (2023-02-11T12:11:03Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
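For this entry, here is a minimal PyTorch sketch of the stated idea as we read it from the summary: perturb the input with standard Gaussian noise and penalise the shift it causes in a hidden representation. The tiny MLP and the `sigma` and `lam` values are placeholder assumptions, not the paper's architecture or hyperparameters:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 2))
hidden = model[:2]   # representation after the first layer + activation
sigma, lam = 0.1, 1.0  # noise scale and regularizer weight (assumed)

def lnsr_loss(x, y, criterion=nn.CrossEntropyLoss()):
    task = criterion(model(x), y)
    noisy = x + sigma * torch.randn_like(x)                # inject Gaussian noise
    stability = (hidden(x) - hidden(noisy)).pow(2).mean()  # regularize hidden reps
    return task + lam * stability

x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))
lnsr_loss(x, y).backward()
```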
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving [3.114945725130788]
We propose a novel methodology that leverages a causal inference framework to effectively remove gender bias.
Our comprehensive experiments show that the proposed method achieves state-of-the-art results in gender-debiasing tasks.
arXiv Detail & Related papers (2021-12-09T19:57:22Z)
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
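A hedged sketch of how noise entropy regularisation could look, based only on the summary above: fit the task on real data while pushing predictions on pure-noise inputs towards the uniform (maximum-entropy) distribution, so out-of-distribution inputs receive low-confidence outputs. The model and the Gaussian noise scheme are our assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

def noier_loss(x, y, lam=1.0):
    task = F.cross_entropy(model(x), y)
    noise = torch.randn_like(x)                  # synthetic OOD inputs (assumed)
    log_p = F.log_softmax(model(noise), dim=-1)
    entropy = -(log_p.exp() * log_p).sum(-1).mean()
    return task - lam * entropy                  # maximise entropy on noise

x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
noier_loss(x, y).backward()
```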
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
- Learning to Perturb Word Embeddings for Out-of-distribution QA [55.103586220757464]
We propose a simple yet effective data augmentation (DA) method based on a noise generator, which learns to perturb the word embeddings of the input questions and context without changing their semantics.
We validate the performance of QA models trained with our word embedding perturbation on a single source dataset, evaluating them on five different target domains.
Notably, the model trained with ours outperforms the model trained with more than 240K artificially generated QA pairs.
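Based only on the summary, one plausible shape for such a noise generator is sketched below: a small network proposes a bounded perturbation of the token embeddings, and keeping the task loss on the perturbed view encourages the noise to stay label-preserving. All sizes and modules here are invented for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Embedding(1000, 32)
noise_gen = nn.Linear(32, 32)       # predicts a perturbation per token embedding
classifier = nn.Sequential(nn.Flatten(), nn.Linear(10 * 32, 2))

def perturbed_loss(token_ids, y):
    e = embed(token_ids)                       # (batch, seq, dim)
    delta = 0.1 * torch.tanh(noise_gen(e))     # bounded learned perturbation
    clean = F.cross_entropy(classifier(e), y)
    noisy = F.cross_entropy(classifier(e + delta), y)  # keeps noise label-preserving
    return clean + noisy

ids = torch.randint(0, 1000, (4, 10))
y = torch.randint(0, 2, (4,))
perturbed_loss(ids, y).backward()
```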
arXiv Detail & Related papers (2021-05-06T14:12:26Z)
- The Gap on GAP: Tackling the Problem of Differing Data Distributions in Bias-Measuring Datasets [58.53269361115974]
Diagnostic datasets that can detect biased models are an important prerequisite for bias reduction within natural language processing.
However, undesired patterns in the collected data can make such tests incorrect.
We introduce a theoretically grounded method for weighting test samples to cope with such patterns in the test data.
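A toy sketch of the sample-weighting idea, with an invented confound: weighting each test sample inversely to the frequency of its (group, confound) cell makes every cell contribute equally, so the confound no longer drives the measured gap. The paper's actual weighting scheme is theoretically derived and differs in detail:

```python
from collections import Counter

samples = [  # invented bias-test data with a "confound" attribute
    {"group": "female", "confound": "news",  "correct": True},
    {"group": "female", "confound": "news",  "correct": False},
    {"group": "male",   "confound": "news",  "correct": True},
    {"group": "male",   "confound": "sport", "correct": True},
]

# Weight each sample inversely to the size of its (group, confound) cell.
cells = Counter((s["group"], s["confound"]) for s in samples)
for s in samples:
    s["weight"] = 1.0 / cells[(s["group"], s["confound"])]

def weighted_accuracy(group):
    sub = [s for s in samples if s["group"] == group]
    return sum(s["weight"] * s["correct"] for s in sub) / sum(s["weight"] for s in sub)

print(weighted_accuracy("female"), weighted_accuracy("male"))
```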
arXiv Detail & Related papers (2020-11-03T16:50:13Z)