Model-based Counterfactual Generator for Gender Bias Mitigation
- URL: http://arxiv.org/abs/2311.03186v1
- Date: Mon, 6 Nov 2023 15:25:30 GMT
- Title: Model-based Counterfactual Generator for Gender Bias Mitigation
- Authors: Ewoenam Kwaku Tokpo, Toon Calders
- Abstract summary: Counterfactual Data Augmentation has been one of the preferred techniques for mitigating gender bias in natural language models.
We highlight some limitations of dictionary-based counterfactual data augmentation techniques.
We propose a model-based solution for generating counterfactuals to mitigate gender bias.
- Score: 8.75682288556859
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual Data Augmentation (CDA) has been one of the preferred
techniques for mitigating gender bias in natural language models. CDA
techniques have mostly employed word substitution based on dictionaries.
Although such dictionary-based CDA techniques have been shown to significantly
improve the mitigation of gender bias, in this paper, we highlight some
limitations of such dictionary-based counterfactual data augmentation
techniques, such as susceptibility to ungrammatical compositions, and lack of
generalization outside the set of predefined dictionary words. Model-based
solutions can alleviate these problems, yet the lack of qualitative parallel
training data hinders development in this direction. Therefore, we propose a
combination of data processing techniques and a bi-objective training regime to
develop a model-based solution for generating counterfactuals to mitigate
gender bias. We implemented our proposed solution and performed an empirical
evaluation which shows how our model alleviates the shortcomings of
dictionary-based solutions.
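As an illustration of the dictionary-based CDA that the paper critiques, here is a minimal sketch of word-pair substitution. The word pairs, the `dictionary_cda` helper, and the failure cases are our own illustrative assumptions, not code or examples from the paper:

```python
# Minimal sketch of dictionary-based CDA (word-pair substitution).
# The pairs below and the helper are illustrative, not the paper's code.
GENDER_PAIRS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",  # "her" is ambiguous: him/his
    "actor": "actress", "actress": "actor",
    "king": "queen", "queen": "king",
}

def dictionary_cda(sentence: str) -> str:
    """Flip gendered words found in the dictionary; leave the rest untouched."""
    out = []
    for tok in sentence.lower().split():
        core = tok.strip(".,!?")
        out.append(tok.replace(core, GENDER_PAIRS[core]) if core in GENDER_PAIRS else tok)
    return " ".join(out)

print(dictionary_cda("He gave his speech."))   # -> "she gave her speech."  (fine)
print(dictionary_cda("I saw her yesterday."))  # -> "i saw his yesterday."  (ungrammatical)
print(dictionary_cda("The surgeon nodded."))   # unchanged: outside the dictionary
```

The second example shows the kind of ungrammatical composition (object pronoun "her" flipped to the possessive "his") and the third the lack of generalization beyond predefined dictionary words that motivate a model-based generator.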
Related papers
- Gender Bias Mitigation for Bangla Classification Tasks [2.6285986998314783]
We investigate gender bias in Bangla pretrained language models.
By altering names and gender-specific terms in existing datasets, we ensured they were suitable for detecting and mitigating gender bias.
arXiv Detail & Related papers (2024-11-16T00:04:45Z)
- FairFlow: An Automated Approach to Model-based Counterfactual Data Augmentation For NLP [7.41244589428771]
This paper proposes FairFlow, an automated approach to generating parallel data for training counterfactual text generator models.
We show that FairFlow significantly overcomes the limitations of dictionary-based word-substitution approaches whilst maintaining good performance.
arXiv Detail & Related papers (2024-07-23T12:29:37Z)
- GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations [1.0000511213628438]
We create a gender-controlled text dataset, GECO, in which otherwise identical sentences appear in male and female forms.
This gives rise to ground-truth 'world explanations' for gender classification tasks.
We also provide GECOBench, a rigorous quantitative evaluation framework benchmarking popular XAI methods.
arXiv Detail & Related papers (2024-06-17T13:44:37Z)
- Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions [50.67412723291881]
Societal biases present in pre-trained large language models are a critical issue.
We propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models.
arXiv Detail & Related papers (2023-06-07T16:50:03Z)
- Enhancing Text Generation with Cooperative Training [23.971227375706327]
Most prevailing methods trained generative and discriminative models in isolation, which left them unable to adapt to changes in each other.
We introduce a self-consistent learning framework in the text field that involves training a discriminator and generator cooperatively in a closed-loop manner.
Our framework is able to mitigate training instabilities such as mode collapse and non-convergence.
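To make the closed loop concrete, here is a deliberately toy sketch of one possible reading of self-consistent generator/discriminator training; the unigram generator, token-ratio discriminator, and acceptance threshold are our stand-ins, not the paper's algorithm:

```python
import random
from collections import Counter

class UnigramGenerator:
    """Toy generator: samples tokens from a fitted unigram distribution."""
    def fit(self, texts):
        self.counts = Counter(tok for t in texts for tok in t)
        self.vocab = list(self.counts)
        self.weights = [self.counts[w] for w in self.vocab]

    def sample(self, n, length=4):
        return [random.choices(self.vocab, self.weights, k=length) for _ in range(n)]

class TokenRatioDiscriminator:
    """Toy discriminator: scores texts by how 'real-leaning' their tokens are."""
    def fit(self, texts, labels):
        self.real = Counter(tok for t, y in zip(texts, labels) if y for tok in t)
        self.fake = Counter(tok for t, y in zip(texts, labels) if not y for tok in t)

    def score(self, text):
        return sum(self.real[tok] >= self.fake[tok] for tok in text) / len(text)

def cooperative_loop(real_texts, rounds=5):
    gen, disc = UnigramGenerator(), TokenRatioDiscriminator()
    gen.fit(real_texts)
    for _ in range(rounds):
        fakes = gen.sample(len(real_texts))
        # Discriminator is refreshed on real vs. generated samples ...
        disc.fit(real_texts + fakes, [1] * len(real_texts) + [0] * len(fakes))
        # ... and the generator re-trains on real data plus the fakes the
        # discriminator accepts, closing the loop instead of training in isolation.
        accepted = [t for t in fakes if disc.score(t) > 0.5]
        gen.fit(real_texts + accepted)
    return gen, disc

real = [["the", "cat", "sat", "down"], ["a", "dog", "ran", "home"]]
gen, disc = cooperative_loop(real)
print(gen.sample(2))
```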
arXiv Detail & Related papers (2023-03-16T04:21:19Z)
- Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
arXiv Detail & Related papers (2023-02-11T12:11:03Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
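For this entry, here is a minimal PyTorch sketch of the stated idea as we read it from the summary: perturb the input with standard Gaussian noise and penalise the shift it causes in a hidden representation. The tiny MLP and the `sigma` and `lam` values are placeholder assumptions, not the paper's architecture or hyperparameters:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 2))
hidden = model[:2]   # representation after the first layer + activation
sigma, lam = 0.1, 1.0  # noise scale and regularizer weight (assumed)

def lnsr_loss(x, y, criterion=nn.CrossEntropyLoss()):
    task = criterion(model(x), y)
    noisy = x + sigma * torch.randn_like(x)                # inject Gaussian noise
    stability = (hidden(x) - hidden(noisy)).pow(2).mean()  # regularize hidden reps
    return task + lam * stability

x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))
lnsr_loss(x, y).backward()
```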
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving [3.114945725130788]
We propose a novel methodology that leverages a causal inference framework to effectively remove gender bias.
Our comprehensive experiments show that the proposed method achieves state-of-the-art results in gender-debiasing tasks.
arXiv Detail & Related papers (2021-12-09T19:57:22Z)
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
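A hedged sketch of how noise entropy regularisation could look, based only on the summary above: fit the task on real data while pushing predictions on pure-noise inputs towards the uniform (maximum-entropy) distribution, so out-of-distribution inputs receive low-confidence outputs. The model and the Gaussian noise scheme are our assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

def noier_loss(x, y, lam=1.0):
    task = F.cross_entropy(model(x), y)
    noise = torch.randn_like(x)                  # synthetic OOD inputs (assumed)
    log_p = F.log_softmax(model(noise), dim=-1)
    entropy = -(log_p.exp() * log_p).sum(-1).mean()
    return task - lam * entropy                  # maximise entropy on noise

x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
noier_loss(x, y).backward()
```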
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
- Learning to Perturb Word Embeddings for Out-of-distribution QA [55.103586220757464]
We propose a simple yet effective data augmentation (DA) method based on a noise generator, which learns to perturb the word embeddings of the input questions and context without changing their semantics.
We validate the performance of QA models trained with our word embedding perturbation on a single source dataset, evaluating them on five different target domains.
Notably, the model trained with ours outperforms the model trained with more than 240K artificially generated QA pairs.
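Based only on the summary, one plausible shape for such a noise generator is sketched below: a small network proposes a bounded perturbation of the token embeddings, and keeping the task loss on the perturbed view encourages the noise to stay label-preserving. All sizes and modules here are invented for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Embedding(1000, 32)
noise_gen = nn.Linear(32, 32)       # predicts a perturbation per token embedding
classifier = nn.Sequential(nn.Flatten(), nn.Linear(10 * 32, 2))

def perturbed_loss(token_ids, y):
    e = embed(token_ids)                       # (batch, seq, dim)
    delta = 0.1 * torch.tanh(noise_gen(e))     # bounded learned perturbation
    clean = F.cross_entropy(classifier(e), y)
    noisy = F.cross_entropy(classifier(e + delta), y)  # keeps noise label-preserving
    return clean + noisy

ids = torch.randint(0, 1000, (4, 10))
y = torch.randint(0, 2, (4,))
perturbed_loss(ids, y).backward()
```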
arXiv Detail & Related papers (2021-05-06T14:12:26Z)
- The Gap on GAP: Tackling the Problem of Differing Data Distributions in Bias-Measuring Datasets [58.53269361115974]
Diagnostic datasets that can detect biased models are an important prerequisite for bias reduction within natural language processing.
However, undesired patterns in the collected data can make such tests incorrect.
We introduce a theoretically grounded method for weighting test samples to cope with such patterns in the test data.
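A toy sketch of the sample-weighting idea, with an invented confound: weighting each test sample inversely to the frequency of its (group, confound) cell makes every cell contribute equally, so the confound no longer drives the measured gap. The paper's actual weighting scheme is theoretically derived and differs in detail:

```python
from collections import Counter

samples = [  # invented bias-test data with a "confound" attribute
    {"group": "female", "confound": "news",  "correct": True},
    {"group": "female", "confound": "news",  "correct": False},
    {"group": "male",   "confound": "news",  "correct": True},
    {"group": "male",   "confound": "sport", "correct": True},
]

# Weight each sample inversely to the size of its (group, confound) cell.
cells = Counter((s["group"], s["confound"]) for s in samples)
for s in samples:
    s["weight"] = 1.0 / cells[(s["group"], s["confound"])]

def weighted_accuracy(group):
    sub = [s for s in samples if s["group"] == group]
    return sum(s["weight"] * s["correct"] for s in sub) / sum(s["weight"] for s in sub)

print(weighted_accuracy("female"), weighted_accuracy("male"))
```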
arXiv Detail & Related papers (2020-11-03T16:50:13Z)