Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks
- URL: http://arxiv.org/abs/2205.15171v5
- Date: Sun, 4 Jun 2023 14:40:45 GMT
- Title: Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks
- Authors: Lukas Hauzenberger, Shahed Masoudian, Deepak Kumar, Markus Schedl,
Navid Rekabsaz
- Abstract summary: We propose a novel modular bias mitigation approach, consisting of stand-alone highly sparse debiasing subnetworks.
We conduct experiments on three classification tasks with gender, race, and age as protected attributes.
- Score: 10.748627178113418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Societal biases are reflected in large pre-trained language models and their
fine-tuned versions on downstream tasks. Common in-processing bias mitigation
approaches, such as adversarial training and mutual information removal,
introduce additional optimization criteria, and update the model to reach a new
debiased state. However, in practice, end-users and practitioners might prefer
to switch back to the original model, or apply debiasing only on a specific
subset of protected attributes. To enable this, we propose a novel modular bias
mitigation approach, consisting of stand-alone highly sparse debiasing
subnetworks, where each debiasing module can be integrated into the core model
on-demand at inference time. Our approach draws from the concept of diff
pruning, and proposes a novel training regime adaptable to various
representation disentanglement optimizations. We conduct experiments on three
classification tasks with gender, race, and age as protected attributes. The
results show that our modular approach, while maintaining task performance,
improves (or at least remains on-par with) the effectiveness of bias mitigation
in comparison with baseline finetuning. Particularly on a two-attribute
dataset, our approach with separately learned debiasing subnetworks shows
effective utilization of either or both the subnetworks for selective bias
mitigation.
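The on-demand integration described in the abstract can be illustrated with a minimal sketch. It assumes the general diff-pruning idea that a module is a sparse set of additive changes to the core parameters, and it further assumes that several modules are combined by simply summing their changes; the abstract does not spell out either detail. All names (`apply_diffs`, `core`, `diffs`) and all values are hypothetical and are not taken from the authors' code.

```python
from typing import Dict, Iterable, List

# Hypothetical types for illustration only; not the authors' implementation.
Params = List[float]            # flattened core model parameters
SparseDiff = Dict[int, float]   # parameter index -> additive change


def apply_diffs(core: Params, diffs: Dict[str, SparseDiff],
                active: Iterable[str]) -> Params:
    """Return a copy of `core` with the selected sparse diffs added.

    Assumption: modules are combined by summing their changes.
    """
    merged = list(core)  # copy, so the core model itself stays untouched
    for name in active:
        for index, delta in diffs[name].items():
            merged[index] += delta
    return merged


# Toy values, made up for the example.
core = [0.5, -1.0, 2.0, 0.0]
diffs = {
    "gender": {1: 0.25},
    "age": {1: -0.5, 3: 1.0},
}

original = apply_diffs(core, diffs, [])                 # no debiasing
gender_only = apply_diffs(core, diffs, ["gender"])      # one attribute
both = apply_diffs(core, diffs, ["gender", "age"])      # both attributes
```

Because each call works on a copy, switching back to the original model amounts to passing an empty selection, which mirrors the "switch back" use case the abstract motivates.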
Related papers
- CosFairNet:A Parameter-Space based Approach for Bias Free Learning [1.9116784879310025]
Deep neural networks trained on biased data often inadvertently learn unintended inference rules.
We introduce a novel approach to address bias directly in the model's parameter space, preventing its propagation across layers.
We show enhanced classification accuracy and debiasing effectiveness across various synthetic and real-world datasets.
arXiv Detail & Related papers (2024-10-19T13:06:40Z)
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
- Parameter-efficient Modularised Bias Mitigation via AdapterFusion [22.424110883305243]
We propose a novel approach to develop stand-alone debiasing functionalities separate from the model.
We introduce DAM - a debiasing approach to first encapsulate arbitrary bias mitigation functionalities into separate adapters, and then add them to the model on-demand.
Our results show that DAM improves or maintains the effectiveness of bias mitigation, avoids forgetting in a multi-attribute scenario, and maintains on-par task performance.
arXiv Detail & Related papers (2023-02-13T12:39:45Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features and account for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning Debiased Models with Dynamic Gradient Alignment and Bias-conflicting Sample Mining [39.00256193731365]
Deep neural networks notoriously suffer from dataset biases which are detrimental to model robustness, generalization and fairness.
We propose a two-stage debiasing scheme to combat intractable unknown biases.
arXiv Detail & Related papers (2021-11-25T14:50:10Z)
- Model-agnostic bias mitigation methods with regressor distribution control for Wasserstein-based fairness metrics [0.6509758931804478]
We propose a bias mitigation methodology based upon the construction of post-processed models with fairer regressor distributions.
Our novel methodology performs optimization in low-dimensional spaces and avoids expensive model retraining.
arXiv Detail & Related papers (2021-11-19T17:31:22Z)
- Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning [94.35586521144117]
We investigate whether applying contrastive learning to fine-tuning would bring further benefits.
We propose Contrast-regularized tuning (Core-tuning), a novel approach for fine-tuning contrastive self-supervised visual models.
arXiv Detail & Related papers (2021-02-12T16:31:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.