Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks
- URL: http://arxiv.org/abs/2205.15171v5
- Date: Sun, 4 Jun 2023 14:40:45 GMT
- Title: Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks
- Authors: Lukas Hauzenberger, Shahed Masoudian, Deepak Kumar, Markus Schedl,
Navid Rekabsaz
- Abstract summary: We propose a novel modular bias mitigation approach, consisting of stand-alone, highly sparse debiasing subnetworks.
We conduct experiments on three classification tasks with gender, race, and age as protected attributes.
- Score: 10.748627178113418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Societal biases are reflected in large pre-trained language models and their
fine-tuned versions on downstream tasks. Common in-processing bias mitigation
approaches, such as adversarial training and mutual information removal,
introduce additional optimization criteria, and update the model to reach a new
debiased state. However, in practice, end-users and practitioners might prefer
to switch back to the original model, or apply debiasing only on a specific
subset of protected attributes. To enable this, we propose a novel modular bias
mitigation approach, consisting of stand-alone highly sparse debiasing
subnetworks, where each debiasing module can be integrated into the core model
on-demand at inference time. Our approach draws from the concept of diff
pruning and proposes a novel training regime adaptable to various
representation disentanglement optimizations. We conduct experiments on three
classification tasks with gender, race, and age as protected attributes. The
results show that our modular approach, while maintaining task performance,
improves (or at least remains on par with) the effectiveness of bias mitigation
compared with baseline fine-tuning. Particularly on a two-attribute
dataset, our approach with separately learned debiasing subnetworks shows that
either or both subnetworks can be applied effectively for selective bias
mitigation.
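To make the on-demand mechanism concrete, here is a minimal PyTorch sketch of applying and removing diff-pruning-style sparse debiasing subnetworks at inference time. The function names and the {attribute: {parameter_name: delta}} layout are illustrative assumptions, not the paper's actual interface.

```python
import torch

def apply_subnetworks(model, subnetworks, attributes):
    """Add the selected sparse debiasing deltas onto the core model's weights.

    `subnetworks` maps a protected attribute (e.g. "gender") to a dict of
    {parameter_name: sparse_delta}, as produced by diff-pruning-style
    training. Only the few nonzero entries of each delta change the model.
    """
    params = dict(model.named_parameters())
    for attr in attributes:
        for name, delta in subnetworks[attr].items():
            dense = delta.to_dense() if delta.is_sparse else delta
            params[name].data.add_(dense)
    return model

def remove_subnetworks(model, subnetworks, attributes):
    """Undo apply_subnetworks, restoring the original core model exactly."""
    params = dict(model.named_parameters())
    for attr in attributes:
        for name, delta in subnetworks[attr].items():
            dense = delta.to_dense() if delta.is_sparse else delta
            params[name].data.sub_(dense)
    return model

# Selective, on-demand debiasing for a single request:
# apply_subnetworks(model, subnetworks, ["gender"])   # or ["gender", "age"]
# predictions = model(inputs)
# remove_subnetworks(model, subnetworks, ["gender"])  # back to the original model
```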
Related papers
- Diffusing DeBias: a Recipe for Turning a Bug into a Feature [15.214861534330236]
This paper presents Diffusing DeBias (DDB), a novel approach acting as a plug-in for common methods in model debiasing.
Our approach leverages conditional diffusion models to generate synthetic bias-aligned images, used to train a bias amplifier model.
Our proposed method beats the current state of the art on multiple benchmark datasets by significant margins.
arXiv Detail & Related papers (2025-02-13T18:17:03Z)
- Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark [53.876493664396506]
Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions.
This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context.
We propose DREB, a debiased relation extraction benchmark that breaks the pseudo-correlation between entity mentions and relation types through entity replacement (illustrated below).
To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques.
arXiv Detail & Related papers (2025-01-02T17:01:06Z)
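The entity-replacement idea behind DREB can be shown in a few lines. The example schema and the entity_pool structure below are hypothetical; the benchmark's actual construction is more involved.

```python
import random

def replace_entities(example, entity_pool):
    """Swap both entity mentions for random entities of the same type, so a
    model can no longer predict the relation from the entity names alone
    and must rely on the sentence context.

    `example` is assumed to look like:
      {"text": "Steve Jobs founded Apple.", "head": "Steve Jobs",
       "head_type": "PERSON", "tail": "Apple", "tail_type": "ORG",
       "relation": "founder_of"}
    `entity_pool` maps an entity type to a list of surface forms.
    """
    new_head = random.choice(entity_pool[example["head_type"]])
    new_tail = random.choice(entity_pool[example["tail_type"]])
    text = example["text"].replace(example["head"], new_head)
    text = text.replace(example["tail"], new_tail)
    return {**example, "text": text, "head": new_head, "tail": new_tail}
```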
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
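The bias-experts entry above builds on a common auxiliary-model recipe whose core step can be sketched generically: downweight training examples that a bias-only auxiliary model already classifies confidently, so the main model must look past the shortcut. This is a sketch of that surrounding recipe, not the paper's exact method; all names are illustrative.

```python
import torch
import torch.nn.functional as F

def reweighted_task_loss(main_logits, aux_logits, labels):
    """Downweight training examples that a bias-only auxiliary model already
    gets right with high confidence, so the main model must rely on signal
    beyond the shortcut features the auxiliary model captures.
    """
    with torch.no_grad():
        aux_probs = F.softmax(aux_logits, dim=-1)
        # Probability the auxiliary (biased) model assigns to the gold label.
        p_bias = aux_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    per_example = F.cross_entropy(main_logits, labels, reduction="none")
    return ((1.0 - p_bias) * per_example).mean()
```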
- Parameter-efficient Modularised Bias Mitigation via AdapterFusion [22.424110883305243]
We propose a novel approach to develop stand-alone debiasing functionalities separate from the model.
We introduce DAM, a debiasing approach that first encapsulates arbitrary bias mitigation functionalities into separate adapters and then adds them to the model on demand (sketched after this entry).
Our results show that DAM improves or maintains the effectiveness of bias mitigation, avoids forgetting in a multi-attribute scenario, and maintains on-par task performance.
arXiv Detail & Related papers (2023-02-13T12:39:45Z)
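DAM's adapter architecture and fusion mechanism are not spelled out in the summary above; the following is a minimal bottleneck-adapter sketch of the on-demand idea, with hypothetical dimensions and names.

```python
import torch
import torch.nn as nn

class DebiasAdapter(nn.Module):
    """A small bottleneck adapter holding one attribute's debiasing function.

    The frozen core model stays untouched; at inference time any subset of
    adapters can be enabled or disabled per request.
    """
    def __init__(self, hidden_dim, bottleneck_dim=16):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        nn.init.zeros_(self.up.weight)  # start as an identity function
        nn.init.zeros_(self.up.bias)
        self.enabled = True

    def forward(self, hidden_states):
        if not self.enabled:
            return hidden_states        # pass-through: original model behavior
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

# adapters = {"gender": DebiasAdapter(768), "age": DebiasAdapter(768)}
# adapters["age"].enabled = False   # debias gender only for this request
```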
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
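The projection step is simple enough to sketch directly. The calibration of the projection matrix described in the paper is omitted here; this is just the standard orthogonal projection onto the complement of a set of estimated bias directions.

```python
import torch

def debias_projection(bias_directions):
    """Build P = I - B^T (B B^T)^{-1} B, which projects embeddings onto the
    orthogonal complement of the span of the bias directions (rows of B)."""
    B = torch.stack(bias_directions)              # (k, d)
    inv = torch.linalg.inv(B @ B.T)               # (k, k)
    return torch.eye(B.shape[1]) - B.T @ inv @ B  # (d, d)

# A bias direction can be estimated from paired prompts, e.g.:
# b = embed("a photo of a man") - embed("a photo of a woman")
# P = debias_projection([b])
# debiased = text_embedding @ P   # classifiers built on `debiased` ignore b
```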
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias, which prior methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD can learn a more robust base model in both settings: task-specific biased models with prior knowledge, and a self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Model-agnostic bias mitigation methods with regressor distribution control for Wasserstein-based fairness metrics [0.6509758931804478]
We propose a bias mitigation methodology based upon the construction of post-processed models with fairer regressor distributions.
Our novel methodology performs optimization in low-dimensional spaces and avoids expensive model retraining.
arXiv Detail & Related papers (2021-11-19T17:31:22Z)
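The optimization in the entry above is not reproduced here; the sketch below shows the basic quantile-matching repair that Wasserstein-based post-processing commonly builds on, shifting each group's score distribution toward the pooled distribution without retraining. The interface is an assumption.

```python
import numpy as np

def repair_scores(scores, groups, alpha=1.0):
    """Post-process regressor outputs so group-wise score distributions move
    toward a common target (the pooled quantile function), shrinking the
    Wasserstein distance between groups without retraining the model.

    alpha=1.0 enforces (near-)identical distributions; alpha<1.0 trades
    fairness against fidelity to the original scores.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    repaired = scores.copy()
    for g in np.unique(groups):
        mask = groups == g
        # Empirical quantile of each score within its own group, in (0, 1).
        ranks = scores[mask].argsort().argsort()
        q = (ranks + 0.5) / mask.sum()
        # The pooled scores at the same quantiles serve as the common target.
        target = np.quantile(scores, q)
        repaired[mask] = (1 - alpha) * scores[mask] + alpha * target
    return repaired
```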
- Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning [94.35586521144117]
We investigate whether applying contrastive learning to fine-tuning would bring further benefits.
We propose Contrast-regularized tuning (Core-tuning), a novel approach for fine-tuning contrastive self-supervised visual models.
arXiv Detail & Related papers (2021-02-12T16:31:24Z)
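Core-tuning's hard-pair mining and classifier smoothing are beyond a short sketch; the following only illustrates the underlying idea of adding a supervised contrastive regularizer to the standard fine-tuning objective. Hyperparameters and names are illustrative.

```python
import torch
import torch.nn.functional as F

def contrast_regularized_loss(features, logits, labels, tau=0.1, lam=0.1):
    """Cross-entropy plus a supervised contrastive term that pulls same-class
    features together in the fine-tuned representation space."""
    z = F.normalize(features, dim=-1)             # (n, d) unit vectors
    sim = z @ z.T / tau                           # (n, n) scaled similarities
    n = z.shape[0]
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    # Log-probability of each pair under a softmax over all other samples.
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
    denom = pos.sum(1).clamp(min=1)               # avoid div-by-zero anchors
    con_loss = -(log_prob * pos).sum(1) / denom
    return F.cross_entropy(logits, labels) + lam * con_loss.mean()
```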