Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks
- URL: http://arxiv.org/abs/2205.15171v5
- Date: Sun, 4 Jun 2023 14:40:45 GMT
- Title: Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks
- Authors: Lukas Hauzenberger, Shahed Masoudian, Deepak Kumar, Markus Schedl,
Navid Rekabsaz
- Abstract summary: We propose a novel modular bias mitigation approach, consisting of stand-alone, highly sparse debiasing subnetworks.
We conduct experiments on three classification tasks with gender, race, and age as protected attributes.
- Score: 10.748627178113418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Societal biases are reflected in large pre-trained language models and their
fine-tuned versions on downstream tasks. Common in-processing bias mitigation
approaches, such as adversarial training and mutual information removal,
introduce additional optimization criteria, and update the model to reach a new
debiased state. However, in practice, end-users and practitioners might prefer
to switch back to the original model, or apply debiasing only on a specific
subset of protected attributes. To enable this, we propose a novel modular bias
mitigation approach, consisting of stand-alone highly sparse debiasing
subnetworks, where each debiasing module can be integrated into the core model
on-demand at inference time. Our approach draws from the concept of diff
pruning and proposes a novel training regime adaptable to various
representation disentanglement optimizations. We conduct experiments on three
classification tasks with gender, race, and age as protected attributes. The
results show that our modular approach, while maintaining task performance,
improves (or at least remains on par with) the effectiveness of bias mitigation
compared with baseline fine-tuning. Particularly on a two-attribute
dataset, our approach with separately learned debiasing subnetworks shows that
either or both subnetworks can be applied effectively for selective bias
mitigation.
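To make the on-demand mechanism concrete, here is a minimal PyTorch sketch of applying and removing diff-pruning-style sparse debiasing subnetworks at inference time. The function names and the {attribute: {parameter_name: delta}} layout are illustrative assumptions, not the paper's actual interface.

```python
import torch

def apply_subnetworks(model, subnetworks, attributes):
    """Add the selected sparse debiasing deltas onto the core model's weights.

    `subnetworks` maps a protected attribute (e.g. "gender") to a dict of
    {parameter_name: sparse_delta}, as produced by diff-pruning-style
    training. Only the few nonzero entries of each delta change the model.
    """
    params = dict(model.named_parameters())
    for attr in attributes:
        for name, delta in subnetworks[attr].items():
            dense = delta.to_dense() if delta.is_sparse else delta
            params[name].data.add_(dense)
    return model

def remove_subnetworks(model, subnetworks, attributes):
    """Undo apply_subnetworks, restoring the original core model exactly."""
    params = dict(model.named_parameters())
    for attr in attributes:
        for name, delta in subnetworks[attr].items():
            dense = delta.to_dense() if delta.is_sparse else delta
            params[name].data.sub_(dense)
    return model

# Selective, on-demand debiasing for a single request:
# apply_subnetworks(model, subnetworks, ["gender"])   # or ["gender", "age"]
# predictions = model(inputs)
# remove_subnetworks(model, subnetworks, ["gender"])  # back to the original model
```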
Related papers
- Diffusing DeBias: a Recipe for Turning a Bug into a Feature [15.214861534330236]
This paper presents Diffusing DeBias (DDB), a novel approach acting as a plug-in for common methods in model debiasing.
Our approach leverages conditional diffusion models to generate synthetic bias-aligned images, used to train a bias amplifier model.
Our proposed method beats the current state of the art on multiple benchmark datasets by significant margins.
arXiv Detail & Related papers (2025-02-13T18:17:03Z)
- Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark [53.876493664396506]
Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions.
This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context.
We propose DREB, a debiased relation extraction benchmark that breaks the pseudo-correlation between entity mentions and relation types through entity replacement (illustrated below).
To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques.
arXiv Detail & Related papers (2025-01-02T17:01:06Z)
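The entity-replacement idea behind DREB can be shown in a few lines. The example schema and the entity_pool structure below are hypothetical; the benchmark's actual construction is more involved.

```python
import random

def replace_entities(example, entity_pool):
    """Swap both entity mentions for random entities of the same type, so a
    model can no longer predict the relation from the entity names alone
    and must rely on the sentence context.

    `example` is assumed to look like:
      {"text": "Steve Jobs founded Apple.", "head": "Steve Jobs",
       "head_type": "PERSON", "tail": "Apple", "tail_type": "ORG",
       "relation": "founder_of"}
    `entity_pool` maps an entity type to a list of surface forms.
    """
    new_head = random.choice(entity_pool[example["head_type"]])
    new_tail = random.choice(entity_pool[example["tail_type"]])
    text = example["text"].replace(example["head"], new_head)
    text = text.replace(example["tail"], new_tail)
    return {**example, "text": text, "head": new_head, "tail": new_tail}
```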
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
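The bias-experts entry above builds on a common auxiliary-model recipe whose core step can be sketched generically: downweight training examples that a bias-only auxiliary model already classifies confidently, so the main model must look past the shortcut. This is a sketch of that surrounding recipe, not the paper's exact method; all names are illustrative.

```python
import torch
import torch.nn.functional as F

def reweighted_task_loss(main_logits, aux_logits, labels):
    """Downweight training examples that a bias-only auxiliary model already
    gets right with high confidence, so the main model must rely on signal
    beyond the shortcut features the auxiliary model captures.
    """
    with torch.no_grad():
        aux_probs = F.softmax(aux_logits, dim=-1)
        # Probability the auxiliary (biased) model assigns to the gold label.
        p_bias = aux_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    per_example = F.cross_entropy(main_logits, labels, reduction="none")
    return ((1.0 - p_bias) * per_example).mean()
```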
- Parameter-efficient Modularised Bias Mitigation via AdapterFusion [22.424110883305243]
We propose a novel approach to develop stand-alone debiasing functionalities separate from the model.
We introduce DAM, a debiasing approach that first encapsulates arbitrary bias mitigation functionalities into separate adapters and then adds them to the model on demand (sketched after this entry).
Our results show that DAM improves or maintains the effectiveness of bias mitigation, avoids forgetting in a multi-attribute scenario, and maintains on-par task performance.
arXiv Detail & Related papers (2023-02-13T12:39:45Z)
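DAM's adapter architecture and fusion mechanism are not spelled out in the summary above; the following is a minimal bottleneck-adapter sketch of the on-demand idea, with hypothetical dimensions and names.

```python
import torch
import torch.nn as nn

class DebiasAdapter(nn.Module):
    """A small bottleneck adapter holding one attribute's debiasing function.

    The frozen core model stays untouched; at inference time any subset of
    adapters can be enabled or disabled per request.
    """
    def __init__(self, hidden_dim, bottleneck_dim=16):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        nn.init.zeros_(self.up.weight)  # start as an identity function
        nn.init.zeros_(self.up.bias)
        self.enabled = True

    def forward(self, hidden_states):
        if not self.enabled:
            return hidden_states        # pass-through: original model behavior
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

# adapters = {"gender": DebiasAdapter(768), "age": DebiasAdapter(768)}
# adapters["age"].enabled = False   # debias gender only for this request
```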
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
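The projection step is simple enough to sketch directly. The calibration of the projection matrix described in the paper is omitted here; this is just the standard orthogonal projection onto the complement of a set of estimated bias directions.

```python
import torch

def debias_projection(bias_directions):
    """Build P = I - B^T (B B^T)^{-1} B, which projects embeddings onto the
    orthogonal complement of the span of the bias directions (rows of B)."""
    B = torch.stack(bias_directions)              # (k, d)
    inv = torch.linalg.inv(B @ B.T)               # (k, k)
    return torch.eye(B.shape[1]) - B.T @ inv @ B  # (d, d)

# A bias direction can be estimated from paired prompts, e.g.:
# b = embed("a photo of a man") - embed("a photo of a woman")
# P = debias_projection([b])
# debiased = text_embedding @ P   # classifiers built on `debiased` ignore b
```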
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias, which prior methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD can learn a more robust base model in both settings: task-specific biased models with prior knowledge, and a self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Model-agnostic bias mitigation methods with regressor distribution control for Wasserstein-based fairness metrics [0.6509758931804478]
We propose a bias mitigation methodology based upon the construction of post-processed models with fairer regressor distributions.
Our novel methodology performs optimization in low-dimensional spaces and avoids expensive model retraining.
arXiv Detail & Related papers (2021-11-19T17:31:22Z)
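The optimization in the entry above is not reproduced here; the sketch below shows the basic quantile-matching repair that Wasserstein-based post-processing commonly builds on, shifting each group's score distribution toward the pooled distribution without retraining. The interface is an assumption.

```python
import numpy as np

def repair_scores(scores, groups, alpha=1.0):
    """Post-process regressor outputs so group-wise score distributions move
    toward a common target (the pooled quantile function), shrinking the
    Wasserstein distance between groups without retraining the model.

    alpha=1.0 enforces (near-)identical distributions; alpha<1.0 trades
    fairness against fidelity to the original scores.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    repaired = scores.copy()
    for g in np.unique(groups):
        mask = groups == g
        # Empirical quantile of each score within its own group, in (0, 1).
        ranks = scores[mask].argsort().argsort()
        q = (ranks + 0.5) / mask.sum()
        # The pooled scores at the same quantiles serve as the common target.
        target = np.quantile(scores, q)
        repaired[mask] = (1 - alpha) * scores[mask] + alpha * target
    return repaired
```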
- Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning [94.35586521144117]
We investigate whether applying contrastive learning to fine-tuning would bring further benefits.
We propose Contrast-regularized tuning (Core-tuning), a novel approach for fine-tuning contrastive self-supervised visual models.
arXiv Detail & Related papers (2021-02-12T16:31:24Z)
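Core-tuning's hard-pair mining and classifier smoothing are beyond a short sketch; the following only illustrates the underlying idea of adding a supervised contrastive regularizer to the standard fine-tuning objective. Hyperparameters and names are illustrative.

```python
import torch
import torch.nn.functional as F

def contrast_regularized_loss(features, logits, labels, tau=0.1, lam=0.1):
    """Cross-entropy plus a supervised contrastive term that pulls same-class
    features together in the fine-tuned representation space."""
    z = F.normalize(features, dim=-1)             # (n, d) unit vectors
    sim = z @ z.T / tau                           # (n, n) scaled similarities
    n = z.shape[0]
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    # Log-probability of each pair under a softmax over all other samples.
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
    denom = pos.sum(1).clamp(min=1)               # avoid div-by-zero anchors
    con_loss = -(log_prob * pos).sum(1) / denom
    return F.cross_entropy(logits, labels) + lam * con_loss.mean()
```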