Parameter-efficient Modularised Bias Mitigation via AdapterFusion
- URL: http://arxiv.org/abs/2302.06321v2
- Date: Sun, 18 Jun 2023 20:14:15 GMT
- Title: Parameter-efficient Modularised Bias Mitigation via AdapterFusion
- Authors: Deepak Kumar, Oleg Lesota, George Zerveas, Daniel Cohen, Carsten
Eickhoff, Markus Schedl, Navid Rekabsaz
- Abstract summary: We propose a novel approach to develop stand-alone debiasing functionalities separate from the model.
We introduce DAM - a debiasing approach to first encapsulate arbitrary bias mitigation functionalities into separate adapters, and then add them to the model on-demand.
Our results show that DAM improves or maintains the effectiveness of bias mitigation, avoids forgetting in a multi-attribute scenario, and maintains on-par task performance.
- Score: 22.424110883305243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large pre-trained language models contain societal biases and carry along
these biases to downstream tasks. Current in-processing bias mitigation
approaches (like adversarial training) impose debiasing by updating a model's
parameters, effectively transferring the model to a new, irreversible debiased
state. In this work, we propose a novel approach to develop stand-alone
debiasing functionalities separate from the model, which can be integrated into
the model on-demand, while keeping the core model untouched. Drawing from the
concept of AdapterFusion in multi-task learning, we introduce DAM (Debiasing
with Adapter Modules) - a debiasing approach to first encapsulate arbitrary
bias mitigation functionalities into separate adapters, and then add them to
the model on-demand in order to deliver fairness qualities. We conduct a large
set of experiments on three classification tasks with gender, race, and age as
protected attributes. Our results show that DAM improves or maintains the
effectiveness of bias mitigation, avoids catastrophic forgetting in a
multi-attribute scenario, and maintains on-par task performance, while granting
parameter-efficiency and easy switching between the original and debiased
models.
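To make the mechanism concrete, here is a minimal PyTorch sketch (not the authors' released code) of the core idea: a frozen backbone layer, stand-alone bottleneck adapters, and an AdapterFusion-style attention that mixes the adapter outputs on demand. All class names, dimensions, and the single-layer setup are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Standard bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(self.act(self.down(h)))

class AdapterFusion(nn.Module):
    """AdapterFusion-style attention: the layer output queries the adapter
    outputs (keys/values) and returns their attention-weighted combination."""
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adapter_outs: list) -> torch.Tensor:
        stacked = torch.stack(adapter_outs, dim=1)        # (batch, n_adapters, dim)
        q = self.query(h).unsqueeze(1)                    # (batch, 1, dim)
        k, v = self.key(stacked), self.value(stacked)     # (batch, n_adapters, dim)
        attn = torch.softmax((q * k).sum(-1, keepdim=True) / h.size(-1) ** 0.5, dim=1)
        return (attn * v).sum(dim=1)                      # (batch, dim)

# Frozen backbone layer; only adapters and fusion carry trainable parameters.
dim = 768
backbone = nn.Linear(dim, dim)
for p in backbone.parameters():
    p.requires_grad = False

task_adapter = BottleneckAdapter(dim)    # trained on the downstream task
gender_debias = BottleneckAdapter(dim)   # trained with a debiasing objective
fusion = AdapterFusion(dim)

h = backbone(torch.randn(4, dim))
h = fusion(h, [task_adapter(h), gender_debias(h)])  # debiasing added on demand
```

Since only the adapters and the fusion layer hold trainable parameters, switching back to the original model amounts to bypassing them.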
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose MITA (Meet-In-The-Middle), which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
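As a rough illustration only: the sketch below extracts a rank-r "expert" from each fine-tuned model's weight delta via truncated SVD and routes inputs by the norm of their projection onto each expert's right-singular subspace. The routing rule and all names are simplifying assumptions, not the paper's exact construction.

```python
import torch

torch.manual_seed(0)
d = 256
w_base = torch.randn(d, d) / d ** 0.5  # shared pre-trained weight (stand-in)
# Stand-ins for weights of several fine-tuned source models:
finetuned = [w_base + 0.01 * torch.randn(d, d) for _ in range(3)]

# Extract a rank-r expert from each fine-tuning delta via truncated SVD -- no data needed.
r = 8
experts = []
for w in finetuned:
    U, S, V = torch.svd_lowrank(w - w_base, q=r)
    experts.append((U * S, V))             # delta ~= (U*S) @ V.T

def moe_forward(x: torch.Tensor, k: int = 1) -> torch.Tensor:
    """Route each input to the k experts whose subspace it projects onto most strongly."""
    y = x @ w_base.T
    scores = torch.stack([(x @ V).norm(dim=-1) for _, V in experts], dim=-1)
    topk = scores.topk(k, dim=-1).indices  # (batch, k)
    for i, (US, V) in enumerate(experts):
        chosen = (topk == i).any(dim=-1, keepdim=True).float()
        y = y + chosen * ((x @ V) @ US.T)  # add the selected low-rank expert's delta
    return y

out = moe_forward(torch.randn(4, d))
```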
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- CONTRAST: Continual Multi-source Adaptation to Dynamic Distributions [42.293444710522294]
Continual Multi-source Adaptation to Dynamic Distributions (CONTRAST) is a novel method that optimally combines multiple source models to adapt to the dynamic test data.
We show that the proposed method is able to optimally combine the source models and prioritize updates to the model least prone to forgetting.
arXiv Detail & Related papers (2024-01-04T22:23:56Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
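For intuition, a common way to use such a bias expert is to down-weight the main model's loss on examples the expert already classifies confidently (a product-of-experts-style reweighting). The sketch below shows that generic recipe, not the paper's exact binary-classifier formulation; all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def debiased_loss(main_logits: torch.Tensor,
                  bias_logits: torch.Tensor,
                  labels: torch.Tensor) -> torch.Tensor:
    """Down-weight examples the bias expert already gets right with high
    confidence, so the main model focuses on the harder, less biased ones."""
    with torch.no_grad():
        # Bias expert's probability for the gold label, per example.
        p_bias = F.softmax(bias_logits, dim=-1).gather(1, labels.unsqueeze(1)).squeeze(1)
    per_example = F.cross_entropy(main_logits, labels, reduction="none")
    return ((1.0 - p_bias) * per_example).mean()

main_logits = torch.randn(8, 3, requires_grad=True)
bias_logits = torch.randn(8, 3)  # from a (hypothetical) weak bias expert
labels = torch.randint(0, 3, (8,))
loss = debiased_loss(main_logits, bias_logits, labels)
loss.backward()
```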
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
- Debiasing Algorithm through Model Adaptation [5.482673673984126]
We perform causal analysis to identify problematic model components and discover that mid-upper feed-forward layers are most prone to convey bias.
Based on the analysis results, we intervene in the model by applying a linear projection to the weight matrices of these layers.
Our titular method, DAMA, significantly decreases bias as measured by diverse metrics while maintaining the model's performance on downstream tasks.
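The projection step itself is easy to illustrate. The sketch below removes one bias direction u from a feed-forward weight matrix by multiplying with I - uuᵀ, so the layer can no longer write output along u; here u is a random stand-in, whereas the paper estimates the problematic directions from its causal analysis.

```python
import torch

torch.manual_seed(0)
d_out, d_in = 512, 2048
W = torch.randn(d_out, d_in) / d_in ** 0.5  # stand-in feed-forward weight matrix

# Hypothetical bias direction in the layer's output space (unit vector).
u = torch.randn(d_out)
u = u / u.norm()

# Project the weights onto the orthogonal complement of u.
P = torch.eye(d_out) - torch.outer(u, u)
W_debiased = P @ W

x = torch.randn(4, d_in)
out = x @ W_debiased.T
print((out @ u).abs().max())  # ~0: the output has no component along u
```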
arXiv Detail & Related papers (2023-10-29T05:50:03Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models [55.14405248920852]
We conduct experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance.
We find that the parameter-efficient methods are effective in mitigating gender bias, where adapter tuning is consistently the most effective.
We also find that prompt tuning is more suitable for GPT-2 than BERT, and that the methods are less effective at mitigating racial and religious bias.
arXiv Detail & Related papers (2023-06-06T23:56:18Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
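The simplest instance of parameter-space merging is (weighted) averaging of state dicts, sketched below; the paper's actual method uses a more sophisticated closed-form, regression-based merge of linear layers, so treat this only as the baseline idea it builds on.

```python
import torch
import torch.nn as nn
from collections import OrderedDict

def merge_state_dicts(state_dicts, weights=None):
    """Merge same-architecture models in parameter space by weighted averaging."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    merged = OrderedDict()
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage with two hypothetical fine-tuned checkpoints of the same architecture:
m1, m2 = nn.Linear(4, 2), nn.Linear(4, 2)
fused = nn.Linear(4, 2)
fused.load_state_dict(merge_state_dicts([m1.state_dict(), m2.state_dict()]))
```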
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks [10.748627178113418]
We propose a novel modular bias mitigation approach, consisting of stand-alone highly sparse debiasing subnetworks.
We conduct experiments on three classification tasks with gender, race, and age as protected attributes.
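One way to realize such a stand-alone sparse subnetwork is a trainable 0/1 mask over frozen weights with a straight-through gradient estimator, as sketched below; this is an illustrative assumption about the mechanism, not the paper's exact method.

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Frozen base weights plus a trainable sparse 0/1 mask. The mask alone is
    the removable debiasing subnetwork: apply it to debias, bypass it to
    recover the original model unchanged."""
    def __init__(self, base: nn.Linear):
        super().__init__()
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.bias = nn.Parameter(base.bias.detach().clone(), requires_grad=False)
        # Mask logits start strongly positive, i.e. "keep every weight".
        self.mask_logits = nn.Parameter(torch.full_like(self.weight, 3.0))

    def forward(self, x: torch.Tensor, use_mask: bool = True) -> torch.Tensor:
        if not use_mask:
            return x @ self.weight.T + self.bias
        probs = torch.sigmoid(self.mask_logits)
        hard = (probs > 0.5).float()
        mask = hard + probs - probs.detach()  # forward: hard 0/1; backward: through probs
        # During training, add a sparsity penalty such as probs.mean() to the
        # loss so the surviving subnetwork stays highly sparse.
        return x @ (mask * self.weight).T + self.bias

layer = MaskedLinear(nn.Linear(16, 8))
debiased = layer(torch.randn(2, 16))                  # masked (debiased) path
original = layer(torch.randn(2, 16), use_mask=False)  # core model, untouched
```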
arXiv Detail & Related papers (2022-05-30T15:21:25Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.