PEFTDebias: Capturing debiasing information using PEFTs
- URL: http://arxiv.org/abs/2312.00434v1
- Date: Fri, 1 Dec 2023 09:06:06 GMT
- Title: PEFTDebias: Capturing debiasing information using PEFTs
- Authors: Sumit Agarwal, Aditya Srikanth Veerubhotla, Srijan Bansal
- Abstract summary: We introduce PEFTDebias, a novel approach that employs parameter-efficient fine-tuning (PEFT) to mitigate the biases within foundation models.
PEFTDebias consists of two main phases: an upstream phase for acquiring debiasing parameters along a specific bias axis, and a downstream phase where these parameters are incorporated into the model and frozen during the fine-tuning process.
- Score: 3.6985496077087743
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing use of foundation models highlights the urgent need to address
and eliminate implicit biases present in them that arise during pretraining. In
this paper, we introduce PEFTDebias, a novel approach that employs
parameter-efficient fine-tuning (PEFT) to mitigate the biases within foundation
models. PEFTDebias consists of two main phases: an upstream phase for acquiring
debiasing parameters along a specific bias axis, and a downstream phase where
these parameters are incorporated into the model and frozen during the
fine-tuning process. By evaluating on four datasets across two bias axes, namely
gender and race, we find that downstream biases can be effectively reduced with
PEFTs. In addition, we show that these parameters possess axis-specific
debiasing characteristics, enabling their effective transferability in
mitigating biases in various downstream tasks. To ensure reproducibility, we
release the code used for our experiments.
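As a rough illustration of the two-phase recipe the abstract describes, the sketch below uses the Hugging Face peft library with LoRA standing in for the PEFT; the model name, corpora, and training loops are placeholders rather than the authors' exact setup.

```python
# Minimal sketch of the two-phase PEFTDebias recipe, assuming LoRA as
# the PEFT and the Hugging Face peft library. Model name, corpora, and
# the debiasing objective are placeholders, not the authors' setup.
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, PeftModel, TaskType, get_peft_model

# --- Upstream phase: learn debiasing parameters along one bias axis ---
base = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
model = get_peft_model(base, LoraConfig(
    task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.1))
# train_upstream(model, gender_axis_corpus)   # hypothetical debiasing step
model.save_pretrained("debias-gender-lora")   # persists only the PEFT params

# --- Downstream phase: inject the debiasing params and freeze them ---
task_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
task_model = PeftModel.from_pretrained(task_model, "debias-gender-lora")
for name, p in task_model.named_parameters():
    # Debiasing (LoRA) parameters stay frozen; the backbone is tuned on
    # the downstream task as usual.
    p.requires_grad = "lora_" not in name
# train_downstream(task_model, task_dataset)  # hypothetical fine-tuning loop
```

The point the abstract emphasizes is that the upstream parameters are axis-specific, so the same saved adapter can be reused, frozen, across different downstream tasks along that axis.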
Related papers
- BEFT: Bias-Efficient Fine-Tuning of Language Models [13.498794394831604]
We propose an approach for selecting the bias term to be fine-tuned, forming the foundation of our bias-efficient fine-tuning (BEFT) method. Our results demonstrate the effectiveness and superiority of our bias-efficient approach on diverse downstream tasks.
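The criterion BEFT uses to pick which bias terms to tune is not spelled out in this summary; as a minimal sketch, the snippet below shows the BitFit-style baseline of freezing everything except bias vectors (the selection rule is an assumption).

```python
# BitFit-style sketch (an assumption): freeze all weights, train only
# bias terms. BEFT's actual criterion for selecting *which* bias terms
# to fine-tune is not reproduced here.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
for name, param in model.named_parameters():
    param.requires_grad = name.endswith(".bias")   # bias terms only

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable}/{total} ({100 * trainable / total:.3f}%)")
```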
arXiv Detail & Related papers (2025-09-19T13:35:07Z)
- How far can bias go? -- Tracing bias from pretraining data to alignment [54.51310112013655]
This study examines the correlation between gender-occupation bias in pre-training data and its manifestation in LLMs.
Our findings reveal that biases present in pre-training data are amplified in model outputs.
arXiv Detail & Related papers (2024-11-28T16:20:25Z)
- Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness [10.081447621656523]
The impact on language modeling ability can be alleviated given a high-quality and long-contextualized debiasing corpus.
The effectiveness of task-agnostic debiasing hinges on the quantitative bias level of both the task-specific data used for downstream applications and the debiased model.
We propose ProSocialTuning, a novel framework which can Propagate Socially-fair Debiasing to Downstream Fine-tuning.
arXiv Detail & Related papers (2024-06-06T15:11:11Z)
- Projective Methods for Mitigating Gender Bias in Pre-trained Language Models [10.418595661963062]
Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters.
We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated.
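A minimal sketch of one classic projective step is shown below: each embedding loses its component along an estimated bias direction. The difference-vector estimate of that direction is an assumption, not necessarily the paper's choice.

```python
# Minimal projective-debiasing sketch: remove the component of each
# embedding along an estimated bias direction. The direction here is a
# toy difference vector; the paper may estimate it differently.
import numpy as np

def project_out(embeddings: np.ndarray, bias_dir: np.ndarray) -> np.ndarray:
    b = bias_dir / np.linalg.norm(bias_dir)          # unit bias direction
    return embeddings - np.outer(embeddings @ b, b)  # e - (e . b) b

# Toy example: bias direction from two seed-word embeddings (placeholders).
rng = np.random.default_rng(0)
emb_he, emb_she = rng.normal(size=300), rng.normal(size=300)
bias_direction = emb_he - emb_she

X = rng.normal(size=(5, 300))                        # embeddings to debias
X_debiased = project_out(X, bias_direction)
b_unit = bias_direction / np.linalg.norm(bias_direction)
print(np.abs(X_debiased @ b_unit).max())             # ~0: component removed
```

Note that no model parameters change, which is why these methods are fast and cheap to store.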
arXiv Detail & Related papers (2024-03-27T17:49:31Z)
- Training Unbiased Diffusion Models From Biased Dataset [18.09610829650175]
This paper proposes time-dependent importance reweighting to mitigate the bias for diffusion models.
We demonstrate that the time-dependent density ratio becomes more precise than previous approaches.
While directly applying it to score-matching is intractable, we discover that using the time-dependent density ratio both for reweighting and score correction can lead to a tractable form of the objective function.
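A schematic sketch of such a time-dependent importance-reweighted objective follows, assuming the density-ratio estimator is given (estimating it is the paper's actual contribution).

```python
# Schematic importance-reweighted denoising loss. The time-dependent
# density ratio `density_ratio(x_t, t)` (unbiased vs. biased data) is
# assumed given; estimating it is the paper's contribution.
import torch

def add_noise(x0, t, noise, alphas_cumprod):
    # Standard DDPM forward process: x_t = sqrt(a_t)*x0 + sqrt(1-a_t)*eps
    a = alphas_cumprod[t].view(-1, *([1] * (x0.ndim - 1)))
    return a.sqrt() * x0 + (1 - a).sqrt() * noise

def reweighted_loss(model, x0, t, noise, alphas_cumprod, density_ratio):
    x_t = add_noise(x0, t, noise, alphas_cumprod)
    pred = model(x_t, t)                             # model predicts the noise
    per_sample = ((pred - noise) ** 2).flatten(1).mean(dim=1)
    w = density_ratio(x_t, t)                        # time-dependent weight
    return (w * per_sample).mean()                   # reweighted objective
```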
arXiv Detail & Related papers (2024-03-02T12:06:42Z)
- The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing [74.7319697510621]
In-Context Learning (ICL) induces smaller changes to PLMs compared to FT-based debiasing methods.
ICL-based debiasing methods show a higher correlation between intrinsic and extrinsic bias scores compared to FT-based methods.
arXiv Detail & Related papers (2024-01-16T17:15:08Z)
- IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models [52.03761198830643]
We propose IBADR, an Iterative Bias-Aware dataset Refinement framework.
We first train a shallow model to quantify the bias degree of samples in the pool.
Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.
In this way, the generator can effectively learn the correspondence between bias indicators and samples.
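A minimal sketch of the first step only is shown below, with a hypothesis-only bag-of-words classifier standing in for the shallow bias model (the generator training is omitted).

```python
# Sketch of IBADR's first step: a shallow model scores how "biased"
# (shortcut-predictable) each sample is. A bag-of-words logistic
# regression on the hypothesis alone stands in for the shallow model;
# data and labels are toy placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

hypotheses = ["A man is sleeping.", "Nobody is outside.", "A dog runs."]
labels = [1, 2, 0]   # toy NLI labels (placeholder)

vec = TfidfVectorizer()
X = vec.fit_transform(hypotheses)
shallow = LogisticRegression(max_iter=1000).fit(X, labels)

# Bias degree = shallow model's confidence in the gold label.
probs = shallow.predict_proba(X)
bias_scores = [probs[i, list(shallow.classes_).index(y)]
               for i, y in enumerate(labels)]
print(bias_scores)   # each sample paired with its bias indicator
```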
arXiv Detail & Related papers (2023-11-01T04:50:38Z)
- An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models [55.14405248920852]
We conduct experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance.
We find that the parameter-efficient methods are effective in mitigating gender bias, where adapter tuning is consistently the most effective.
We also find that prompt tuning is more suitable for GPT-2 than BERT, and that these methods are less effective when it comes to racial and religious bias.
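The three PEFT families compared can be instantiated roughly as follows with the Hugging Face peft library; hyperparameter values are illustrative, and LoRA stands in for classic bottleneck adapters, which are typically provided by a separate adapters library.

```python
# Illustrative configs for the PEFT families compared. Values are
# placeholders; LoRA stands in for classic bottleneck adapters.
from peft import (LoraConfig, PrefixTuningConfig, PromptTuningConfig,
                  TaskType, get_peft_model)
from transformers import AutoModelForSequenceClassification

def fresh_base():
    return AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

variants = {
    "prefix": PrefixTuningConfig(task_type=TaskType.SEQ_CLS,
                                 num_virtual_tokens=20),
    "prompt": PromptTuningConfig(task_type=TaskType.SEQ_CLS,
                                 num_virtual_tokens=20),
    "lora":   LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16),
}
for name, cfg in variants.items():
    peft_model = get_peft_model(fresh_base(), cfg)
    print(name, end=": ")
    peft_model.print_trainable_parameters()   # peft utility, % trainable
```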
arXiv Detail & Related papers (2023-06-06T23:56:18Z)
- Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
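A Monte Carlo sketch of an EPIG estimate from posterior predictive samples is shown below; array shapes and the target-input sampling are assumptions for illustration.

```python
# Monte Carlo sketch of EPIG: expected information gain between the
# prediction on a pool point x and predictions on sampled target inputs
# x*, estimated from K posterior (e.g. ensemble/MC-dropout) samples.
import numpy as np

def epig(probs_pool, probs_targ, eps=1e-12):
    """probs_pool: (K, C) predictive probs for the candidate x.
    probs_targ: (K, M, C) predictive probs for M target inputs x*."""
    K = probs_pool.shape[0]
    # Joint predictive p(y, y* | x, x*): average over posterior samples.
    joint = np.einsum("kc,kmd->mcd", probs_pool, probs_targ) / K
    # Product of marginals p(y | x) p(y* | x*).
    marg = np.einsum("c,md->mcd", probs_pool.mean(0), probs_targ.mean(0))
    # Mutual information per target input, averaged over targets.
    mi = (joint * (np.log(joint + eps) - np.log(marg + eps))).sum((1, 2))
    return mi.mean()

rng = np.random.default_rng(0)
pp = rng.dirichlet(np.ones(3), size=8)         # K=8 posterior samples
pt = rng.dirichlet(np.ones(3), size=(8, 5))    # M=5 target inputs
print(epig(pp, pt))
```

Unlike BALD, which scores information gain about the parameters, this quantity directly targets uncertainty about future predictions.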
arXiv Detail & Related papers (2023-04-17T10:59:57Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias, which prior methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
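DCT's exact sampling strategy is not given in this summary; the sketch below shows a generic InfoNCE-style loss where each anchor's positive is assumed to be a same-label, differently-biased counterpart.

```python
# Generic contrastive sketch in the spirit of DCT. The pairing of each
# anchor with a same-label, low-bias positive is an assumption; DCT's
# exact sampling and loss are not reproduced here.
import torch
import torch.nn.functional as F

def debias_contrastive_loss(z_anchor, z_pos, z_negs, tau=0.1):
    """z_anchor: (B, D); z_pos: (B, D) assumed positives; z_negs: (B, N, D)."""
    a = F.normalize(z_anchor, dim=-1)
    p = F.normalize(z_pos, dim=-1)
    n = F.normalize(z_negs, dim=-1)
    pos = (a * p).sum(-1, keepdim=True) / tau        # (B, 1) positive logits
    neg = torch.einsum("bd,bnd->bn", a, n) / tau     # (B, N) negative logits
    logits = torch.cat([pos, neg], dim=1)
    target = torch.zeros(len(a), dtype=torch.long)   # positive sits at idx 0
    return F.cross_entropy(logits, target)
```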
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Extracting or Guessing? Improving Faithfulness of Event Temporal Relation Extraction [87.04153383938969]
We improve the faithfulness of TempRel extraction models from two perspectives.
The first perspective is to extract relations genuinely based on the contextual description.
The second perspective is to provide proper uncertainty estimation.
arXiv Detail & Related papers (2022-10-10T19:53:13Z)
- Approximate Bayesian Computation for Physical Inverse Modeling [0.32771631221674324]
We propose a new method for automating the model parameter extraction process, resulting in an accurate model fit.
It is shown that the extracted parameters can be accurately predicted from the mobility curves using gradient boosted trees.
This work also provides a comparative analysis of the proposed framework with fine-tuned neural networks wherein the proposed framework is shown to perform better.
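A toy sketch of the regression step is shown below, with synthetic curves standing in for measured mobility data; the curve model and parameter ranges are placeholders.

```python
# Sketch of the parameter-extraction idea: regress model parameters
# directly from (here, synthetic) mobility curves with gradient-boosted
# trees. Data and feature choices are placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
params = rng.uniform(0.5, 2.0, size=(200, 2))        # hidden (a, b) per device
x = np.linspace(0, 1, 32)
curves = params[:, :1] * np.exp(-params[:, 1:] * x)  # toy "mobility curves"

model = MultiOutputRegressor(GradientBoostingRegressor())
model.fit(curves[:150], params[:150])
pred = model.predict(curves[150:])
print(np.abs(pred - params[150:]).mean(axis=0))      # per-parameter error
```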
arXiv Detail & Related papers (2021-11-26T02:23:05Z)
- On Transferability of Bias Mitigation Effects in Language Model Fine-Tuning [30.833538367971872]
Fine-tuned language models have been shown to exhibit biases against protected groups in a host of modeling tasks.
Previous works focus on detecting these biases, reducing bias in data representations, and using auxiliary training objectives to mitigate bias during fine-tuning.
We explore the feasibility and benefits of upstream bias mitigation (UBM) for reducing bias on downstream tasks.
arXiv Detail & Related papers (2020-10-24T10:36:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.