An Empirical Analysis of Parameter-Efficient Methods for Debiasing
Pre-Trained Language Models
- URL: http://arxiv.org/abs/2306.04067v1
- Date: Tue, 6 Jun 2023 23:56:18 GMT
- Authors: Zhongbin Xie, Thomas Lukasiewicz
- Abstract summary: We conduct experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance.
We find that the parameter-efficient methods are effective in mitigating gender bias, where adapter tuning is consistently the most effective.
We also find that prompt tuning is more suitable for GPT-2 than BERT, and that the parameter-efficient methods are less effective when it comes to racial and religious bias.
- Score: 55.14405248920852
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasingly large size of modern pretrained language models not only
makes them inherit more human-like biases from the training corpora, but also
makes it computationally expensive to mitigate such biases. In this paper, we
investigate recent parameter-efficient methods in combination with
counterfactual data augmentation (CDA) for bias mitigation. We conduct
extensive experiments with prefix tuning, prompt tuning, and adapter tuning on
different language models and bias types to evaluate their debiasing
performance and abilities to preserve the internal knowledge of a pre-trained
model. We find that the parameter-efficient methods (i) are effective in
mitigating gender bias, where adapter tuning is consistently the most effective
one and prompt tuning is more suitable for GPT-2 than BERT, (ii) are less
effective when it comes to racial and religious bias, which may be attributed
to the limitations of CDA, and (iii) can perform similarly to or sometimes
better than full fine-tuning with improved time and memory efficiency, as well
as maintain the internal knowledge in BERT and GPT-2, evaluated via fact
retrieval and downstream fine-tuning.
Related papers
- Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models [68.23649978697027]
Forecast-PEFT is a fine-tuning strategy that freezes the majority of the model's parameters, focusing adjustments on newly introduced prompts and adapters.
Our experiments show that Forecast-PEFT outperforms traditional full fine-tuning methods in motion prediction tasks.
Forecast-FT further improves prediction performance, achieving up to a 9.6% improvement over conventional baseline methods.
arXiv Detail & Related papers (2024-07-28T19:18:59Z)
- Projective Methods for Mitigating Gender Bias in Pre-trained Language Models [10.418595661963062]
Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters.
We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated.
arXiv Detail & Related papers (2024-03-27T17:49:31Z)
- AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks [4.789838330230841]
We propose an efficient fine-tuning approach based on Adapter tuning, namely AAT.
Our method achieves performance comparable to or even superior to full fine-tuning while optimizing only 7.118% of the parameters.
arXiv Detail & Related papers (2024-01-19T08:07:59Z)
- An Emulator for Fine-Tuning Large Language Models using Small Language Models [91.02498576056057]
We introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates the result of pre-training and fine-tuning at different scales.
We show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training.
Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models.
arXiv Detail & Related papers (2023-10-19T17:57:16Z)
- Stabilizing Subject Transfer in EEG Classification with Divergence Estimation [17.924276728038304]
We propose several graphical models to describe an EEG classification task.
We identify statistical relationships that should hold true in an idealized training scenario.
We design regularization penalties to enforce these relationships in two stages.
arXiv Detail & Related papers (2023-10-12T23:06:52Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Effectiveness of Data Augmentation for Parameter Efficient Tuning with Limited Data [30.869230680173825]
We show that data augmentation can be used to boost the performance of P-tuning and LoRA models.
We also show that P-tuning has a more limited ability to separate the sentence embeddings of different classes of augmented data.
arXiv Detail & Related papers (2023-03-05T04:12:17Z)
- Parameter-efficient Modularised Bias Mitigation via AdapterFusion [22.424110883305243]
We propose a novel approach to develop stand-alone debiasing functionalities separate from the model.
We introduce DAM - a debiasing approach to first encapsulate arbitrary bias mitigation functionalities into separate adapters, and then add them to the model on-demand.
Our results show that DAM improves or maintains the effectiveness of bias mitigation, avoids forgetting in a multi-attribute scenario, and maintains on-par task performance.
arXiv Detail & Related papers (2023-02-13T12:39:45Z)
- DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models [152.29364079385635]
As pre-trained models grow bigger, the fine-tuning process can be time-consuming and computationally expensive.
We propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.
Our proposed framework, dubbed Dually Sparsity-Embedded Efficient Tuning (DSEE), aims to achieve two key objectives: (i) parameter efficient fine-tuning and (ii) resource-efficient inference.
arXiv Detail & Related papers (2021-10-30T03:29:47Z)
- BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models [51.53936551681613]
We show that fine-tuning only the bias terms (or a subset of the bias terms) of pre-trained BERT models is competitive with (and sometimes better than) fine-tuning the entire model.
These results support the hypothesis that fine-tuning is mainly about exposing knowledge induced by language-modeling pre-training, rather than learning new task-specific linguistic knowledge.
arXiv Detail & Related papers (2021-06-18T16:09:21Z)
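BitFit's recipe is concrete enough to sketch: freeze every weight and leave only the bias terms trainable. The snippet below is a framework-agnostic illustration (the `Param` stand-in and the convention that bias parameters are named with a `.bias` suffix mirror common PyTorch practice; this is not the authors' code):

```python
class Param:
    """Minimal stand-in for a framework tensor with a requires_grad flag."""
    def __init__(self, name: str):
        self.name = name
        self.requires_grad = True

def apply_bitfit(params: list[Param]) -> int:
    """Freeze every parameter except bias terms; return the trainable count."""
    trainable = 0
    for p in params:
        # BitFit: only names ending in ".bias" stay trainable.
        p.requires_grad = p.name.endswith(".bias")
        trainable += int(p.requires_grad)
    return trainable
```

On a real BERT checkpoint this leaves well under 1% of the parameters trainable, which is the efficiency the BitFit result trades on.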
This list is automatically generated from the titles and abstracts of the papers on this site.