Related papers: Overwriting Pretrained Bias with Finetuning Data

Overwriting Pretrained Bias with Finetuning Data

URL: http://arxiv.org/abs/2303.06167v2
Date: Thu, 17 Aug 2023 02:01:07 GMT
Title: Overwriting Pretrained Bias with Finetuning Data
Authors: Angelina Wang and Olga Russakovsky
Abstract summary: We investigate bias when conceptualized as both spurious correlations between the target task and a sensitive attribute as well as underrepresentation of a particular group in the dataset. We find that models finetuned on top of pretrained models can indeed inherit their biases, but (2) this bias can be corrected for through relatively minor interventions to the finetuning dataset. Our findings imply that careful curation of the finetuning dataset is important for reducing biases on a downstream task, and doing so can even compensate for bias in the pretrained model.
Score: 36.050345384273655
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transfer learning is beneficial by allowing the expressive features of models pretrained on large-scale datasets to be finetuned for the target task of smaller, more domain-specific datasets. However, there is a concern that these pretrained models may come with their own biases which would propagate into the finetuned model. In this work, we investigate bias when conceptualized as both spurious correlations between the target task and a sensitive attribute as well as underrepresentation of a particular group in the dataset. Under both notions of bias, we find that (1) models finetuned on top of pretrained models can indeed inherit their biases, but (2) this bias can be corrected for through relatively minor interventions to the finetuning dataset, and often with a negligible impact to performance. Our findings imply that careful curation of the finetuning dataset is important for reducing biases on a downstream task, and doing so can even compensate for bias in the pretrained model.

Related papers

Model Debiasing by Learnable Data Augmentation [19.625915578646758]
This paper proposes a novel 2-stage learning pipeline featuring a data augmentation strategy able to regularize the training. Experiments on synthetic and realistic biased datasets show state-of-the-art classification accuracy, outperforming competing methods.
arXiv Detail & Related papers (2024-08-09T09:19:59Z)
Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models. This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution. We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
Ask Your Distribution Shift if Pre-Training is Right for You [74.18516460467019]
In practice, fine-tuning a pre-trained model improves robustness significantly in some cases but not at all in others. We focus on two possible failure modes of models under distribution shift: poor extrapolation and biases in the training data. Our study suggests that, as a rule of thumb, pre-training can help mitigate poor extrapolation but not dataset biases.
arXiv Detail & Related papers (2024-02-29T23:46:28Z)
Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model. Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data. We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations. Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
Fighting Bias with Bias: Promoting Model Robustness by Amplifying Dataset Biases [5.997909991352044]
Recent work sought to develop robust, unbiased models by filtering biased examples from training sets. We argue that such filtering can obscure the true capabilities of models to overcome biases. We introduce an evaluation framework defined by a bias-amplified training set and an anti-biased test set.
arXiv Detail & Related papers (2023-05-30T10:10:42Z)
Variation of Gender Biases in Visual Recognition Models Before and After Finetuning [29.55318393877906]
We introduce a framework to measure how biases change before and after fine-tuning a large scale visual recognition model for a downstream task. We find that supervised models trained on datasets such as ImageNet-21k are more likely to retain their pretraining biases. We also find that models finetuned on larger scale datasets are more likely to introduce new biased associations.
arXiv Detail & Related papers (2023-03-14T03:42:47Z)
General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space. GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task. Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available. We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective. We propose to augment the input sentences in the training data with their corresponding predicate-argument structures. We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
Elastic weight consolidation for better bias inoculation [24.12790037712358]
Elastic weight consolidation (EWC) allows fine-tuning of models to mitigate biases. EWC dominates standard fine-tuning, yielding models with lower levels of forgetting on the original (biased) dataset.
arXiv Detail & Related papers (2020-04-29T17:45:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.