Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models
- URL: http://arxiv.org/abs/2402.10353v2
- Date: Sun, 06 Oct 2024 16:27:01 GMT
- Title: Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models
- Authors: Kang He, Yinghan Long, Kaushik Roy,
- Abstract summary: We propose a null-input prompting method to calibrate intrinsic bias encoded in pre-trained language models.
Our method significantly improves zero/few-shot learning performance of LMs for both in-context learning and prompt-based fine-tuning.
- Score: 7.089534153472173
- License:
- Abstract: Prompt-based learning is susceptible to intrinsic bias present in pre-trained language models (LMs), leading to sub-optimal performance in prompt-based zero/few-shot settings. In this work, we propose a null-input prompting method to calibrate intrinsic bias encoded in pre-trained LMs. Different from prior efforts that address intrinsic bias primarily for social fairness and often involve excessive computational cost, our objective is to explore enhancing LMs' performance in downstream zero/few-shot learning while emphasizing the efficiency of intrinsic bias calibration. Specifically, we leverage a diverse set of auto-selected null-meaning inputs generated from GPT-4 to probe intrinsic bias of pre-trained LMs. Utilizing the bias-reflected probability distribution, we formulate a distribution disparity loss for bias calibration, where we exclusively update bias parameters ($0.1\%$ of total parameters) of LMs towards equal probability distribution. Experimental results show that the calibration promotes an equitable starting point for LMs while preserving language modeling abilities. Across a wide range of datasets, including sentiment analysis and topic classification, our method significantly improves zero/few-shot learning performance of LMs for both in-context learning and prompt-based fine-tuning (on average $9\%$ and $2\%$, respectively).
Related papers
- Post-hoc Reward Calibration: A Case Study on Length Bias [28.266675778940133]
Reward models (RMs) can develop biases by exploiting spurious correlations in their training data.
These biases can lead to incorrect output rankings, sub-optimal model evaluations, and the amplification of undesirable behaviours.
This paper addresses the challenge of correcting such biases without additional data and training.
arXiv Detail & Related papers (2024-09-25T22:30:42Z) - Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Projective Methods for Mitigating Gender Bias in Pre-trained Language Models [10.418595661963062]
Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters.
We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated.
arXiv Detail & Related papers (2024-03-27T17:49:31Z) - Self-Debiasing Large Language Models: Zero-Shot Recognition and
Reduction of Stereotypes [73.12947922129261]
We leverage the zero-shot capabilities of large language models to reduce stereotyping.
We show that self-debiasing can significantly reduce the degree of stereotyping across nine different social groups.
We hope this work opens inquiry into other zero-shot techniques for bias mitigation.
arXiv Detail & Related papers (2024-02-03T01:40:11Z) - Bias Mitigating Few-Shot Class-Incremental Learning [17.185744533050116]
Few-shot class-incremental learning aims at recognizing novel classes continually with limited novel class samples.
Recent methods somewhat alleviate the accuracy imbalance between base and incremental classes by fine-tuning the feature extractor in the incremental sessions.
We propose a novel method to mitigate model bias of the FSCIL problem during training and inference processes.
arXiv Detail & Related papers (2024-02-01T10:37:41Z) - The Gaps between Pre-train and Downstream Settings in Bias Evaluation
and Debiasing [74.7319697510621]
In-Context Learning (ICL) induces smaller changes to PLMs compared to FT-based debiasing methods.
ICL-based debiasing methods show a higher correlation between intrinsic and extrinsic bias scores compared to FT-based methods.
arXiv Detail & Related papers (2024-01-16T17:15:08Z) - Self-Supervised Position Debiasing for Large Language Models [39.261233221850155]
We propose a self-supervised position debiasing (SOD) framework to mitigate position bias for large language models (LLMs)
Experiments on eight datasets and five tasks show that SOD consistently outperforms existing methods in mitigating three types of position biases.
arXiv Detail & Related papers (2024-01-02T14:12:41Z) - Batch Calibration: Rethinking Calibration for In-Context Learning and
Prompt Engineering [12.967536233145614]
Batch (BC) is a simple yet intuitive method that controls the contextual bias from the batched input.
BC is zero-shot, inference-only, and incurs negligible additional costs.
We demonstrate state-of-the-art performance over previous calibration baselines across more than 10 natural language understanding and image classification tasks.
arXiv Detail & Related papers (2023-09-29T13:55:45Z) - Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features and neglect the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z) - Language Model Pre-training on True Negatives [109.73819321246062]
Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones.
Existing PLMs simply treat all corrupted texts as equal negative without any examination.
We design enhanced pre-training methods to counteract false negative predictions and encourage pre-training language models on true negatives.
arXiv Detail & Related papers (2022-12-01T12:24:19Z) - Learnable Distribution Calibration for Few-Shot Class-Incremental
Learning [122.2241120474278]
Few-shot class-incremental learning (FSCIL) faces challenges of memorizing old class distributions and estimating new class distributions given few training samples.
We propose a learnable distribution calibration (LDC) approach, with the aim to systematically solve these two challenges using a unified framework.
arXiv Detail & Related papers (2022-10-01T09:40:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.