Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning
- URL: http://arxiv.org/abs/2401.09783v1
- Date: Thu, 18 Jan 2024 08:05:45 GMT
- Title: Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning
- Authors: Yong Zhang, Hanzhang Li, Zhitao Li, Ning Cheng, Ming Li, Jing Xiao,
Jianzong Wang
- Abstract summary: This study introduces a novel methodology named "bias-kNN".
This approach capitalizes on the biased outputs, harnessing them as primary features for kNN and supplementing with gold labels.
Our comprehensive evaluations, spanning diverse domain text classification datasets and different GPT-2 model sizes, indicate the adaptability and efficacy of the "bias-kNN" method.
- Score: 36.739829839357995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have shown significant promise in various
applications, including zero-shot and few-shot learning. However, their
performance can be hampered by inherent biases. Rather than conventional
methods that seek to minimize or correct these biases, this study introduces a
novel methodology named "bias-kNN". This approach capitalizes on the biased
outputs, harnessing them as primary features for kNN and supplementing them with
gold labels. Our comprehensive evaluations, spanning diverse domain text
classification datasets and different GPT-2 model sizes, indicate the
adaptability and efficacy of the "bias-kNN" method. Remarkably, this approach
not only outperforms conventional in-context learning in few-shot scenarios but
also demonstrates robustness across a spectrum of samples, templates and
verbalizers. This study, therefore, presents a unique perspective on harnessing
biases, transforming them into assets for enhanced model performance.
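To make the described workflow concrete, below is a minimal sketch of a bias-kNN-style classifier. It assumes GPT-2 via Hugging Face transformers and scikit-learn, a binary sentiment task, and a hypothetical prompt template with " negative"/" positive" verbalizer tokens; the paper's actual templates, verbalizers, distance metric, and number of neighbors are not reproduced here, so treat every named constant as illustrative.

```python
# Minimal sketch of a bias-kNN-style workflow (illustrative assumptions, not the
# paper's exact templates, verbalizers, or hyperparameters).
import torch
from sklearn.neighbors import KNeighborsClassifier
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

# Hypothetical prompt template and verbalizer tokens for a binary sentiment task.
TEMPLATE = "Review: {text} Sentiment:"
VERBALIZERS = [" negative", " positive"]  # list index doubles as the class id
verbalizer_ids = [tokenizer.encode(v)[0] for v in VERBALIZERS]

def label_distribution(text):
    """Return the model's (biased) probability distribution over verbalizer tokens."""
    inputs = tokenizer(TEMPLATE.format(text=text), return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    # Renormalize over the verbalizer tokens only; this vector is the kNN feature.
    return torch.softmax(next_token_logits[verbalizer_ids], dim=-1).tolist()

# Few-shot support set with gold labels (0 = negative, 1 = positive).
support = [("A wonderful, heartfelt film.", 1), ("Dull and far too long.", 0)]
features = [label_distribution(text) for text, _ in support]  # biased outputs as features
gold = [label for _, label in support]                         # supplemented gold labels

knn = KNeighborsClassifier(n_neighbors=1).fit(features, gold)
print(knn.predict([label_distribution("I would happily watch it again.")]))
```

The key design point mirrored from the abstract is that the LLM's raw, bias-laden label distribution is not corrected but stored as the feature vector for each few-shot example, with gold labels supervising the kNN; test inputs are classified by nearest neighbors in that feature space.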
Related papers
- Scalable Influence and Fact Tracing for Large Language Model Pretraining [14.598556308631018]
Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples.
This paper refines existing gradient-based methods to work effectively at scale.
arXiv Detail & Related papers (2024-10-22T20:39:21Z) - GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes [2.2162879952427343]
This paper introduces GUS-Net, an innovative approach to bias detection.
GUS-Net focuses on three key types of biases: (G)eneralizations, (U)nfairness, and (S)tereotypes.
Our methodology enhances traditional bias detection methods by incorporating the contextual encodings of pre-trained models.
arXiv Detail & Related papers (2024-10-10T21:51:22Z) - REFINE-LM: Mitigating Language Model Stereotypes via Reinforcement Learning [18.064064773660174]
We introduce REFINE-LM, a debiasing method that uses reinforcement learning to handle different types of biases without any fine-tuning.
By training a simple model on top of the word probability distribution of a LM, our bias reinforcement learning method enables model debiasing without human annotations.
Experiments conducted on a wide range of models, including several LMs, show that our method significantly reduces stereotypical biases while preserving LM performance.
arXiv Detail & Related papers (2024-08-18T14:08:31Z) - Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z) - Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features, addressing the dynamic nature of bias that prior methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z) - Investigating Ensemble Methods for Model Robustness Improvement of Text
Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that no single model works best in all cases.
By choosing an appropriate bias model, we can obtain better robustness than baselines that use more sophisticated model designs.
arXiv Detail & Related papers (2022-10-28T17:52:10Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Learning from others' mistakes: Avoiding dataset biases without modeling
them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z) - An Empirical Study on Model-agnostic Debiasing Strategies for Robust
Natural Language Inference [37.420864237437804]
We focus on the model-agnostic debiasing strategies and explore how to make the NLI models robust to multiple adversarial attacks.
We first benchmark prevailing neural NLI models including pretrained ones on various adversarial datasets.
We then try to combat distinct known biases by modifying a mixture of experts (MoE) ensemble method.
arXiv Detail & Related papers (2020-10-08T05:40:45Z) - Towards Debiasing NLU Models from Unknown Biases [70.31427277842239]
NLU models often exploit biases to achieve high dataset-specific performance without properly learning the intended task.
We present a self-debiasing framework that prevents models from mainly utilizing biases without knowing them in advance.
arXiv Detail & Related papers (2020-09-25T15:49:39Z)