Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction
- URL: http://arxiv.org/abs/2403.09963v2
- Date: Tue, 26 Mar 2024 04:08:47 GMT
- Title: Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction
- Authors: Ziyang Xu, Keqin Peng, Liang Ding, Dacheng Tao, Xiliang Lu,
- Abstract summary: Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction.
This paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias.
- Score: 56.17020601803071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction, i.e., prompts tend to introduce biases toward specific labels. Prompt bias presents a significant challenge in assessing the factual knowledge within PLMs. Therefore, this paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias. We show that: 1) all prompts in the experiments exhibit non-negligible bias, with gradient-based prompts like AutoPrompt and OptiPrompt displaying significantly higher levels of bias; 2) prompt bias can amplify benchmark accuracy unreasonably by overfitting the test datasets, especially on imbalanced datasets like LAMA. Based on these findings, we propose a representation-based approach to mitigate the prompt bias during inference time. Specifically, we first estimate the biased representation using prompt-only querying, and then remove it from the model's internal representations to generate the debiased representations, which are used to produce the final debiased outputs. Experiments across various prompts, PLMs, and benchmarks show that our approach can not only correct the overfitted performance caused by prompt bias, but also significantly improve the prompt retrieval capability (up to 10% absolute performance gain). These results indicate that our approach effectively alleviates prompt bias in knowledge evaluation, thereby enhancing the reliability of benchmark assessments. Hopefully, our plug-and-play approach can be a golden standard to strengthen PLMs toward reliable knowledge bases. Code and data are released in https://github.com/FelliYang/PromptBias.
Related papers
- Reasoning Beyond Bias: A Study on Counterfactual Prompting and Chain of Thought Reasoning [0.0]
We show that differences in learned regularities across answer options are predictive of model preferences and mirror human test-taking strategies.
We introduce two novel methods: Counterfactual Prompting with Chain of Thought (CoT) and Counterfactual Prompting with Agnostically Primed CoT (APriCoT)
Our results suggest that mitigating bias requires a "System-2" like process and that CoT reasoning is susceptible to confirmation bias under some prompting methodologies.
arXiv Detail & Related papers (2024-08-16T10:34:50Z) - Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z) - Language-guided Detection and Mitigation of Unknown Dataset Bias [23.299264313976213]
We propose a framework to identify potential biases as keywords without prior knowledge based on the partial occurrence in the captions.
Our framework not only outperforms existing methods without prior knowledge, but also is even comparable with a method that assumes prior knowledge.
arXiv Detail & Related papers (2024-06-05T03:11:33Z) - Beyond Performance: Quantifying and Mitigating Label Bias in LLMs [8.77694178599322]
We evaluate different approaches to quantifying label bias in a model's predictions.
Our investigation reveals substantial label bias in models both before and after debiasing attempts.
We propose a novel label bias calibration method tailored for few-shot prompting.
arXiv Detail & Related papers (2024-05-04T19:53:03Z) - GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language
Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
The existing evaluation methods have many constraints, and their results exhibit a limited degree of interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z) - Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or a given attributes.
We propose a novel search strategy based on the greedy search to identify the near-optimal prompt for improving the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z) - Unbiased Learning to Rank with Biased Continuous Feedback [5.561943356123711]
Unbiased learning-to-rank(LTR) algorithms are verified to model the relative relevance accurately based on noisy feedback.
To provide personalized high-quality recommendation results, recommender systems need model both categorical and continuous biased feedback.
We introduce the pairwise trust bias to separate the position bias, trust bias, and user relevance explicitly.
Experiment results on public benchmark datasets and internal live traffic of a large-scale recommender system at Tencent News show superior results for continuous labels.
arXiv Detail & Related papers (2023-03-08T02:14:08Z) - ADEPT: A DEbiasing PrompT Framework [49.582497203415855]
Finetuning is an applicable approach for debiasing contextualized word embeddings.
discrete prompts with semantic meanings have shown to be effective in debiasing tasks.
We propose ADEPT, a method to debias PLMs using prompt tuning while maintaining the delicate balance between removing biases and ensuring representation ability.
arXiv Detail & Related papers (2022-11-10T08:41:40Z) - Whole Page Unbiased Learning to Rank [59.52040055543542]
Unbiased Learning to Rank(ULTR) algorithms are proposed to learn an unbiased ranking model with biased click data.
We propose a Bias Agnostic whole-page unbiased Learning to rank algorithm, named BAL, to automatically find the user behavior model.
Experimental results on a real-world dataset verify the effectiveness of the BAL.
arXiv Detail & Related papers (2022-10-19T16:53:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.