Mitigating Selection Bias with Node Pruning and Auxiliary Options
- URL: http://arxiv.org/abs/2409.18857v1
- Date: Fri, 27 Sep 2024 15:53:54 GMT
- Title: Mitigating Selection Bias with Node Pruning and Auxiliary Options
- Authors: Hyeong Kyu Choi, Weijie Xu, Chi Xue, Stephanie Eckman, Chandan K. Reddy
- Abstract summary: Large language models (LLMs) often show unwarranted preference for certain choice options when responding to multiple-choice questions.
Previous solutions utilized debiasing methods to adjust the model's input and/or output.
Our work, in contrast, investigates the model's internal representation of the selection bias.
- Score: 11.835002896308545
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) often show unwarranted preference for certain choice options when responding to multiple-choice questions, posing significant reliability concerns in LLM-automated systems. To mitigate this selection bias problem, previous solutions utilized debiasing methods to adjust the model's input and/or output. Our work, in contrast, investigates the model's internal representation of the selection bias. Specifically, we introduce a novel debiasing approach, Bias Node Pruning (BNP), which eliminates the linear layer parameters that contribute to the bias. Furthermore, we present Auxiliary Option Injection (AOI), a simple yet effective input modification technique for debiasing, which is compatible even with black-box LLMs. To provide a more systematic evaluation of selection bias, we review existing metrics and introduce Choice Kullback-Leibler Divergence (CKLD), which addresses the insensitivity of the commonly used metrics to label imbalance. Experiments show that our methods are robust and adaptable across various datasets when applied to three LLMs.
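To make the two most concrete ideas in the abstract tangible, here is a minimal, hypothetical sketch: the first helper appends one auxiliary option to an MCQ prompt in the spirit of AOI (the option text "None of the above." is an assumption, not the paper's wording), and the second computes a CKLD-style score as a KL divergence between the model's empirical choice distribution and the dataset's label distribution, which is one natural reading of a metric that stays sensitive under label imbalance. The KL direction, smoothing constant, and function names are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def inject_auxiliary_option(question, options, aux_text="None of the above."):
    """AOI-style prompt builder (sketch): append one auxiliary option so that
    unwarranted probability mass has a neutral place to go. The auxiliary
    option text is an assumption, not the paper's exact choice."""
    all_options = list(options) + [aux_text]
    letters = [chr(ord("A") + i) for i in range(len(all_options))]
    lines = [question] + [f"{letter}. {opt}" for letter, opt in zip(letters, all_options)]
    return "\n".join(lines)

def choice_kld(pred_choices, true_choices, num_options, eps=1e-12):
    """CKLD-style metric (sketch): KL divergence between the empirical
    distribution of predicted option IDs and the ground-truth label
    distribution. KL direction and smoothing are assumptions."""
    pred = np.bincount(pred_choices, minlength=num_options) + eps
    true = np.bincount(true_choices, minlength=num_options) + eps
    pred, true = pred / pred.sum(), true / true.sum()
    return float(np.sum(true * np.log(true / pred)))

# Toy usage: a model that over-selects option A on an imbalanced label set.
print(inject_auxiliary_option("Which planet is largest?",
                              ["Mars", "Jupiter", "Venus", "Earth"]))
preds  = [0, 0, 0, 0, 1, 3]   # predicted option indices (A=0, B=1, ...)
labels = [1, 1, 3, 2, 0, 1]   # ground-truth option indices
print(choice_kld(preds, labels, num_options=5))  # 5 options after injection
```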
Related papers
- CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges [21.580762639442913]
We introduce CalibraEval, a novel label-free method for mitigating selection bias during inference.
CalibraEval reformulates debiasing as an optimization task aimed at adjusting observed prediction distributions to align with unbiased prediction distributions.
We show that CalibraEval effectively mitigates selection bias and improves performance compared to existing debiasing methods.
arXiv Detail & Related papers (2024-10-20T13:47:39Z)
- Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models [16.34646723046073]
Video Language Models (VLMs) are designed to answer complex video-focused questions.
Current benchmarks fail to capture the full reasoning capabilities of VLMs due to selection bias.
This study is the first focused investigation of selection bias in video-to-text LLM-powered models.
arXiv Detail & Related papers (2024-10-18T07:52:22Z)
- Evaluating Nuanced Bias in Large Language Model Free Response Answers [8.775925011558995]
We identify several kinds of nuanced bias in free text that cannot be identified by multiple choice tests.
We present a semi-automated pipeline for detecting these types of bias by first eliminating answers that can be automatically classified as unbiased.
arXiv Detail & Related papers (2024-07-11T19:58:13Z)
- Going Beyond Popularity and Positivity Bias: Correcting for Multifactorial Bias in Recommender Systems [74.47680026838128]
Two typical forms of bias in user interaction data with recommender systems (RSs) are popularity bias and positivity bias.
We consider multifactorial selection bias affected by both item and rating value factors.
We propose smoothing and alternating gradient descent techniques to reduce variance and improve the robustness of its optimization.
arXiv Detail & Related papers (2024-04-29T12:18:21Z)
- Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the prior of the underlying Large Language Model (LLM) rather than by the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
- The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing [74.7319697510621]
In-Context Learning (ICL) induces smaller changes to PLMs compared to FT-based debiasing methods.
ICL-based debiasing methods show a higher correlation between intrinsic and extrinsic bias scores compared to FT-based methods.
arXiv Detail & Related papers (2024-01-16T17:15:08Z)
- Self-Supervised Position Debiasing for Large Language Models [39.261233221850155]
We propose a self-supervised position debiasing (SOD) framework to mitigate position bias for large language models (LLMs).
Experiments on eight datasets and five tasks show that SOD consistently outperforms existing methods in mitigating three types of position biases.
arXiv Detail & Related papers (2024-01-02T14:12:41Z)
- Large Language Models Are Not Robust Multiple Choice Selectors [117.72712117510953]
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs).
This work shows that modern LLMs are vulnerable to option position changes due to their inherent "selection bias".
We propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution.
arXiv Detail & Related papers (2023-09-07T17:44:56Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- ADEPT: A DEbiasing PrompT Framework [49.582497203415855]
Finetuning is an applicable approach for debiasing contextualized word embeddings.
Discrete prompts with semantic meanings have been shown to be effective in debiasing tasks.
We propose ADEPT, a method to debias PLMs using prompt tuning while maintaining the delicate balance between removing biases and ensuring representation ability.
arXiv Detail & Related papers (2022-11-10T08:41:40Z)