Exploiting Primacy Effect To Improve Large Language Models
- URL: http://arxiv.org/abs/2507.13949v1
- Date: Fri, 18 Jul 2025 14:18:18 GMT
- Title: Exploiting Primacy Effect To Improve Large Language Models
- Authors: Bianca Raimondi, Maurizio Gabbrielli
- Abstract summary: This study focuses on primacy bias in fine-tuned Large Language Models (LLMs). We first show that fine-tuning amplifies this bias, probably due to exposure to human-like patterns. We then strategically leverage this effect by reordering response options based on semantic similarity to the query, without requiring knowledge of the correct answer.
- Score: 1.03590082373586
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models (LLMs) have become essential in many Natural Language Processing (NLP) tasks, leveraging extensive pre-training and fine-tuning to achieve high accuracy. However, like humans, LLMs exhibit biases, particularly positional biases such as primacy and recency effects, which can influence the accuracy of their answers. The primacy effect, where items presented first are more likely to be remembered or selected, plays a key role in Multiple Choice Question Answering (MCQA), where the order of answer options can affect prediction outcomes. This study focuses on primacy bias in fine-tuned LLMs: we first show that fine-tuning amplifies this bias, probably due to exposure to human-like patterns. Hence, we strategically leverage this effect by reordering response options based on semantic similarity to the query, without requiring knowledge of the correct answer. Our experimental results show that this approach significantly improves performance in MCQA. More generally, our findings underscore the dual nature of biases as both challenges and opportunities, offering insights for bias-aware model design and NLP applications.
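The reordering step lends itself to a short sketch. The hypothetical Python version below embeds the question and every option, then presents the options in descending order of cosine similarity to the question, so the semantically closest option lands in the first (primacy-favored) slot. The sentence-transformers encoder, the all-MiniLM-L6-v2 model, and the reorder_options helper are illustrative assumptions; the abstract does not name a specific embedding model.

```python
# Minimal sketch: reorder MCQA options by semantic similarity to the question,
# so the most similar option appears first and benefits from primacy bias.
# Assumes the sentence-transformers library; the model choice is illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def reorder_options(question: str, options: list[str]) -> list[str]:
    """Return options sorted by descending cosine similarity to the question."""
    embs = model.encode([question] + options)   # shape: (1 + n_options, dim)
    q, opts = embs[0], embs[1:]
    sims = opts @ q / (np.linalg.norm(opts, axis=1) * np.linalg.norm(q))
    order = np.argsort(-sims)                   # most similar option first
    return [options[i] for i in order]

# The reordered list would then be relabeled (A, B, C, ...) in the MCQA prompt.
print(reorder_options(
    "What is the capital of France?",
    ["Berlin", "Paris is the capital of France", "Madrid", "Rome"],
))
```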
Related papers
- Reward-Augmented Data Enhances Direct Preference Alignment of LLMs [63.32585910975191]
We introduce reward-conditioned Large Language Models (LLMs) that learn from the entire spectrum of response quality within the dataset.
We show that our approach consistently boosts DPO by a considerable margin.
Our method not only maximizes the utility of preference data but also mitigates the issue of unlearning, demonstrating its broad effectiveness beyond mere data expansion.
arXiv Detail & Related papers (2024-10-10T16:01:51Z)
- LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.
Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z)
- Serial Position Effects of Large Language Models [29.111115148808196]
Large Language Models (LLMs) have shown remarkable capabilities in zero-shot learning applications.
This represents a significant departure from traditional machine learning approaches.
Previous research has indicated that LLMs may exhibit serial position effects, such as primacy and recency biases.
arXiv Detail & Related papers (2024-06-23T02:02:52Z)
- A First Look at Selection Bias in Preference Elicitation for Recommendation [64.44255178199846]
We study the effect of selection bias in preference elicitation on the resulting recommendations.
A big hurdle is the lack of any publicly available dataset that has preference elicitation interactions.
We propose a simulation of a topic-based preference elicitation process.
arXiv Detail & Related papers (2024-05-01T14:56:56Z)
- Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is driven primarily by the prior of the underlying Large Language Model (LLM) rather than by the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
- Aligning Large Language Models by On-Policy Self-Judgment [49.31895979525054]
Existing approaches for aligning large language models with human preferences face a trade-off: on-policy learning typically requires a separate reward model (RM).
We present a novel alignment framework, SELF-JUDGE, that does on-policy learning and is parameter-efficient.
We show that rejection sampling by itself can further improve performance without an additional evaluator.
arXiv Detail & Related papers (2024-02-17T11:25:26Z)
- Reducing Selection Bias in Large Language Models [0.0]
Large Language Models (LLMs) are vital in interpreting and executing semantic tasks.
This research critically examines selection biases in LLMs and quantifies their effects on a representative list selection task.
arXiv Detail & Related papers (2024-01-29T15:43:23Z)
- HANS, are you clever? Clever Hans Effect Analysis of Neural Systems [1.6267479602370545]
Instruction-tuned Large Language Models (It-LLMs) have exhibited outstanding abilities to reason about the cognitive states, intentions, and reactions of all the people involved, letting humans guide and comprehend day-to-day social interactions effectively.
Several multiple-choice question (MCQ) benchmarks have been proposed to construct solid assessments of the models' abilities.
However, earlier works demonstrate an inherent "order bias" in It-LLMs that poses challenges to appropriate evaluation.
arXiv Detail & Related papers (2023-09-21T20:52:18Z)
- Large Language Models Are Not Robust Multiple Choice Selectors [117.72712117510953]
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs).
This work shows that modern LLMs are vulnerable to option position changes due to their inherent "selection bias".
We propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution (see the sketch after this entry).
arXiv Detail & Related papers (2023-09-07T17:44:56Z)
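The ID-prior separation can be sketched as follows. This is a simplified, per-question variant: it estimates the model's prior over option IDs by averaging its predictions across cyclic permutations of the option contents, then divides that prior out. The predict_probs helper is hypothetical, and PriDe itself amortizes the prior estimate over a small sample of questions rather than permuting every one.

```python
# Sketch of permutation-based prior debiasing in the spirit of PriDe.
# `predict_probs(question, options)` is an assumed helper returning the
# model's probability distribution over option IDs (A, B, C, ...).
import numpy as np

def debias(question, options, predict_probs):
    n = len(options)
    # Observed ID-probabilities under each cyclic permutation of the contents.
    obs = np.stack([
        predict_probs(question, options[k:] + options[:k]) for k in range(n)
    ])                          # shape: (n permutations, n option IDs)
    prior = obs.mean(axis=0)    # content-independent bias toward each ID slot
    # Map each permutation's slot probabilities back to the original options,
    # dividing out the slot prior, and average over permutations.
    scores = np.zeros(n)
    for k in range(n):
        for slot in range(n):   # slot `slot` holds original option (slot+k)%n
            scores[(slot + k) % n] += obs[k, slot] / prior[slot]
    return scores / n           # higher = more likely correct option

# Toy usage: a dummy model that always prefers the first ID regardless of
# content; after debiasing, its purely positional preference washes out.
dummy = lambda q, opts: np.array([0.4, 0.3, 0.2, 0.1])
print(debias("Capital of France?", ["Berlin", "Paris", "Madrid", "Rome"], dummy))
```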
- Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions [5.187383020960245]
Large Language Models (LLMs) have demonstrated remarkable capabilities in various NLP tasks.
Previous works have shown that these models are sensitive to prompt wording, to few-shot demonstrations, and to the order of those demonstrations.
This paper investigates the sensitivity of LLMs to the order of options in multiple-choice questions.
arXiv Detail & Related papers (2023-08-22T14:54:59Z)
- An Empirical Study on the Language Modal in Visual Question Answering [31.692905677913068]
Generalization beyond in-domain experience to out-of-distribution data is of paramount significance in the AI domain.
This paper attempts to provide new insights into the influence of language modality on VQA performance.
arXiv Detail & Related papers (2023-05-17T11:56:40Z)
- Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove in theory that this construction offsets the influence of user/item propensity on learning (a sketch follows this entry).
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
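The propensity-offsetting claim has a compact intuition: if predicted scores decompose additively into a relevance term plus user and item propensity terms, then contrasting two observed pairs with their "crossed" counterparts makes every propensity term appear once with each sign, so they cancel. The PyTorch sketch below illustrates that idea under this assumption; cpr_loss is an illustration, not necessarily the paper's exact objective.

```python
# Sketch of a cross pairwise ranking loss. For observed interactions
# (u1, i1) and (u2, i2), contrast the "straight" pairs with the "crossed"
# pairs (u1, i2) and (u2, i1): additive per-user/per-item propensity terms
# appear in both sums and cancel from the margin.
import torch
import torch.nn.functional as F

def cpr_loss(score, u1, i1, u2, i2):
    """score(u, i) -> predicted preference; index tensors as arguments."""
    margin = (score(u1, i1) + score(u2, i2)) - (score(u1, i2) + score(u2, i1))
    return -F.logsigmoid(margin).mean()

# Toy usage with a dot-product scorer over random embeddings:
U = torch.randn(100, 16)
I = torch.randn(200, 16)
score = lambda u, i: (U[u] * I[i]).sum(-1)
u1, i1 = torch.tensor([0, 1]), torch.tensor([3, 4])
u2, i2 = torch.tensor([2, 5]), torch.tensor([7, 9])
print(cpr_loss(score, u1, i1, u2, i2))
```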