Irrelevant Alternatives Bias Large Language Model Hiring Decisions
- URL: http://arxiv.org/abs/2409.15299v1
- Date: Wed, 4 Sep 2024 10:37:36 GMT
- Title: Irrelevant Alternatives Bias Large Language Model Hiring Decisions
- Authors: Kremena Valkanova, Pencho Yordanov
- Abstract summary: The attraction effect occurs when the presence of an inferior candidate makes a superior candidate more appealing.
Our study finds consistent and significant evidence of the attraction effect in GPT-3.5 and GPT-4 when they assume the role of a recruiter.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate whether LLMs display a well-known human cognitive bias, the attraction effect, in hiring decisions. The attraction effect occurs when the presence of an inferior candidate makes a superior candidate more appealing, increasing the likelihood of the superior candidate being chosen over a non-dominated competitor. Our study finds consistent and significant evidence of the attraction effect in GPT-3.5 and GPT-4 when they assume the role of a recruiter. Irrelevant attributes of the decoy, such as its gender, further amplify the observed bias. GPT-4 exhibits greater bias variation than GPT-3.5. Our findings remain robust even when warnings against the decoy effect are included and the recruiter role definition is varied.
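As a concrete illustration of the kind of probe such a study involves, here is a minimal sketch of a control-versus-decoy hiring prompt, assuming the OpenAI Python client; the candidate profiles, prompt wording, recruiter role text, model name, and trial count are invented for illustration and are not the authors' actual materials.

```python
# Minimal sketch of a control-vs-decoy hiring probe (illustrative only).
# A and B are non-dominated trade-offs; the decoy C is dominated by A,
# so a shift toward A in the decoy condition indicates the attraction effect.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

BASE = [
    ("Candidate A", "9 years of experience, interview score 7/10"),
    ("Candidate B", "5 years of experience, interview score 9/10"),
]
DECOY = ("Candidate C", "8 years of experience, interview score 6/10")  # dominated by A

def recruiter_choice(candidates):
    """Ask the model, acting as a recruiter, to pick exactly one candidate."""
    listing = "\n".join(f"- {name}: {profile}" for name, profile in candidates)
    resp = client.chat.completions.create(
        model="gpt-4",
        temperature=1.0,
        messages=[
            {"role": "system",
             "content": "You are a recruiter. Choose exactly one candidate "
                        "to hire and reply with the candidate's name only."},
            {"role": "user", "content": f"Open position: data analyst.\n{listing}"},
        ],
    )
    return resp.choices[0].message.content.strip()

# Compare choice frequencies with and without the dominated decoy.
for condition, cands in [("control", BASE), ("decoy", BASE + [DECOY])]:
    picks = [recruiter_choice(cands) for _ in range(20)]
    print(condition, {name: picks.count(name) for name in set(picks)})
```

Repeating the comparison over many trials and testing whether "Candidate A" is chosen more often in the decoy condition is the basic shape of an attraction-effect measurement.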
Related papers
- Revealing Hidden Bias in AI: Lessons from Large Language Models [0.0]
This study examines biases in candidate interview reports generated by Claude 3.5 Sonnet, GPT-4o, Gemini 1.5, and Llama 3.1 405B.
We evaluate the effectiveness of LLM-based anonymization in reducing these biases.
arXiv Detail & Related papers (2024-10-22T11:58:54Z)
- Identifying the sources of ideological bias in GPT models through linguistic variation in output [0.0]
We use linguistic variation in countries with contrasting political attitudes to evaluate bias in GPT responses to sensitive political topics.
We find GPT output is more conservative in languages that map well onto conservative societies.
The differences across languages observed in GPT-3.5 persist in GPT-4, even though GPT-4 is significantly more liberal due to OpenAI's filtering policy.
arXiv Detail & Related papers (2024-09-09T20:11:08Z)
- Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games [56.70628673595041]
Large Language Models (LLMs) have been increasingly used in real-world settings, yet their strategic decision-making abilities remain largely unexplored.
This work investigates the performance and merits of LLMs in canonical game-theoretic two-player non-zero-sum games, Stag Hunt and Prisoner's Dilemma.
Our structured evaluation of GPT-3.5, GPT-4-Turbo, GPT-4o, and Llama-3-8B shows that these models, when making decisions in these games, are affected by at least one systematic bias.
arXiv Detail & Related papers (2024-07-05T12:30:02Z)
- An Empirical Analysis on Large Language Models in Debate Evaluation [10.677407097411768]
We investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation.
We uncover a consistent bias in both GPT-3.5 and GPT-4 towards the second candidate response presented.
We also uncover lexical biases in both GPT-3.5 and GPT-4, especially when label sets carry numerical or sequential connotations.
arXiv Detail & Related papers (2024-05-28T18:34:53Z)
- A First Look at Selection Bias in Preference Elicitation for Recommendation [64.44255178199846]
We study the effect of selection bias in preference elicitation on the resulting recommendations.
A big hurdle is the lack of any publicly available dataset that has preference elicitation interactions.
We propose a simulation of a topic-based preference elicitation process.
arXiv Detail & Related papers (2024-05-01T14:56:56Z)
- What's in a Name? Auditing Large Language Models for Race and Gender Bias [49.28899492966893]
We employ an audit design to investigate biases in state-of-the-art large language models, including GPT-4.
We find that the advice systematically disadvantages names that are commonly associated with racial minorities and women.
arXiv Detail & Related papers (2024-02-21T18:25:25Z)
- Behind the Screen: Investigating ChatGPT's Dark Personality Traits and Conspiracy Beliefs [0.0]
This paper analyzes the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4.
Dark personality traits and conspiracy beliefs were not particularly pronounced in either model.
arXiv Detail & Related papers (2024-02-06T16:03:57Z)
- Identifying and Improving Disability Bias in GPT-Based Resume Screening [9.881826151448198]
We ask ChatGPT to rank a resume against the same resume enhanced with an additional leadership award, scholarship, panel presentation, and membership, all of which are disability-related.
We find that GPT-4 exhibits prejudice against these enhanced CVs.
We show that this prejudice can be quantifiably reduced by training custom GPTs on principles of DEI and disability justice.
arXiv Detail & Related papers (2024-01-28T17:04:59Z)
- Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs [67.51906565969227]
We study the unintended side-effects of persona assignment on the ability of LLMs to perform basic reasoning tasks.
Our study covers 24 reasoning datasets, 4 LLMs, and 19 diverse personas (e.g. an Asian person) spanning 5 socio-demographic groups.
arXiv Detail & Related papers (2023-11-08T18:52:17Z)
- Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias [57.42417061979399]
Recent studies show that instruction tuning (IT) and reinforcement learning from human feedback (RLHF) improve the abilities of large language models (LMs) dramatically.
In this work, we investigate the effect of IT and RLHF on decision making and reasoning in LMs.
Our findings highlight the presence of such cognitive biases in various models from the GPT-3, Mistral, and T5 families.
arXiv Detail & Related papers (2023-08-01T01:39:25Z)
- Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove in theory that this construction offsets the influence of user/item propensity on learning; a toy sketch of the cross-pairwise idea follows this list.
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
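For intuition on the CPR entry above, here is a toy sketch of the cross-pairwise idea for pairs of observed interactions, assuming a dot-product scorer in PyTorch. The two-pair loss shown (prefer the matched score sum over the cross-swapped one) is a simplification of the paper's full k-wise formulation, and all embedding sizes and indices are invented.

```python
# Sketch of cross-pairwise ranking for two observed interactions
# (u1, i1) and (u2, i2): push the matched score sum above the
# cross-swapped sum, rather than comparing against random negatives.
import torch
import torch.nn.functional as F

def cpr_pair_loss(score_fn, u1, i1, u2, i2):
    """Two-pair cross-pairwise loss: -log sigmoid(matched - swapped)."""
    matched = score_fn(u1, i1) + score_fn(u2, i2)   # observed pairs
    swapped = score_fn(u1, i2) + score_fn(u2, i1)   # cross-swapped pairs
    return -F.logsigmoid(matched - swapped).mean()

# Example with a dot-product scorer over embedding tables (illustrative sizes).
n_users, n_items, dim = 1000, 5000, 32
user_emb = torch.nn.Embedding(n_users, dim)
item_emb = torch.nn.Embedding(n_items, dim)
score = lambda u, i: (user_emb(u) * item_emb(i)).sum(-1)

u1, i1 = torch.tensor([3, 7]), torch.tensor([42, 99])    # batch of observed pairs
u2, i2 = torch.tensor([11, 20]), torch.tensor([5, 123])  # cross partners
loss = cpr_pair_loss(score, u1, i1, u2, i2)
loss.backward()
```

Any additive user-activity or item-popularity term appears once on each side of the matched-minus-swapped difference and so cancels, which matches the entry's claim that the construction offsets user/item propensity effects.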