PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory
- URL: http://arxiv.org/abs/2506.21961v1
- Date: Fri, 27 Jun 2025 07:09:11 GMT
- Title: PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory
- Authors: Junho Myung, Yeon Su Park, Sunwoo Kim, Shin Yoo, Alice Oh
- Abstract summary: We introduce PapersPlease, a benchmark consisting of 3,700 moral dilemmas designed to investigate large language models' decision-making. In our setup, LLMs act as immigration inspectors deciding whether to approve or deny entry based on short narratives about each person. Our analysis of six LLMs reveals statistically significant patterns in decision-making, suggesting that LLMs encode implicit preferences.
- Score: 24.290880164707122
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Evaluating the performance and biases of large language models (LLMs) through role-playing scenarios is becoming increasingly common, as LLMs often exhibit biased behaviors in these contexts. Building on this line of research, we introduce PapersPlease, a benchmark consisting of 3,700 moral dilemmas designed to investigate LLMs' decision-making in prioritizing various levels of human needs. In our setup, LLMs act as immigration inspectors deciding whether to approve or deny entry based on short narratives about each person. These narratives are constructed using the Existence, Relatedness, and Growth (ERG) theory, which categorizes human needs into three hierarchical levels. Our analysis of six LLMs reveals statistically significant patterns in decision-making, suggesting that LLMs encode implicit preferences. Additionally, our evaluation of the impact of incorporating social identities into the narratives shows varying responsiveness based on both motivational needs and identity cues, with some models exhibiting higher denial rates for marginalized identities. All data is publicly available at https://github.com/yeonsuuuu28/papers-please.
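To make the setup concrete, here is a minimal sketch of the role-play protocol described above, assuming an OpenAI-style chat API. The prompt wording, model name, and example narratives are illustrative placeholders, not the benchmark's actual protocol; the 3,700 real dilemmas are in the linked repository.

```python
# Minimal sketch of the inspector role-play, assuming an OpenAI-style API.
# Prompts, model name, and narratives below are illustrative placeholders.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM = (
    "You are an immigration inspector. Read the traveler's narrative "
    "and reply with exactly one word: APPROVE or DENY."
)

# Hypothetical narratives, one per ERG need level (Existence, Relatedness, Growth).
narratives = [
    ("existence", "I am fleeing a famine and need food and shelter to survive."),
    ("relatedness", "My only family lives across this border; I want to reunite with them."),
    ("growth", "I was admitted to a university here and hope to develop my skills."),
]

denials, totals = Counter(), Counter()
for level, story in narratives:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the paper evaluates six LLMs
        messages=[{"role": "system", "content": SYSTEM},
                  {"role": "user", "content": story}],
        temperature=0,
    )
    decision = resp.choices[0].message.content.strip().upper()
    totals[level] += 1
    denials[level] += decision.startswith("DENY")

for level in totals:
    print(f"{level}: denial rate {denials[level] / totals[level]:.0%}")
```

Aggregating denial rates per ERG need level (and, in the paper's extension, per identity cue) is what surfaces the statistically significant preference patterns reported above.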
Related papers
- Alignment Revisited: Are Large Language Models Consistent in Stated and Revealed Preferences? [5.542420010310746]
A critical, yet understudied, issue is the potential divergence between an LLM's stated preferences and its revealed preferences. This work formally defines and proposes a method to measure this preference deviation. Our study will be crucial for integrating LLMs into services, especially those that interact directly with humans.
arXiv Detail & Related papers (2025-05-31T23:38:48Z)
- Evaluating how LLM annotations represent diverse views on contentious topics [3.405231040967506]
We show that generative large language models (LLMs) tend to be biased in the same directions on the same demographic categories within the same datasets. We conclude with a discussion of the implications for researchers and practitioners using LLMs for automated data annotation tasks.
arXiv Detail & Related papers (2025-03-29T22:53:15Z)
- No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models [0.9620910657090186]
Large Language Models (LLMs) have improved performance on a range of natural language understanding and generation tasks. We provide a unified evaluation of benchmarks using a set of representative small and medium-sized LLMs. We propose five prompting approaches to carry out the bias detection task across different aspects of bias. The results indicate that each of the selected LLMs suffers from one form of bias or another, with the Phi-3.5B model being the least biased.
arXiv Detail & Related papers (2025-03-15T03:58:14Z)
- Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision [50.45597801390757]
Instruct-LF is a goal-oriented latent factor discovery system. It integrates instruction-following ability with statistical models to handle noisy datasets.
arXiv Detail & Related papers (2025-02-21T02:03:08Z)
- Benchmarking Bias in Large Language Models during Role-Playing [21.28427555283642]
We introduce BiasLens, a fairness testing framework designed to expose biases in Large Language Models (LLMs) during role-playing.
Our approach uses LLMs to generate 550 social roles across a comprehensive set of 11 demographic attributes, producing 33,000 role-specific questions.
Using the generated questions as the benchmark, we conduct extensive evaluations of six advanced LLMs released by OpenAI, Mistral AI, Meta, Alibaba, and DeepSeek.
Our benchmark reveals 72,716 biased responses across the studied LLMs, with individual models yielding between 7,754 and 16,963 biased responses.
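As a quick sanity check on these figures, the per-attribute and per-role counts implied by the stated totals divide evenly; the sketch below makes the arithmetic explicit. The uniform split is an assumption inferred from the totals, not something the paper states.

```python
# Back-of-the-envelope check of the benchmark scale reported above. The
# per-attribute and per-role counts are inferred from the stated totals
# (550 roles, 11 attributes, 33,000 questions) under an assumed uniform
# split; the summary itself only reports the totals.
NUM_ATTRIBUTES = 11
ROLES_PER_ATTRIBUTE = 550 // NUM_ATTRIBUTES   # 50 roles per attribute
QUESTIONS_PER_ROLE = 33_000 // 550            # 60 questions per role

total_roles = NUM_ATTRIBUTES * ROLES_PER_ATTRIBUTE
total_questions = total_roles * QUESTIONS_PER_ROLE
assert (total_roles, total_questions) == (550, 33_000)
print(f"{total_roles} roles, {total_questions} questions")
```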
arXiv Detail & Related papers (2024-11-01T13:47:00Z)
- Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Prompt and Prejudice [29.35618753825668]
This paper investigates the impact of using first names in prompts to Large Language Models (LLMs) and Vision Language Models (VLMs).
We propose an approach that appends first names to ethically annotated text scenarios to reveal demographic biases in model outputs.
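A minimal sketch of such a name-substitution probe follows; the scenario template, name list, and query_model stub are hypothetical placeholders rather than the paper's actual protocol or data.

```python
# Minimal sketch of a name-substitution probe. The scenario template, name
# list, and query_model stub are hypothetical placeholders.
scenario = "{name} was stopped by security while leaving the store."
question = "Did this person do something wrong? Answer Yes or No."

# Hypothetical first names often associated with different demographics.
names = ["Emily", "Lakisha", "Mohammed", "Wei"]

def query_model(prompt: str) -> str:
    """Placeholder for a call to an LLM or VLM client of your choice."""
    raise NotImplementedError

for name in names:
    prompt = scenario.format(name=name) + " " + question
    # Wire query_model to a real model, then compare answers across names;
    # divergent answers indicate a demographic bias in the model's judgment.
    # answer = query_model(prompt)
    print(prompt)
```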
arXiv Detail & Related papers (2024-08-07T14:11:33Z)
- CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks. To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets. We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z)
- Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism [62.571419297164645]
This paper provides a systematic overview of prior works on the logical reasoning ability of large language models for analyzing categorical syllogisms. We first investigate all possible variations of categorical syllogisms from a purely logical perspective. We then examine the underlying configurations (i.e., mood and figure) tested by the existing datasets.
arXiv Detail & Related papers (2024-06-26T21:17:20Z)
- A Theory of LLM Sampling: Part Descriptive and Part Prescriptive [53.08398658452411]
Large Language Models (LLMs) are increasingly utilized in autonomous decision-making. We show that LLMs' sampling behavior resembles human decision-making in that samples deviate from the statistical norm toward a prescriptive component, and that this deviation consistently appears in concepts across diverse real-world domains.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
- STEER: Assessing the Economic Rationality of Large Language Models [21.91812661475551]
There is increasing interest in using LLMs as decision-making "agents"; determining whether an LLM agent is reliable enough to be trusted requires a methodology for assessing such an agent's economic rationality.
arXiv Detail & Related papers (2024-02-14T20:05:26Z)
- CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.