Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona?
- URL: http://arxiv.org/abs/2508.03262v1
- Date: Tue, 05 Aug 2025 09:37:37 GMT
- Title: Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona?
- Authors: Junhyuk Choi, Hyeonchu Park, Haemin Lee, Hyebeen Shin, Hyun Joung Jin, Bugeun Kim
- Abstract summary: We evaluate Large Language Models' ability to predict individual economic decision-making using Pay-What-You-Want pricing experiments with 522 real human personas. Results reveal that while LLMs struggle with precise individual-level predictions, they demonstrate reasonable group-level behavioral tendencies.
- Score: 1.931250555574267
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in Large Language Models (LLMs) have generated significant interest in their capacity to simulate human-like behaviors, yet most studies rely on fictional personas rather than actual human data. We address this limitation by evaluating LLMs' ability to predict individual economic decision-making using Pay-What-You-Want (PWYW) pricing experiments with 522 real human personas. Our study systematically compares three state-of-the-art multimodal LLMs using detailed persona information from 522 Korean participants in cultural consumption scenarios. We investigate whether LLMs can accurately replicate individual human choices and how persona injection methods affect prediction performance. Results reveal that while LLMs struggle with precise individual-level predictions, they demonstrate reasonable group-level behavioral tendencies. We also found that commonly adopted prompting techniques offer little benefit over naive prompting: neither reconstruction of a personal narrative nor retrieval-augmented generation yields a significant gain over simple prompting. We believe these findings provide the first comprehensive evaluation of LLMs' capabilities in simulating economic behavior using real human data, offering empirical guidance for persona-based simulation in computational social science.
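To make the persona-injection and evaluation setup concrete, here is a minimal Python sketch. This is not the authors' code: the helper names (`build_persona_prompt`, `evaluate`), the prompt wording, and all numbers are illustrative assumptions. It contrasts individual-level prediction error with group-level behavioral tendency, the distinction the abstract draws.

```python
import statistics

def build_persona_prompt(persona: dict, scenario: str) -> str:
    """Naive persona injection: demographic fields are listed verbatim.
    (Hypothetical format; the paper also tests narrative reconstruction
    and retrieval-augmented variants, finding no significant gain.)"""
    profile = "; ".join(f"{key}: {value}" for key, value in persona.items())
    return (
        f"You are a survey participant with this profile: {profile}\n"
        f"Scenario: {scenario}\n"
        "Under Pay-What-You-Want pricing, state the price you would pay."
    )

def evaluate(predicted: list[float], actual: list[float]) -> dict[str, float]:
    """Score predicted PWYW prices against real choices at two levels."""
    return {
        # Per-person error: the level where the paper finds LLMs struggle.
        "individual_mae": statistics.mean(
            abs(p - a) for p, a in zip(predicted, actual, strict=True)
        ),
        # Aggregate gaps: group-level tendencies can look reasonable
        # even when individual predictions are poor.
        "group_mean_gap": abs(statistics.mean(predicted) - statistics.mean(actual)),
        "group_stdev_gap": abs(statistics.stdev(predicted) - statistics.stdev(actual)),
    }

# Toy usage with invented values (not data from the paper):
persona = {"age": 34, "gender": "female", "monthly_culture_spend_krw": 50000}
print(build_persona_prompt(
    persona, "A gallery lets each visitor name their own admission price."))
print(evaluate(predicted=[3000.0, 5000.0, 0.0, 4000.0],
               actual=[1000.0, 6000.0, 0.0, 5000.0]))
```

Note that a model could match the group mean almost exactly while remaining far off for each individual, which is why the two levels are reported separately.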
Related papers
- OPeRA: A Dataset of Observation, Persona, Rationale, and Action for Evaluating LLMs on Human Online Shopping Behavior Simulation [56.47029531207105]
OPeRA is the first public dataset that comprehensively captures user personas, browser observations, fine-grained web actions, and self-reported just-in-time rationales. We establish the first benchmark to evaluate how well current LLMs can predict a specific user's next action and rationale.
arXiv Detail & Related papers (2025-06-05T21:37:49Z)
- LLM Social Simulations Are a Promising Research Method [4.6456873975541635]
We argue that the promise of LLM social simulations can be achieved by addressing five tractable challenges. We believe that LLM social simulations can already be used for pilot and exploratory studies. Researchers should prioritize developing conceptual models and iterative evaluations to make the best use of new AI systems.
arXiv Detail & Related papers (2025-04-03T03:01:26Z)
- Prompting is Not All You Need! Evaluating LLM Agent Simulation Methodologies with Real-World Online Customer Behavior Data [62.61900377170456]
We focus on evaluating LLMs' "objective accuracy" rather than subjective "believability" in simulating human behavior. We present the first comprehensive evaluation of state-of-the-art LLMs on the task of web shopping action generation.
arXiv Detail & Related papers (2025-03-26T17:33:27Z)
- LLM Generated Persona is a Promise with a Catch [18.45442859688198]
Persona-based simulations hold promise for transforming disciplines that rely on population-level feedback. Traditional methods of collecting realistic persona data face challenges: they are prohibitively expensive and logistically difficult due to privacy constraints.
arXiv Detail & Related papers (2025-03-18T03:11:27Z)
- Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation [51.44040615856536]
This paper analyzes large language models' ability to simulate social media engagement through action-guided response generation. We benchmark GPT-4o-mini, O1-mini, and DeepSeek-R1 on simulating social media engagement around a major societal event.
arXiv Detail & Related papers (2025-02-17T17:43:08Z)
- Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina [7.155982875107922]
Studies suggest large language models (LLMs) can exhibit human-like reasoning, aligning with human behavior in economic experiments, surveys, and political discourse. This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research. We assess the reasoning depth of LLMs using the 11-20 money request game.
arXiv Detail & Related papers (2024-10-25T14:46:07Z)
- Agentic Society: Merging skeleton from real world and texture from Large Language Model [4.740886789811429]
This paper explores a novel framework that leverages census data and large language models to generate virtual populations.
We show that our method produces personas with variability essential for simulating diverse human behaviors in social science experiments.
However, the evaluation shows that only weak signs of statistical truthfulness can be produced, due to the limited capability of current LLMs.
arXiv Detail & Related papers (2024-09-02T08:28:19Z)
- Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions? [59.0123596591807]
We benchmark the ability of Large Language Models (LLMs) in persona-driven decision-making.
We investigate whether LLMs can predict characters' decisions when provided with the preceding stories in high-quality novels.
The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains.
arXiv Detail & Related papers (2024-04-18T12:40:59Z)
- Can LLMs Replace Economic Choice Prediction Labs? The Case of Language-based Persuasion Games [22.01549425007543]
We show that trained models can effectively predict human behavior in language-based persuasion games.
Our experiments show that models trained on LLM-generated data can even outperform models trained on actual human data.
arXiv Detail & Related papers (2024-01-30T20:49:47Z)
- Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
- Can ChatGPT Assess Human Personalities? A General Evaluation Framework [70.90142717649785]
Large Language Models (LLMs) have produced impressive results in various areas, but their potential human-like psychology is still largely unexplored.
This paper presents a generic evaluation framework for LLMs to assess human personalities based on Myers-Briggs Type Indicator (MBTI) tests.
arXiv Detail & Related papers (2023-03-01T06:16:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.