Related papers: Can LLMs Replace Economic Choice Prediction Labs? The Case of Language-based Persuasion Games

Emulating Aggregate Human Choice Behavior and Biases with GPT Conversational Agents [0.48439699124726004]
Large language models (LLMs) have been shown to reproduce well-known biases. We adapted three well-established decision scenarios into a conversational setting and conducted a human experiment. We found notable differences between models in how they aligned with human behavior.
arXiv Detail & Related papers (2026-02-05T12:33:05Z)
Beyond Survival: Evaluating LLMs in Social Deduction Games with Human-Aligned Strategies [54.08697738311866]
Social deduction games like Werewolf combine language, reasoning, and strategy. We curate a high-quality, human-verified multimodal Werewolf dataset containing over 100 hours of video, 32.4M utterance tokens, and 15 rule variants. We propose a novel strategy-alignment evaluation that leverages the winning faction's strategies as ground truth in two stages.
arXiv Detail & Related papers (2025-10-13T13:33:30Z)
Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona? [1.931250555574267]
We evaluate Large Language Models' ability to predict individual economic decision-making using Pay-What-You-Want pricing experiments with 522 real human personas. Results reveal that while LLMs struggle with precise individual-level predictions, they demonstrate reasonable group-level behavioral tendencies.
arXiv Detail & Related papers (2025-08-05T09:37:37Z)
Can LLMs effectively provide game-theoretic-based scenarios for cybersecurity? [51.96049148869987]
Large Language Models (LLMs) offer new tools and challenges for the security of computer systems. We investigate whether classical game-theoretic frameworks can effectively capture the behaviours of LLM-driven actors and bots.
arXiv Detail & Related papers (2025-08-04T08:57:14Z)
Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment [49.81946749379338]
This work seeks to analyze the capacity of Transformers-based systems to learn demographic biases present in the data. We propose a privacy-enhancing framework to reduce gender information from the learning pipeline as a way to mitigate biased behaviors in the final tools.
arXiv Detail & Related papers (2025-06-13T15:29:43Z)
Using Large Language Models for Expert Prior Elicitation in Predictive Modelling [53.54623137152208]
This study proposes using large language models (LLMs) to elicit expert prior distributions for predictive models. We compare LLM-elicited and uninformative priors, evaluate whether LLMs truthfully generate parameter distributions, and propose a model selection strategy for in-context learning and prior elicitation. Our findings show that LLM-elicited prior parameter distributions significantly reduce predictive error compared to uninformative priors in low-data settings.
arXiv Detail & Related papers (2024-11-26T10:13:39Z)
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews [65.35458530702442]
We focus on journalistic interviews, a domain rich in grounding communication and abundant in data. We curate a dataset of 40,000 two-person informational interviews from NPR and CNN. LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions.
arXiv Detail & Related papers (2024-11-21T01:37:38Z)
Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina [7.155982875107922]
Studies suggest large language models (LLMs) can exhibit human-like reasoning, aligning with human behavior in economic experiments, surveys, and political discourse. This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research. We assess the reasoning depth of LLMs using the 11-20 money request game.
arXiv Detail & Related papers (2024-10-25T14:46:07Z)
GLEE: A Unified Framework and Benchmark for Language-based Economic Environments [19.366120861935105]
Large Language Models (LLMs) show significant potential in economic and strategic interactions. These questions become crucial concerning the economic and societal implications of integrating LLM-based agents into real-world data-driven systems. We introduce a benchmark for standardizing research on two-player, sequential, language-based games.
arXiv Detail & Related papers (2024-10-07T17:55:35Z)
LLM economicus? Mapping the Behavioral Biases of LLMs via Utility Theory [20.79199807796242]
Utility theory is an approach to evaluate the economic biases of large language models. We find that the economic behavior of current LLMs is neither entirely human-like nor entirely economicus-like.
arXiv Detail & Related papers (2024-08-05T19:00:43Z)
LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science. Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z)
EconNLI: Evaluating Large Language Models on Economics Reasoning [22.754757518792395]
Large Language Models (LLMs) are widely used for writing economic analysis reports or providing financial advice. We propose a new dataset, natural language inference on economic events (EconNLI), to evaluate LLMs' knowledge and reasoning abilities in the economic domain. Our experiments reveal that LLMs are not sophisticated in economic reasoning and may generate wrong or hallucinated answers.
arXiv Detail & Related papers (2024-07-01T11:58:24Z)
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice [4.029252551781513]
We propose a novel way to enhance the utility of Large Language Models as cognitive models. We show that an LLM pretrained on an ecologically valid arithmetic dataset predicts human behavior better than many traditional cognitive models.
arXiv Detail & Related papers (2024-05-29T17:37:14Z)
Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions? [59.0123596591807]
We benchmark the ability of Large Language Models (LLMs) in persona-driven decision-making. We investigate whether LLMs can predict characters' decisions provided by the preceding stories in high-quality novels. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains.
arXiv Detail & Related papers (2024-04-18T12:40:59Z)
Large language models surpass human experts in predicting neuroscience results [60.26891446026707]
Large language models (LLMs) forecast novel results better than human experts. BrainBench is a benchmark for predicting neuroscience results. Our approach is not neuroscience-specific and is transferable to other knowledge-intensive endeavors.
arXiv Detail & Related papers (2024-03-04T15:27:59Z)
Limits of Large Language Models in Debating Humans [0.0]
Large Language Models (LLMs) have shown remarkable promise in their ability to interact proficiently with humans. This paper endeavors to test the limits of current-day LLMs with a pre-registered study integrating real people with LLM agents acting as people.
arXiv Detail & Related papers (2024-02-06T03:24:27Z)
Harnessing the Power of LLMs: Evaluating Human-AI Text Co-Creation through the Lens of News Headline Generation [58.31430028519306]
This study explores how humans can best leverage LLMs for writing and how interacting with these models affects feelings of ownership and trust in the writing process. While LLMs alone can generate satisfactory news headlines, on average, human control is needed to fix undesirable model outputs.
arXiv Detail & Related papers (2023-10-16T15:11:01Z)
Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines. We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.