Related papers: LLM-driven Imitation of Subrational Behavior : Illusion or Reality?

LLM-driven Imitation of Subrational Behavior : Illusion or Reality?

URL: http://arxiv.org/abs/2402.08755v1
Date: Tue, 13 Feb 2024 19:46:39 GMT
Title: LLM-driven Imitation of Subrational Behavior : Illusion or Reality?
Authors: Andrea Coletta, Kshama Dwarakanath, Penghang Liu, Svitlana Vyetrenko, Tucker Balch
Abstract summary: Existing work highlights the ability of Large Language Models to address complex reasoning tasks and mimic human communication. We propose to investigate the use of LLMs to generate synthetic human demonstrations, which are then used to learn subrational agent policies. We experimentally evaluate the ability of our framework to model sub-rationality through four simple scenarios.
Score: 3.2365468114603937
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modeling subrational agents, such as humans or economic households, is inherently challenging due to the difficulty in calibrating reinforcement learning models or collecting data that involves human subjects. Existing work highlights the ability of Large Language Models (LLMs) to address complex reasoning tasks and mimic human communication, while simulation using LLMs as agents shows emergent social behaviors, potentially improving our comprehension of human conduct. In this paper, we propose to investigate the use of LLMs to generate synthetic human demonstrations, which are then used to learn subrational agent policies though Imitation Learning. We make an assumption that LLMs can be used as implicit computational models of humans, and propose a framework to use synthetic demonstrations derived from LLMs to model subrational behaviors that are characteristic of humans (e.g., myopic behavior or preference for risk aversion). We experimentally evaluate the ability of our framework to model sub-rationality through four simple scenarios, including the well-researched ultimatum game and marshmallow experiment. To gain confidence in our framework, we are able to replicate well-established findings from prior human studies associated with the above scenarios. We conclude by discussing the potential benefits, challenges and limitations of our framework.

Related papers

SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users [70.02370111025617]
We introduce SocioVerse, an agent-driven world model for social simulation. Our framework features four powerful alignment components and a user pool of 10 million real individuals. Results demonstrate that SocioVerse can reflect large-scale population dynamics while ensuring diversity, credibility, and representativeness.
arXiv Detail & Related papers (2025-04-14T12:12:52Z)
From ChatGPT to DeepSeek: Can LLMs Simulate Humanity? [32.93460040317926]
Large Language Models (LLMs) have become a promising method for exploring complex human social behaviors. Recent studies highlight discrepancies between simulated and real-world interactions.
arXiv Detail & Related papers (2025-02-25T13:54:47Z)
Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina [7.155982875107922]
Studies suggest large language models (LLMs) can exhibit human-like reasoning, aligning with human behavior in economic experiments, surveys, and political discourse. This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research. We assess the reasoning depth of LLMs using the 11-20 money request game.
arXiv Detail & Related papers (2024-10-25T14:46:07Z)
Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making. Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations. Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development. We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence. WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z)
Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation [5.730580726163518]
Large language models (LLMs) have demonstrated remarkable capabilities comparable to humans in fields such as mathematics, law, coding, common sense, and world knowledge. We propose a novel reasoning framework, termed Mosaic Expert Observation Wall'' (MEOW) exploiting generative-agents-based simulation technique.
arXiv Detail & Related papers (2024-03-27T03:33:32Z)
Human Simulacra: Benchmarking the Personification of Large Language Models [38.21708264569801]
Large language models (LLMs) are recognized as systems that closely mimic aspects of human intelligence. This paper introduces a framework for constructing virtual characters' life stories from the ground up. Experimental results demonstrate that our constructed simulacra can produce personified responses that align with their target characters.
arXiv Detail & Related papers (2024-02-28T09:11:14Z)
Systematic Biases in LLM Simulations of Debates [12.933509143906141]
We study the limitations of Large Language Models in simulating human interactions. Our findings indicate a tendency for LLM agents to conform to the model's inherent social biases. These results underscore the need for further research to develop methods that help agents overcome these biases.
arXiv Detail & Related papers (2024-02-06T14:51:55Z)
CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations [61.9212914612875]
We present a framework to characterize LLM simulations using four dimensions: Context, Model, Persona, and Topic. We use this framework to measure open-ended LLM simulations' susceptibility to caricature, defined via two criteria: individuation and exaggeration. We find that for GPT-4, simulations of certain demographics (political and marginalized groups) and topics (general, uncontroversial) are highly susceptible to caricature.
arXiv Detail & Related papers (2023-10-17T18:00:25Z)
SALMON: Self-Alignment with Instructable Reward Models [80.83323636730341]
This paper presents a novel approach, namely SALMON, to align base language models with minimal human supervision. We develop an AI assistant named Dromedary-2 with only 6 exemplars for in-context learning and 31 human-defined principles.
arXiv Detail & Related papers (2023-10-09T17:56:53Z)
User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors. Based on extensive experiments, we find that the simulated behaviors of our method are very close to the ones of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z)
Large Language Models as Zero-Shot Human Models for Human-Robot Interaction [12.455647753787442]
Large-language models (LLMs) can act as zero-shot human models for human-robot interaction. LLMs achieve performance comparable to purpose-built models. We present one case study on a simulated trust-based table-clearing task.
arXiv Detail & Related papers (2023-03-06T23:16:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.