Steering Prosocial AI Agents: Computational Basis of LLM's Decision Making in Social Simulation
- URL: http://arxiv.org/abs/2504.11671v1
- Date: Wed, 16 Apr 2025 00:02:28 GMT
- Title: Steering Prosocial AI Agents: Computational Basis of LLM's Decision Making in Social Simulation
- Authors: Ji Ma
- Abstract summary: Large language models (LLMs) increasingly serve as human-like decision-making agents in social science and applied settings. This study proposes and tests methods for probing, quantifying, and modifying an LLM's internal representations in a Dictator Game. Manipulating these vectors during the model's inference can substantially alter how those variables relate to the model's decision-making.
- Score: 7.504095239018173
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) increasingly serve as human-like decision-making agents in social science and applied settings. These LLM-agents are typically assigned human-like characters and placed in real-life contexts. However, how these characters and contexts shape an LLM's behavior remains underexplored. This study proposes and tests methods for probing, quantifying, and modifying an LLM's internal representations in a Dictator Game -- a classic behavioral experiment on fairness and prosocial behavior. We extract "vectors of variable variations" (e.g., "male" to "female") from the LLM's internal state. Manipulating these vectors during the model's inference can substantially alter how those variables relate to the model's decision-making. This approach offers a principled way to study and regulate how social concepts can be encoded and engineered within transformer-based models, with implications for alignment, debiasing, and designing AI agents for social simulations in both academic and commercial applications.
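The intervention the abstract describes, extracting a direction of variation from internal states and adding it back during inference, is commonly implemented as difference-of-means activation steering. Below is a minimal, hypothetical sketch of that general technique with a Hugging Face decoder-only model; the model name, layer index, prompts, and scaling factor are illustrative assumptions rather than the paper's actual protocol.

```python
# Hypothetical sketch of difference-of-means activation steering.
# Model name, layer index, prompts, and the scaling factor alpha are
# illustrative assumptions, not the paper's protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # any decoder-only chat model
LAYER = 15                                   # assumed mid-depth decoder layer

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)
model.eval()

def mean_hidden(prompts, layer):
    """Mean hidden state at the final token, taken at the output of `layer`."""
    vecs = []
    for p in prompts:
        ids = tok(p, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # hidden_states[0] is the embedding output, so layer L is index L + 1.
        vecs.append(out.hidden_states[layer + 1][0, -1])
    return torch.stack(vecs).mean(dim=0)

# Contrastive prompts that differ only in the variable of interest.
male_prompts = ["You are a 40-year-old man deciding how to split $100."]
female_prompts = ["You are a 40-year-old woman deciding how to split $100."]
steer = mean_hidden(female_prompts, LAYER) - mean_hidden(male_prompts, LAYER)

def add_vector(module, inputs, output, alpha=4.0):
    """Forward hook: shift the layer's output along the steering direction."""
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + alpha * steer.to(hidden.dtype)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.model.layers[LAYER].register_forward_hook(add_vector)
prompt = "You are deciding how to split $100 with a stranger. You keep $"
ids = tok(prompt, return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=10)[0]))
handle.remove()  # restore the unmodified model
```

In this variant the steering strength alpha, the chosen layer, and the contrastive prompt sets are the main free parameters; sweeping alpha is the usual way to trace how strongly the encoded variable shifts the model's allocation decisions.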
Related papers
- SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users [59.39512775126668]
We introduce SocioVerse, an agent-driven world model for social simulation. Our framework features four powerful alignment components and a user pool of 10 million real individuals. Results demonstrate that SocioVerse can reflect large-scale population dynamics while ensuring diversity, credibility, and representativeness.
arXiv Detail & Related papers (2025-04-14T12:12:52Z) - Are Human Interactions Replicable by Generative Agents? A Case Study on Pronoun Usage in Hierarchical Interactions [29.139828718538418]
We investigate whether interactions among Large Language Model (LLM) agents resemble those of humans.
Our evaluation reveals significant discrepancies between LLM-based simulations and human pronoun usage.
We urge caution in using such social simulations in practitioners' decision-making processes.
arXiv Detail & Related papers (2025-01-25T17:42:47Z) - Sense and Sensitivity: Evaluating the simulation of social dynamics via Large Language Models [27.313165173789233]
Large language models have been proposed as a powerful replacement for classical agent-based models (ABMs) to simulate social dynamics. However, due to the black box nature of LLMs, it is unclear whether LLM agents actually execute the intended semantics. We show that while it is possible to engineer prompts that approximate the intended dynamics, the quality of these simulations is highly sensitive to the particular choice of prompts.
arXiv Detail & Related papers (2024-12-06T14:50:01Z) - Can Machines Think Like Humans? A Behavioral Evaluation of LLM-Agents in Dictator Games [7.504095239018173]
Large Language Model (LLM)-based agents increasingly undertake real-world tasks and engage with human society. We investigate how different personas and experimental framings affect these AI agents' altruistic behavior in dictator games. We show that assigning a human-like identity to LLMs does not produce human-like behaviors.
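For readers unfamiliar with the setup, a persona-conditioned Dictator Game experiment with LLM agents can be run with nothing more than prompting and answer parsing. The sketch below is a hypothetical illustration against an OpenAI-compatible chat API; the persona strings, model name, and single-integer reply format are assumptions, not the cited paper's design.

```python
# Hypothetical persona-conditioned Dictator Game query.
# Personas, model name, and parsing rule are illustrative assumptions.
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def dictator_allocation(persona, endowment=100, model="gpt-4o-mini"):
    """Ask a persona-conditioned agent how much of the endowment it gives away."""
    prompt = (
        f"You are {persona}. You have ${endowment} and are paired with an "
        f"anonymous stranger who received nothing. Decide how many dollars "
        f"to give to the stranger. Reply with a single integer."
    )
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
    ).choices[0].message.content
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else None

for persona in ["a 30-year-old nurse", "a 55-year-old investment banker"]:
    print(persona, dictator_allocation(persona))
```

Repeating the query many times per persona and framing yields the allocation distributions that behavioral comparisons of this kind are based on.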
arXiv Detail & Related papers (2024-10-28T17:47:41Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs).
We suggest investigating internal activations and quantifying an LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
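Local intrinsic dimension here refers to a neighbourhood-based estimate of the dimensionality of the activation manifold around a given point. A minimal sketch of the widely used k-nearest-neighbour maximum-likelihood (Levina-Bickel style) estimator follows; the synthetic activation matrix and the choice of k are placeholders, and this is not necessarily the exact estimator used in the cited paper.

```python
# Hypothetical k-NN maximum-likelihood local intrinsic dimension (LID) estimate.
# The activation matrix and k are placeholders, not the cited paper's setup.
import numpy as np

def local_intrinsic_dimension(activations, query, k=20):
    """Estimate the LID of `query` (1D vector) within a set of activation vectors."""
    dists = np.linalg.norm(activations - query, axis=1)
    dists = np.sort(dists)[: k + 1]   # nearest neighbours, including r_k
    dists = dists[dists > 0]          # drop the zero distance to the query itself
    r_k = dists[-1]
    # MLE estimator: inverse of the mean log-ratio log(r_k / r_i) over inner neighbours
    return -1.0 / np.mean(np.log(dists[:-1] / r_k))

rng = np.random.default_rng(0)
acts = rng.normal(size=(2000, 32))    # stand-in for collected hidden activations
print(local_intrinsic_dimension(acts, acts[0], k=20))
```

The estimator only needs the distances to a point's k nearest neighbours, so it can be applied directly to layer activations collected during generation.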
arXiv Detail & Related papers (2024-02-28T04:56:21Z) - A Theory of LLM Sampling: Part Descriptive and Part Prescriptive [53.08398658452411]
Large Language Models (LLMs) are increasingly utilized in autonomous decision-making.
We show that this sampling behavior resembles human decision-making: samples deviate from the statistical norm towards a prescriptive component, and this deviation appears consistently in concepts across diverse real-world domains.
arXiv Detail & Related papers (2024-02-16T18:28:43Z) - LLM-driven Imitation of Subrational Behavior: Illusion or Reality? [3.2365468114603937]
Existing work highlights the ability of Large Language Models to address complex reasoning tasks and mimic human communication.
We propose to investigate the use of LLMs to generate synthetic human demonstrations, which are then used to learn subrational agent policies.
We experimentally evaluate the ability of our framework to model sub-rationality through four simple scenarios.
arXiv Detail & Related papers (2024-02-13T19:46:39Z) - Systematic Biases in LLM Simulations of Debates [12.933509143906141]
We study the limitations of Large Language Models in simulating human interactions. Our findings indicate a tendency for LLM agents to conform to the model's inherent social biases. These results underscore the need for further research to develop methods that help agents overcome these biases.
arXiv Detail & Related papers (2024-02-06T14:51:55Z) - Do LLM Agents Exhibit Social Behavior? [5.094340963261968]
State-Understanding-Value-Action (SUVA) is a framework to systematically analyze LLMs' responses in social contexts.
It assesses social behavior through both the models' final decisions and the response generation processes leading to those decisions.
We demonstrate that utterance-based reasoning reliably predicts LLMs' final actions.
arXiv Detail & Related papers (2023-12-23T08:46:53Z) - CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z) - Training Socially Aligned Language Models on Simulated Social Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z)