Related papers: Are Human Interactions Replicable by Generative Agents? A Case Study on Pronoun Usage in Hierarchical Interactions

Are Human Interactions Replicable by Generative Agents? A Case Study on Pronoun Usage in Hierarchical Interactions

URL: http://arxiv.org/abs/2501.15283v1
Date: Sat, 25 Jan 2025 17:42:47 GMT
Title: Are Human Interactions Replicable by Generative Agents? A Case Study on Pronoun Usage in Hierarchical Interactions
Authors: Naihao Deng, Rada Mihalcea,
Abstract summary: We investigate whether interactions among Large Language Models (LLMs) agents resemble those of humans.<n>Our evaluation reveals the significant discrepancies between LLM-based simulations and human pronoun usage.<n>We urge caution in using such social simulation in practitioners' decision-making process.
Score: 29.139828718538418
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: As Large Language Models (LLMs) advance in their capabilities, researchers have increasingly employed them for social simulation. In this paper, we investigate whether interactions among LLM agents resemble those of humans. Specifically, we focus on the pronoun usage difference between leaders and non-leaders, examining whether the simulation would lead to human-like pronoun usage patterns during the LLMs' interactions. Our evaluation reveals the significant discrepancies between LLM-based simulations and human pronoun usage, with prompt-based or specialized agents failing to demonstrate human-like pronoun usage patterns. In addition, we reveal that even if LLMs understand the human pronoun usage patterns, they fail to demonstrate them in the actual interaction process. Our study highlights the limitations of social simulations based on LLM agents, urging caution in using such social simulation in practitioners' decision-making process.

Related papers

Leveraging LLM-based agents for social science research: insights from citation network simulations [132.4334196445918]
We introduce the CiteAgent framework, designed to generate citation networks based on human-behavior simulation.<n>CiteAgent captures predominant phenomena in real-world citation networks, including power-law distribution, citational distortion, and shrinking diameter.<n>We establish two LLM-based research paradigms in social science, allowing us to validate and challenge existing theories.
arXiv Detail & Related papers (2025-11-05T08:47:04Z)
Social Simulations with Large Language Model Risk Utopian Illusion [61.358959720048354]
We introduce a systematic framework for analyzing large language models' behavior in social simulation.<n>Our approach simulates multi-agent interactions through chatroom-style conversations and analyzes them across five linguistic dimensions.<n>Our findings reveal that LLMs do not faithfully reproduce genuine human behavior but instead reflect overly idealized versions of it.
arXiv Detail & Related papers (2025-10-24T06:08:41Z)
How large language models judge and influence human cooperation [82.07571393247476]
We assess how state-of-the-art language models judge cooperative actions.<n>We observe a remarkable agreement in evaluating cooperation against good opponents.<n>We show that the differences revealed between models can significantly impact the prevalence of cooperation.
arXiv Detail & Related papers (2025-06-30T09:14:42Z)
Don't Trust Generative Agents to Mimic Communication on Social Networks Unless You Benchmarked their Empirical Realism [1.734165485480267]
We focus on replicating the behavior of social network users with the use of Large Language Models.<n>We empirically test different approaches to imitate user behavior on X in English and German.<n>Our findings suggest that social simulations should be validated by their empirical realism measured in the setting in which the simulation components were fitted.
arXiv Detail & Related papers (2025-06-27T07:32:16Z)
SocialEval: Evaluating Social Intelligence of Large Language Models [70.90981021629021]
Social Intelligence (SI) equips humans with interpersonal abilities to behave wisely in navigating social interactions to achieve social goals.<n>This presents an operational evaluation paradigm: outcome-oriented goal achievement evaluation and process-oriented interpersonal ability evaluation.<n>We propose SocialEval, a script-based bilingual SI benchmark, integrating outcome- and process-oriented evaluation by manually crafting narrative scripts.
arXiv Detail & Related papers (2025-06-01T08:36:51Z)
Steering Prosocial AI Agents: Computational Basis of LLM's Decision Making in Social Simulation [7.504095239018173]
Large language models (LLMs) increasingly serve as human-like decision-making agents in social science and applied settings. This study proposes and tests methods for probing, quantifying, and modifying an LLM's internal representations in a Dictator Game. Manipulating these vectors during the model's inference can substantially alter how those variables relate to the model's decision-making.
arXiv Detail & Related papers (2025-04-16T00:02:28Z)
If an LLM Were a Character, Would It Know Its Own Story? Evaluating Lifelong Learning in LLMs [55.8331366739144]
We introduce LIFESTATE-BENCH, a benchmark designed to assess lifelong learning in large language models (LLMs) Our fact checking evaluation probes models' self-awareness, episodic memory retrieval, and relationship tracking, across both parametric and non-parametric approaches.
arXiv Detail & Related papers (2025-03-30T16:50:57Z)
Spontaneous Emergence of Agent Individuality through Social Interactions in LLM-Based Communities [0.0]
We study the emergence of agency from scratch by using Large Language Model (LLM)-based agents. By analyzing this multi-agent simulation, we report valuable new insights into how social norms, cooperation, and personality traits can emerge spontaneously.
arXiv Detail & Related papers (2024-11-05T16:49:33Z)
Real or Robotic? Assessing Whether LLMs Accurately Simulate Qualities of Human Responses in Dialogue [25.89926022671521]
We generate a large-scale dataset of 100,000 paired LLM-LLM and human-LLM dialogues from the WildChat dataset. We find relatively low alignment between simulations and human interactions, demonstrating a systematic divergence along the multiple textual properties.
arXiv Detail & Related papers (2024-09-12T18:00:18Z)
PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development. We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
LLM-driven Imitation of Subrational Behavior : Illusion or Reality? [3.2365468114603937]
Existing work highlights the ability of Large Language Models to address complex reasoning tasks and mimic human communication. We propose to investigate the use of LLMs to generate synthetic human demonstrations, which are then used to learn subrational agent policies. We experimentally evaluate the ability of our framework to model sub-rationality through four simple scenarios.
arXiv Detail & Related papers (2024-02-13T19:46:39Z)
Systematic Biases in LLM Simulations of Debates [12.933509143906141]
We study the limitations of Large Language Models in simulating human interactions.<n>Our findings indicate a tendency for LLM agents to conform to the model's inherent social biases.<n>These results underscore the need for further research to develop methods that help agents overcome these biases.
arXiv Detail & Related papers (2024-02-06T14:51:55Z)
Do LLM Agents Exhibit Social Behavior? [5.094340963261968]
State-Understanding-Value-Action (SUVA) is a framework to systematically analyze responses in social contexts. It assesses social behavior through both their final decisions and the response generation processes leading to those decisions. We demonstrate that utterance-based reasoning reliably predicts LLMs' final actions.
arXiv Detail & Related papers (2023-12-23T08:46:53Z)
Divergences between Language Models and Human Brains [59.100552839650774]
We systematically explore the divergences between human and machine language processing.<n>We identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense.<n>Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
arXiv Detail & Related papers (2023-11-15T19:02:40Z)
Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all. We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires. Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations [61.9212914612875]
We present a framework to characterize LLM simulations using four dimensions: Context, Model, Persona, and Topic. We use this framework to measure open-ended LLM simulations' susceptibility to caricature, defined via two criteria: individuation and exaggeration. We find that for GPT-4, simulations of certain demographics (political and marginalized groups) and topics (general, uncontroversial) are highly susceptible to caricature.
arXiv Detail & Related papers (2023-10-17T18:00:25Z)
AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems [112.76941157194544]
We propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering. We creatively consider not only users but also items as agents, and develop a collaborative learning approach that optimize both kinds of agents together. Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions.
arXiv Detail & Related papers (2023-10-13T16:37:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.