Position: Simulating Society Requires Simulating Thought
- URL: http://arxiv.org/abs/2506.06958v1
- Date: Sun, 08 Jun 2025 00:59:02 GMT
- Title: Position: Simulating Society Requires Simulating Thought
- Authors: Chance Jiajie Li, Jiayi Wu, Zhenze Mo, Ao Qu, Yuhan Tang, Kaiya Ivy Zhao, Yulu Gan, Jie Fan, Jiangbo Yu, Jinhua Zhao, Paul Liang, Luis Alonso, Kent Larson
- Abstract summary: Simulating society with large language models (LLMs) requires cognitively grounded reasoning that is structured, revisable, and traceable. We present a conceptual modeling paradigm, Generative Minds (GenMinds), which draws from cognitive science to support structured belief representations in generative agents. These contributions advance a broader shift: from surface-level mimicry to generative agents that simulate thought -- not just language -- for social simulations.
- Score: 9.150119344618497
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Simulating society with large language models (LLMs), we argue, requires more than generating plausible behavior -- it demands cognitively grounded reasoning that is structured, revisable, and traceable. LLM-based agents are increasingly used to emulate individual and group behavior -- primarily through prompting and supervised fine-tuning. Yet they often lack internal coherence, causal reasoning, and belief traceability -- making them unreliable for analyzing how people reason, deliberate, or respond to interventions. To address this, we present a conceptual modeling paradigm, Generative Minds (GenMinds), which draws from cognitive science to support structured belief representations in generative agents. To evaluate such agents, we introduce the RECAP (REconstructing CAusal Paths) framework, a benchmark designed to assess reasoning fidelity via causal traceability, demographic grounding, and intervention consistency. These contributions advance a broader shift: from surface-level mimicry to generative agents that simulate thought -- not just language -- for social simulations.
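GenMinds and RECAP are presented as conceptual frameworks, so the abstract names no concrete API. Purely as a minimal sketch of what "structured, revisable, and traceable" beliefs could look like, the hypothetical `Belief` node and `trace` helper below reconstruct the causal path behind an agent's conclusion; all names and the schema itself are invented for illustration, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Belief:
    """One node in an agent's belief graph (hypothetical schema)."""
    claim: str
    confidence: float                            # revisable, in [0, 1]
    parents: list = field(default_factory=list)  # causal antecedents

def trace(belief: Belief) -> list:
    """Reconstruct the causal path behind a belief, antecedents first,
    in the spirit of RECAP-style causal traceability."""
    path = []
    for parent in belief.parents:
        path += trace(parent)
    return path + [belief.claim]

# Toy example: a mobility decision grounded in two antecedent beliefs.
fares_up = Belief("Transit fares increased", 0.9)
budget_tight = Belief("Household budget is tight", 0.8)
drives_more = Belief("Agent chooses to drive", 0.7,
                     parents=[fares_up, budget_tight])

print(trace(drives_more))
# -> ['Transit fares increased', 'Household budget is tight', 'Agent chooses to drive']
```

In this scheme, an intervention-consistency check would edit one antecedent (e.g. lower `fares_up.confidence`), re-derive downstream beliefs, and ask whether the agent's stated conclusion changes accordingly.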
Related papers
- LLM-Based Social Simulations Require a Boundary [3.351170542925928]
This position paper argues that large language model (LLM)-based social simulations should establish clear boundaries. We examine three key boundary problems: alignment (simulated behaviors matching real-world patterns), consistency (maintaining coherent agent behavior over time), and robustness.
arXiv Detail & Related papers (2025-06-24T17:14:47Z)
- The Traitors: Deception and Trust in Multi-Agent Language Model Simulations [0.0]
We introduce The Traitors, a multi-agent simulation framework inspired by social deduction games. We develop a suite of evaluation metrics capturing deception success, trust dynamics, and collective inference quality. Our initial experiments across DeepSeek-V3, GPT-4o-mini, and GPT-4o (10 runs per model) reveal a notable asymmetry.
arXiv Detail & Related papers (2025-05-19T10:01:35Z)
- Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models [76.6028674686018]
We introduce thought-tracing, an inference-time reasoning algorithm to trace the mental states of agents. Our algorithm is modeled after the Bayesian theory-of-mind framework. We evaluate thought-tracing on diverse theory-of-mind benchmarks, demonstrating significant performance improvements.
arXiv Detail & Related papers (2025-02-17T15:08:50Z)
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage. Models may behave unreliably due to poorly explored failure modes. Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
- Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection [31.38516078163367]
ToM-agent is designed to empower LLM-based generative agents to simulate ToM in open-domain conversational interactions. ToM-agent disentangles confidence from mental states, facilitating the emulation of an agent's perception of its counterpart's mental states. Our findings indicate that the ToM-agent can grasp the underlying reasons for its counterpart's behaviors beyond mere semantic-emotional support or common-sense decision-making.
arXiv Detail & Related papers (2025-01-26T00:32:38Z)
- Emergence of human-like polarization among large language model agents [79.96817421756668]
We simulate a networked system involving thousands of large language model agents and discover that their social interactions result in human-like polarization. The similarities between humans and LLM agents raise concerns about their capacity to amplify societal polarization, but they also hold potential as a valuable testbed for identifying plausible strategies to mitigate polarization and its consequences.
arXiv Detail & Related papers (2025-01-09T11:45:05Z)
- Generative Agent Simulations of 1,000 People [56.82159813294894]
We present a novel agent architecture that simulates the attitudes and behaviors of 1,052 real individuals.
The generative agents replicate participants' responses on the General Social Survey 85% as accurately as participants replicate their own answers.
Our architecture reduces accuracy biases across racial and ideological groups compared to agents given demographic descriptions.
arXiv Detail & Related papers (2024-11-15T11:14:34Z)
- Failure Modes of LLMs for Causal Reasoning on Narratives [51.19592551510628]
We investigate the interaction between world knowledge and logical reasoning. We find that state-of-the-art large language models (LLMs) often rely on superficial generalizations. We show that simple reformulations of the task can elicit more robust reasoning behavior.
arXiv Detail & Related papers (2024-10-31T12:48:58Z)
- Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents [18.961470450132637]
This paper emphasizes the importance of spontaneous phenomena, wherein agents deeply engage in contexts and make adaptive decisions without explicit directions.
We explored spontaneous cooperation across three competitive scenarios and successfully simulated the gradual emergence of cooperation.
arXiv Detail & Related papers (2024-02-19T18:00:53Z)
- How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation [46.42384207122049]
We design SimulateBench to evaluate the believability of large language models (LLMs) when simulating human behaviors.
Based on SimulateBench, we evaluate the performances of 10 widely used LLMs when simulating characters.
arXiv Detail & Related papers (2023-12-28T16:51:11Z)
- Simulating Opinion Dynamics with Networks of LLM-based Agents [7.697132934635411]
We propose a new approach to simulating opinion dynamics based on populations of Large Language Models (LLMs).
Our findings reveal a strong inherent bias in LLM agents towards producing accurate information, leading simulated agents to consensus in line with scientific reality.
After inducing confirmation bias through prompt engineering, however, we observed opinion fragmentation in line with existing agent-based modeling and opinion dynamics research.
arXiv Detail & Related papers (2023-11-16T07:01:48Z)
- From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions.
Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z)
- CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is to introduce "agency", such that it is simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z)
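Several of the papers above (the polarization and opinion-dynamics entries) study how networked agents drift toward consensus or fragmentation. Those works use LLM agents; purely as a structural sketch, the classical Deffuant bounded-confidence update below reproduces the same qualitative dynamic without any model calls. A wide confidence bound yields consensus, while a narrow one yields fragmentation, loosely analogous to the confirmation-bias effect reported above. All parameter values here are illustrative, not taken from any of the listed papers.

```python
import random
import statistics

def deffuant_step(opinions, epsilon=0.3, mu=0.5):
    """One bounded-confidence update: two random agents move toward
    each other only if their opinions differ by less than epsilon."""
    i, j = random.sample(range(len(opinions)), 2)
    if abs(opinions[i] - opinions[j]) < epsilon:
        shift = mu * (opinions[j] - opinions[i])
        opinions[i] += shift
        opinions[j] -= shift

random.seed(0)
opinions = [random.random() for _ in range(50)]   # opinions in [0, 1]
initial_var = statistics.pvariance(opinions)
for _ in range(5000):
    deffuant_step(opinions)
final_var = statistics.pvariance(opinions)
# Accepted updates contract pairs toward their midpoint, so the
# population variance shrinks as clusters (or consensus) form.
```

With a narrower bound such as `epsilon=0.1`, the same loop typically leaves several persistent opinion clusters instead of a single consensus.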
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.