Related papers: Systematic Biases in LLM Simulations of Debates

Systematic Biases in LLM Simulations of Debates

URL: http://arxiv.org/abs/2402.04049v2
Date: Sat, 28 Sep 2024 11:27:06 GMT
Title: Systematic Biases in LLM Simulations of Debates
Authors: Amir Taubenfeld, Yaniv Dover, Roi Reichart, Ariel Goldstein,
Abstract summary: We study the limitations of Large Language Models in simulating human interactions. Our findings indicate a tendency for LLM agents to conform to the model's inherent social biases. These results underscore the need for further research to develop methods that help agents overcome these biases.
Score: 12.933509143906141
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The emergence of Large Language Models (LLMs), has opened exciting possibilities for constructing computational simulations designed to replicate human behavior accurately. Current research suggests that LLM-based agents become increasingly human-like in their performance, sparking interest in using these AI agents as substitutes for human participants in behavioral studies. However, LLMs are complex statistical learners without straightforward deductive rules, making them prone to unexpected behaviors. Hence, it is crucial to study and pinpoint the key behavioral distinctions between humans and LLM-based agents. In this study, we highlight the limitations of LLMs in simulating human interactions, particularly focusing on LLMs' ability to simulate political debates on topics that are important aspects of people's day-to-day lives and decision-making processes. Our findings indicate a tendency for LLM agents to conform to the model's inherent social biases despite being directed to debate from certain political perspectives. This tendency results in behavioral patterns that seem to deviate from well-established social dynamics among humans. We reinforce these observations using an automatic self-fine-tuning method, which enables us to manipulate the biases within the LLM and demonstrate that agents subsequently align with the altered biases. These results underscore the need for further research to develop methods that help agents overcome these biases, a critical step toward creating more realistic simulations.

Related papers

Modeling Earth-Scale Human-Like Societies with One Billion Agents [54.465233996410156]
Light Society is an agent-based simulation framework.<n>It formalizes social processes as structured transitions of agent and environment states.<n>It supports efficient simulation of societies with over one billion agents.
arXiv Detail & Related papers (2025-06-07T09:14:12Z)
Empowering Economic Simulation for Massively Multiplayer Online Games through Generative Agent-Based Modeling [53.26311872828166]
We take a preliminary step in introducing a novel approach using Large Language Models (LLMs) in MMO economy simulation.<n>We design LLM-driven agents with human-like decision-making and adaptability.<n>These agents are equipped with the abilities of role-playing, perception, memory, and reasoning, addressing the aforementioned challenges effectively.
arXiv Detail & Related papers (2025-06-05T07:21:13Z)
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Tasks [6.355245936740126]
Large language models (LLMs) are increasingly used to simulate or automate human behavior in sequential decision-making tasks.<n>We focus on the exploration-exploitation (E&E) tradeoff, a fundamental aspect of dynamic decision-making under uncertainty.<n>We find that reasoning shifts LLMs toward more human-like behavior, characterized by a mix of random and directed exploration.
arXiv Detail & Related papers (2025-05-15T02:09:18Z)
Prompting is Not All You Need! Evaluating LLM Agent Simulation Methodologies with Real-World Online Customer Behavior Data [62.61900377170456]
We focus on evaluating LLM's objective accuracy'' rather than the subjective believability'' in simulating human behavior.<n>We present the first comprehensive evaluation of state-of-the-art LLMs on the task of web shopping action generation.
arXiv Detail & Related papers (2025-03-26T17:33:27Z)
From ChatGPT to DeepSeek: Can LLMs Simulate Humanity? [32.93460040317926]
Large Language Models (LLMs) have become a promising method for exploring complex human social behaviors. Recent studies highlight discrepancies between simulated and real-world interactions.
arXiv Detail & Related papers (2025-02-25T13:54:47Z)
Persuasion with Large Language Models: a Survey [49.86930318312291]
Large Language Models (LLMs) have created new disruptive possibilities for persuasive communication. In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM Systems have already achieved human-level or even super-human persuasiveness. Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks.
arXiv Detail & Related papers (2024-11-11T10:05:52Z)
Causality for Large Language Models [37.10970529459278]
Large language models (LLMs) with billions or trillions of parameters are trained on vast datasets, achieving unprecedented success across a series of language tasks. Recent research highlights that LLMs function as causal parrots, capable of reciting causal knowledge without truly understanding or applying it. This survey aims to explore how causality can enhance LLMs at every stage of their lifecycle.
arXiv Detail & Related papers (2024-10-20T07:22:23Z)
Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making. Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations. Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development. We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice [4.029252551781513]
We propose a novel way to enhance the utility of Large Language Models as cognitive models. We show that an LLM pretrained on an ecologically valid arithmetic dataset, predicts human behavior better than many traditional cognitive models.
arXiv Detail & Related papers (2024-05-29T17:37:14Z)
Explaining Large Language Models Decisions Using Shapley Values [1.223779595809275]
Large language models (LLMs) have opened up exciting possibilities for simulating human behavior and cognitive processes. However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain. This paper presents a novel approach based on Shapley values to interpret LLM behavior and quantify the relative contribution of each prompt component to the model's output.
arXiv Detail & Related papers (2024-03-29T22:49:43Z)
LLM-driven Imitation of Subrational Behavior : Illusion or Reality? [3.2365468114603937]
Existing work highlights the ability of Large Language Models to address complex reasoning tasks and mimic human communication. We propose to investigate the use of LLMs to generate synthetic human demonstrations, which are then used to learn subrational agent policies. We experimentally evaluate the ability of our framework to model sub-rationality through four simple scenarios.
arXiv Detail & Related papers (2024-02-13T19:46:39Z)
How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation [46.42384207122049]
We design SimulateBench to evaluate the believability of large language models (LLMs) when simulating human behaviors. Based on SimulateBench, we evaluate the performances of 10 widely used LLMs when simulating characters.
arXiv Detail & Related papers (2023-12-28T16:51:11Z)
Simulating Opinion Dynamics with Networks of LLM-based Agents [7.697132934635411]
We propose a new approach to simulating opinion dynamics based on populations of Large Language Models (LLMs) Our findings reveal a strong inherent bias in LLM agents towards producing accurate information, leading simulated agents to consensus in line with scientific reality. After inducing confirmation bias through prompt engineering, however, we observed opinion fragmentation in line with existing agent-based modeling and opinion dynamics research.
arXiv Detail & Related papers (2023-11-16T07:01:48Z)
Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all. We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires. Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations [61.9212914612875]
We present a framework to characterize LLM simulations using four dimensions: Context, Model, Persona, and Topic. We use this framework to measure open-ended LLM simulations' susceptibility to caricature, defined via two criteria: individuation and exaggeration. We find that for GPT-4, simulations of certain demographics (political and marginalized groups) and topics (general, uncontroversial) are highly susceptible to caricature.
arXiv Detail & Related papers (2023-10-17T18:00:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.