Systematic Biases in LLM Simulations of Debates
- URL: http://arxiv.org/abs/2402.04049v1
- Date: Tue, 6 Feb 2024 14:51:55 GMT
- Title: Systematic Biases in LLM Simulations of Debates
- Authors: Amir Taubenfeld, Yaniv Dover, Roi Reichart, Ariel Goldstein
- Abstract summary: This study highlights the limitations of Large Language Models (LLMs) in simulating human interactions.
Our findings indicate a tendency for LLM agents to conform to the model's inherent social biases despite being directed to debate from certain political perspectives.
This tendency results in behavioral patterns that seem to deviate from well-established social dynamics among humans.
- Score: 14.12892960275563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in natural language processing, especially the emergence
of Large Language Models (LLMs), have opened exciting possibilities for
constructing computational simulations designed to replicate human behavior
accurately. However, LLMs are complex statistical learners without
straightforward deductive rules, making them prone to unexpected behaviors. In
this study, we highlight the limitations of LLMs in simulating human
interactions, particularly focusing on LLMs' ability to simulate political
debates. Our findings indicate a tendency for LLM agents to conform to the
model's inherent social biases despite being directed to debate from certain
political perspectives. This tendency results in behavioral patterns that seem
to deviate from well-established social dynamics among humans. We reinforce
these observations using an automatic self-fine-tuning method, which enables us
to manipulate the biases within the LLM and demonstrate that agents
subsequently align with the altered biases. These results underscore the need
for further research to develop methods that help agents overcome these biases,
a critical step toward creating more realistic simulations.
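As a concrete illustration of the simulation setup described in the abstract, below is a minimal Python sketch of a two-agent debate in which each agent is assigned a political persona through its system prompt and the shared transcript is replayed to the model on every turn. The `chat_completion` callable, the persona wording, the topic, and the turn count are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a persona-driven two-agent debate loop (illustrative only).
# `chat_completion` is a hypothetical stand-in for any chat-style LLM API;
# the personas, topic, and number of turns are assumptions, not the paper's setup.

from typing import Callable, Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}


def simulate_debate(
    chat_completion: Callable[[List[Message]], str],
    topic: str,
    personas: Dict[str, str],
    num_turns: int = 6,
) -> List[Message]:
    """Alternate turns between two persona-prompted agents on a given topic."""
    names = list(personas)
    transcript: List[Message] = []

    for turn in range(num_turns):
        speaker = names[turn % len(names)]
        # Each agent sees its own persona plus the topic as the system message,
        # with the other agent's turns presented as "user" messages.
        messages: List[Message] = [
            {"role": "system", "content": f"{personas[speaker]}\nDebate topic: {topic}."}
        ]
        for msg in transcript:
            role = "assistant" if msg["role"] == speaker else "user"
            messages.append({"role": role, "content": msg["content"]})
        if not transcript:
            messages.append({"role": "user", "content": "Please give your opening statement."})

        reply = chat_completion(messages)
        transcript.append({"role": speaker, "content": reply})

    return transcript


if __name__ == "__main__":
    # Example personas instructing agents to argue from fixed political perspectives.
    personas = {
        "agent_a": "You are debating as a staunch Republican. Argue from that perspective.",
        "agent_b": "You are debating as a staunch Democrat. Argue from that perspective.",
    }
    # Dummy backend so the sketch runs without an API key; swap in a real LLM call.
    echo = lambda messages: f"(placeholder reply after {len(messages)} messages)"
    for msg in simulate_debate(echo, "gun control", personas):
        print(msg["role"], ":", msg["content"])
```

Tracking how each agent's expressed attitude drifts across such a transcript, and repeating the run after fine-tuning the underlying model on its own generations, mirrors the kind of before/after comparison the abstract alludes to with its self-fine-tuning experiment.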
Related papers
- PersLLM: A Personified Training Approach for Large Language Models [63.75008885222351]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We show that prompting-based rationales align better with human-annotated rationales than attribution-based rationales.
We additionally find that the faithfulness limitations of prompting-based methods, which are identified in previous work, may be linked to their collapsed predictions.
arXiv Detail & Related papers (2024-06-28T20:06:30Z)
- Wait, It's All Token Noise? Always Has Been: Interpreting LLM Behavior Using Shapley Value [1.223779595809275]
Large language models (LLMs) have opened up exciting possibilities for simulating human behavior and cognitive processes.
However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain.
This paper presents a novel approach based on Shapley values to interpret LLM behavior and quantify the relative contribution of each prompt component to the model's output.
arXiv Detail & Related papers (2024-03-29T22:49:43Z)
- Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated by large language models (LLMs).
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
- LLM-driven Imitation of Subrational Behavior: Illusion or Reality? [3.2365468114603937]
Existing work highlights the ability of Large Language Models to address complex reasoning tasks and mimic human communication.
We propose to investigate the use of LLMs to generate synthetic human demonstrations, which are then used to learn subrational agent policies.
We experimentally evaluate the ability of our framework to model sub-rationality through four simple scenarios.
arXiv Detail & Related papers (2024-02-13T19:46:39Z)
- How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation [46.42384207122049]
We design SimulateBench to evaluate the believability of large language models (LLMs) when simulating human behaviors.
Based on SimulateBench, we evaluate the performances of 10 widely used LLMs when simulating characters.
arXiv Detail & Related papers (2023-12-28T16:51:11Z)
- Simulating Opinion Dynamics with Networks of LLM-based Agents [7.697132934635411]
We propose a new approach to simulating opinion dynamics based on populations of Large Language Models (LLMs).
Our findings reveal a strong inherent bias in LLM agents towards producing accurate information, leading simulated agents to consensus in line with scientific reality.
After inducing confirmation bias through prompt engineering, however, we observed opinion fragmentation in line with existing agent-based modeling and opinion dynamics research.
arXiv Detail & Related papers (2023-11-16T07:01:48Z)
- CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations [61.9212914612875]
We present a framework to characterize LLM simulations using four dimensions: Context, Model, Persona, and Topic.
We use this framework to measure open-ended LLM simulations' susceptibility to caricature, defined via two criteria: individuation and exaggeration.
We find that for GPT-4, simulations of certain demographics (political and marginalized groups) and topics (general, uncontroversial) are highly susceptible to caricature.
arXiv Detail & Related papers (2023-10-17T18:00:25Z)
- Training Socially Aligned Language Models on Simulated Social Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z)