LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions
- URL: http://arxiv.org/abs/2508.18321v2
- Date: Thu, 28 Aug 2025 12:18:04 GMT
- Title: LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions
- Authors: Maojia Song, Tej Deep Pala, Weisheng Jin, Amir Zadeh, Chuan Li, Dorien Herremans, Soujanya Poria
- Abstract summary: Large language models (LLMs) are increasingly deployed in multi-agent systems as components of collaborative intelligence. We examine how LLMs form trust from previous impressions, resist misinformation, and integrate peer input during interaction. We present KAIROS, a benchmark simulating quiz contests with peer agents of varying reliability.
- Score: 35.71511502901056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are increasingly deployed in multi-agent systems (MAS) as components of collaborative intelligence, where peer interactions dynamically shape individual decision-making. While prior work has focused on conformity bias, we extend the analysis to examine how LLMs form trust from previous impressions, resist misinformation, and integrate peer input during interaction, key factors for achieving collective intelligence under complex social dynamics. We present KAIROS, a benchmark simulating quiz contests with peer agents of varying reliability, offering fine-grained control over conditions such as expert-novice roles, noisy crowds, and adversarial peers. LLMs receive both historical interactions and current peer responses, allowing systematic investigation into how trust, peer action, and self-confidence influence decisions. As mitigation strategies, we evaluate prompting, supervised fine-tuning, and reinforcement learning with Group Relative Policy Optimisation (GRPO) across multiple models. Our results reveal that GRPO with multi-agent context, combined with outcome-based rewards and unconstrained reasoning, achieves the best overall performance, but also reduces robustness to social influence compared to the base models. The code and datasets are available at: https://github.com/declare-lab/KAIROS.
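The abstract names GRPO but does not detail it. As background, GRPO's core idea is to replace a learned value baseline with a group-relative one: sample several responses per prompt, score each with an outcome reward, and normalize within the group. A minimal sketch of that advantage computation, assuming binary outcome rewards (1.0 for a correct quiz answer, 0.0 otherwise); function names here are illustrative, not from the paper:

```python
from statistics import mean, stdev

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: normalize each sampled response's
    outcome reward against the group's mean and standard deviation,
    so above-average answers get positive advantage and below-average
    answers get negative advantage."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# A group of 4 sampled answers to one quiz question: two correct, two wrong.
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

The policy gradient then weights each response's token log-probabilities by its advantage, which is how outcome-based rewards reach the model without a separate critic.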
Related papers
- Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia [100.74015791021044]
Large Language Model (LLM) agents have demonstrated impressive capabilities for social interaction. Existing evaluation methods fail to measure how well these capabilities generalize to novel social situations. We present empirical results from the NeurIPS 2024 Concordia Contest, where agents were evaluated on their ability to achieve mutual gains.
arXiv Detail & Related papers (2025-12-03T00:11:05Z) - The Social Laboratory: A Psychometric Framework for Multi-Agent LLM Evaluation [0.16921396880325779]
We introduce a novel evaluation framework that uses multi-agent debate as a controlled "social laboratory". We show that assigned personas induce stable, measurable psychometric profiles, particularly in cognitive effort. This work provides a blueprint for a new class of dynamic, psychometrically grounded evaluation protocols.
arXiv Detail & Related papers (2025-10-01T07:10:28Z) - Interactive Learning for LLM Reasoning [31.453846641515472]
We investigate whether multi-agent interaction can enhance Large Language Models' independent problem-solving ability. We introduce ILR, a novel co-learning framework that integrates Dynamic Interaction and Perception. ILR consistently outperforms single-agent learning, yielding an improvement of up to 5% over the strongest baseline.
arXiv Detail & Related papers (2025-09-30T14:21:31Z) - The Hunger Game Debate: On the Emergence of Over-Competition in Multi-Agent Systems [90.96738882568224]
This paper investigates over-competition in multi-agent debate, where agents under extreme pressure exhibit unreliable, harmful behaviors. To study this phenomenon, we propose HATE, a novel experimental framework that simulates debates in a zero-sum competition arena.
arXiv Detail & Related papers (2025-09-30T11:44:47Z) - Understanding Action Effects through Instrumental Empowerment in Multi-Agent Reinforcement Learning [39.74025439412935]
This work investigates whether meaningful insights into agent behaviors can be extracted solely by analyzing the policy distribution. Inspired by the phenomenon that intelligent agents tend to pursue convergent instrumental values, we introduce Intended Cooperation Values (ICVs). ICVs measure an agent's action effect on its teammates' policies by assessing their decision (un)certainty and preference alignment.
arXiv Detail & Related papers (2025-08-21T15:35:59Z) - Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games [87.5673042805229]
How large language models balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. We adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas. Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation.
arXiv Detail & Related papers (2025-06-29T15:02:47Z) - An Adversary-Resistant Multi-Agent LLM System via Credibility Scoring [8.779871128906787]
We introduce a general and adversary-resistant multi-agent LLM framework based on credibility scoring. Our system associates a credibility score with each agent and uses these scores when aggregating the team's outputs.
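The summary above does not specify the aggregation rule, but the general pattern is a credibility-weighted vote with scores updated from observed correctness. A minimal sketch under those assumptions (function names and the update rule are illustrative, not taken from the paper):

```python
from collections import defaultdict

def aggregate(answers, credibility):
    """Credibility-weighted vote: each agent's answer is weighted by its
    credibility score, and the answer with the highest total weight wins.
    A single high-credibility agent can outvote several low-credibility
    (e.g. adversarial) peers."""
    votes = defaultdict(float)
    for agent, answer in answers.items():
        votes[answer] += credibility.get(agent, 0.0)
    return max(votes, key=votes.get)

def update_credibility(credibility, agent, was_correct, lr=0.1):
    """Move an agent's score toward 1.0 after a correct output and
    toward 0.0 after an incorrect one (simple exponential update)."""
    target = 1.0 if was_correct else 0.0
    credibility[agent] += lr * (target - credibility[agent])
    return credibility
```

With scores `{"expert": 0.9, "novice1": 0.3, "novice2": 0.3}`, the expert's lone answer outweighs two agreeing novices, which is the adversary-resistance property the framework targets.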
arXiv Detail & Related papers (2025-05-30T05:57:37Z) - Herd Behavior: Investigating Peer Influence in LLM-based Multi-Agent Systems [7.140644659869317]
We investigate the dynamics of peer influence in multi-agent systems based on Large Language Models (LLMs). We show that the gap between self-confidence and perceived confidence in peers significantly impacts an agent's likelihood to conform. We find that the format in which peer information is presented plays a critical role in modulating the strength of herd behavior.
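The confidence-gap finding above can be captured by a simple decision model. The following is our toy illustration, not the paper's fitted model: the probability of switching to the peers' answer grows logistically with the gap between perceived peer confidence and self-confidence.

```python
import math

def p_conform(self_conf, perceived_peer_conf, k=5.0):
    """Toy logistic model of conformity: when peers appear much more
    confident than the agent itself, the switch probability approaches 1;
    when the agent is the more confident party, it approaches 0.
    k controls how sharply the probability responds to the gap."""
    gap = perceived_peer_conf - self_conf
    return 1.0 / (1.0 + math.exp(-k * gap))
```

A confident agent facing hesitant peers (`p_conform(0.9, 0.3)`) should conform far less often than a hesitant agent facing confident peers (`p_conform(0.3, 0.9)`).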
arXiv Detail & Related papers (2025-05-27T12:12:56Z) - Do as We Do, Not as You Think: the Conformity of Large Language Models [46.23852835759767]
This paper presents a study on conformity in LLM-driven collaborative AI systems. We focus on three aspects: the existence of conformity, the factors influencing conformity, and potential mitigation strategies. Our analysis delves into factors influencing conformity, including interaction time and majority size, and examines how the subject agent rationalizes its conforming behavior.
arXiv Detail & Related papers (2025-01-23T04:50:03Z) - SDPO: Segment-Level Direct Preference Optimization for Social Agents [56.970902914217156]
Social agents powered by large language models (LLMs) can simulate human social behaviors but fall short in handling complex social dialogues. We propose Segment-Level Direct Preference Optimization (SDPO), which dynamically selects key segments within interactions to optimize multi-turn agent behavior.
arXiv Detail & Related papers (2025-01-03T14:09:46Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module that plans the agent's best response using those inferred goals.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents [101.17919953243107]
GovSim is a generative simulation platform designed to study strategic interactions and cooperative decision-making in large language models (LLMs). We find that all but the most powerful LLM agents fail to achieve a sustainable equilibrium in GovSim, with the highest survival rate below 54%. We show that agents that leverage "Universalization"-based reasoning, a theory of moral thinking, are able to achieve significantly better sustainability.
arXiv Detail & Related papers (2024-04-25T15:59:16Z) - Multi-Agent Interactions Modeling with Correlated Policies [53.38338964628494]
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework.
We develop a Decentralized Adversarial Imitation Learning algorithm with Correlated policies (CoDAIL).
Various experiments demonstrate that CoDAIL better reproduces complex interactions close to those of the demonstrators.
arXiv Detail & Related papers (2020-01-04T17:31:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.