Your AI Bosses Are Still Prejudiced: The Emergence of Stereotypes in LLM-Based Multi-Agent Systems
- URL: http://arxiv.org/abs/2508.19919v1
- Date: Wed, 27 Aug 2025 14:25:43 GMT
- Title: Your AI Bosses Are Still Prejudiced: The Emergence of Stereotypes in LLM-Based Multi-Agent Systems
- Authors: Jingyu Guo, Yingying Xu
- Abstract summary: We investigate the emergence and evolution of stereotypes in AI-based multi-agent systems. Our findings reveal that AI agents develop stereotype-driven biases in their interactions despite beginning without predefined biases. These systems exhibit group effects analogous to human social behavior, including halo effects, confirmation bias, and role congruity.
- Score: 3.35957402502816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While stereotypes are well-documented in human social interactions, AI systems are often presumed to be less susceptible to such biases. Previous studies have focused on biases inherited from training data, but whether stereotypes can emerge spontaneously in AI agent interactions merits further exploration. Through a novel experimental framework simulating workplace interactions with neutral initial conditions, we investigate the emergence and evolution of stereotypes in LLM-based multi-agent systems. Our findings reveal that (1) LLM-based AI agents develop stereotype-driven biases in their interactions despite beginning without predefined biases; (2) stereotype effects intensify with increased interaction rounds and decision-making power, particularly after introducing hierarchical structures; (3) these systems exhibit group effects analogous to human social behavior, including halo effects, confirmation bias, and role congruity; and (4) these stereotype patterns manifest consistently across different LLM architectures. Through comprehensive quantitative analysis, these findings suggest that stereotype formation in AI systems may arise as an emergent property of multi-agent interactions, rather than merely from training data biases. Our work underscores the need for future research to explore the underlying mechanisms of this phenomenon and develop strategies to mitigate its ethical impacts.
Related papers
- Conformity and Social Impact on AI Agents [42.04722694386303]
This study examines conformity, the tendency to align with group opinions under social pressure, in large multimodal language models functioning as AI agents. Our experiments reveal that AI agents exhibit a systematic conformity bias, aligned with Social Impact Theory, showing sensitivity to group size, unanimity, task difficulty, and source characteristics. These findings reveal fundamental security vulnerabilities in AI agent decision-making that could enable malicious manipulation, misinformation campaigns, and bias propagation in multi-agent systems.
arXiv Detail & Related papers (2026-01-08T21:16:28Z)
- A Survey on Agentic Multimodal Large Language Models [84.18778056010629]
We present a comprehensive survey on Agentic Multimodal Large Language Models (Agentic MLLMs). We explore the emerging paradigm of agentic MLLMs, delineating their conceptual foundations and distinguishing characteristics from conventional MLLM-based agents. To further accelerate research in this area for the community, we compile open-source training frameworks, training and evaluation datasets for developing agentic MLLMs.
arXiv Detail & Related papers (2025-10-13T04:07:01Z)
- The Social Cost of Intelligence: Emergence, Propagation, and Amplification of Stereotypical Bias in Multi-Agent Systems [20.359327253718718]
Bias in large language models (LLMs) remains a persistent challenge, manifesting in stereotyping and unfair treatment across social groups. We study how internal specialization, underlying LLMs and inter-agent communication protocols influence bias robustness, propagation, and amplification. Our findings highlight critical factors shaping fairness and resilience in multi-agent LLM systems.
arXiv Detail & Related papers (2025-10-13T02:56:42Z)
- Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment [49.81946749379338]
This work seeks to analyze the capacity of Transformers-based systems to learn demographic biases present in the data. We propose a privacy-enhancing framework to reduce gender information from the learning pipeline as a way to mitigate biased behaviors in the final tools.
arXiv Detail & Related papers (2025-06-13T15:29:43Z)
- AI Agent Behavioral Science [29.262537008412412]
AI Agent Behavioral Science focuses on the systematic observation of behavior, design of interventions to test hypotheses, and theory-guided interpretation of how AI agents act, adapt, and interact over time. We systematize a growing body of research across individual agent, multi-agent, and human-agent interaction settings, and demonstrate how this perspective informs responsible AI by treating fairness, safety, interpretability, accountability, and privacy as behavioral properties.
arXiv Detail & Related papers (2025-06-04T08:12:32Z)
- An Empirical Study of Group Conformity in Multi-Agent Systems [0.26999000177990923]
This study explores how Large Language Model (LLM) agents shape public opinion through debates on five contentious topics. By simulating over 2,500 debates, we analyze how initially neutral agents, assigned a centrist disposition, adopt specific stances over time.
arXiv Detail & Related papers (2025-06-02T05:22:29Z)
- Herd Behavior: Investigating Peer Influence in LLM-based Multi-Agent Systems [7.140644659869317]
We investigate the dynamics of peer influence in multi-agent systems based on Large Language Models (LLMs). We show that the gap between self-confidence and perceived confidence in peers significantly impacts an agent's likelihood to conform. We find that the format in which peer information is presented plays a critical role in modulating the strength of herd behavior.
arXiv Detail & Related papers (2025-05-27T12:12:56Z)
- Assessing Collective Reasoning in Multi-Agent LLMs via Hidden Profile Tasks [5.120446836495469]
We introduce the Hidden Profile paradigm from social psychology as a diagnostic testbed for multi-agent LLM systems. By distributing critical information asymmetrically across agents, the paradigm reveals how inter-agent dynamics support or hinder collective reasoning. We find that while cooperative agents are prone to over-coordination in collective settings, increased contradiction impairs group convergence.
arXiv Detail & Related papers (2025-05-15T19:22:54Z)
- Emergence of human-like polarization among large language model agents [79.96817421756668]
We simulate a networked system involving thousands of large language model agents and find that their social interactions result in human-like polarization. Similarities between humans and LLM agents raise concerns about their capacity to amplify societal polarization, but also hold the potential to serve as a valuable testbed for identifying plausible strategies to mitigate polarization and its consequences.
arXiv Detail & Related papers (2025-01-09T11:45:05Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, a framework for better data construction and model tuning. For insufficient data usage, we incorporate strategies such as Chain-of-Thought prompting and anti-induction. For rigid behavior patterns, we design the tuning process and introduce automated DPO to enhance the specificity and dynamism of the models' personalities.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z)
- Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective [69.25513235556635]
Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, which may make inconsistent or unexpected predictions with humans.
Some paradigms have been recently developed to explore this adversarial phenomenon occurring at different stages of a machine learning system.
We propose a unified mathematical framework to cover existing attack paradigms.
arXiv Detail & Related papers (2023-02-19T02:12:21Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)