Related papers: The High Cost of Incivility: Quantifying Interaction Inefficiency via Multi-Agent Monte Carlo Simulations

URL: http://arxiv.org/abs/2512.08345v1
Date: Tue, 09 Dec 2025 08:17:35 GMT
Title: The High Cost of Incivility: Quantifying Interaction Inefficiency via Multi-Agent Monte Carlo Simulations
Authors: Benedikt Mangold
Abstract summary: This study leverages Large Language Model (LLM) based Multi-Agent Systems to simulate 1-on-1 adversarial debates. We employ a Monte Carlo method to simulate hundreds of discussions, measuring the convergence time. We propose that this "latency of toxicity" serves as a proxy for financial damage in corporate and academic settings.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Workplace toxicity is widely recognized as detrimental to organizational culture, yet quantifying its direct impact on operational efficiency remains methodologically challenging due to the ethical and practical difficulties of reproducing conflict in human subjects. This study leverages Large Language Model (LLM) based Multi-Agent Systems to simulate 1-on-1 adversarial debates, creating a controlled "sociological sandbox". We employ a Monte Carlo method to simulate hundreds of discussions, measuring the convergence time (defined as the number of arguments required to reach a conclusion) between a baseline control group and treatment groups involving agents with "toxic" system prompts. Our results demonstrate a statistically significant increase of approximately 25% in the duration of conversations involving toxic participants. We propose that this "latency of toxicity" serves as a proxy for financial damage in corporate and academic settings. Furthermore, we demonstrate that agent-based modeling provides a reproducible, ethical alternative to human-subject research for measuring the mechanics of social friction.
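The comparison described in the abstract (many simulated discussions, convergence time measured in arguments, control versus "toxic" group) can be illustrated with a toy Monte Carlo sketch. This is not the authors' code and uses no LLM: each debate is replaced by a hypothetical stand-in in which every argument ends the discussion with a fixed probability. The function names, the probabilities 0.25 and 0.20 (picked only so that the toy model's expected increase is 25%, not taken from the paper), the run counts, and the permutation test are all assumptions made for illustration.

```python
import random
import statistics


def simulate_debate(p_agree, rng, max_turns=200):
    """Toy stand-in for one debate: count arguments until a conclusion.

    Each argument ends the debate with probability p_agree (an assumed,
    hypothetical parameter). The count is capped at max_turns.
    """
    for turn in range(1, max_turns + 1):
        if rng.random() < p_agree:
            return turn
    return max_turns


def run_group(p_agree, n_runs, rng):
    """Monte Carlo: repeat the toy debate n_runs times."""
    return [simulate_debate(p_agree, rng) for _ in range(n_runs)]


def permutation_p_value(a, b, rng, n_perm=2000):
    """One-sided permutation test for mean(b) > mean(a)."""
    observed = statistics.fmean(b) - statistics.fmean(a)
    pooled = a + b  # new list, so a and b are left untouched
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[len(a):])
                - statistics.fmean(pooled[:len(a)]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


rng = random.Random(0)  # fixed seed so the run is reproducible
control = run_group(0.25, 500, rng)  # baseline agents (assumed value)
toxic = run_group(0.20, 500, rng)    # "toxic" agents (assumed value)

relative_increase = statistics.fmean(toxic) / statistics.fmean(control) - 1
p_value = permutation_p_value(control, toxic, rng)

print(f"mean control turns: {statistics.fmean(control):.2f}")
print(f"mean toxic turns:   {statistics.fmean(toxic):.2f}")
print(f"relative increase:  {relative_increase:.1%}")
print(f"permutation p:      {p_value:.4f}")
```

In a setup closer to the one described, `simulate_debate` would presumably be replaced by an actual multi-agent LLM exchange; the statistical comparison of the two groups would stay the same in structure.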

Related papers

DEBATE: A Large-Scale Benchmark for Role-Playing LLM Agents in Multi-Agent, Long-Form Debates [10.609797175227644]
We introduce DEBATE, the first large-scale empirical benchmark to evaluate the authenticity of the interaction between multi-agent role-playing LLMs. We systematically evaluate and identify critical discrepancies between simulated and authentic group dynamics.
arXiv Detail & Related papers (2025-10-29T02:21:10Z)
Doing Things with Words: Rethinking Theory of Mind Simulation in Large Language Models [48.815314312823006]
This study explores whether the Generative Agent-Based Model (GABM) Concordia can effectively model Theory of Mind (ToM) within simulated real-world environments. We assess whether this framework successfully simulates ToM abilities and whether GPT-4 can perform tasks by making genuine inferences from social context.
arXiv Detail & Related papers (2025-10-15T10:48:31Z)
The Social Laboratory: A Psychometric Framework for Multi-Agent LLM Evaluation [0.16921396880325779]
We introduce a novel evaluation framework that uses multi-agent debate as a controlled "social laboratory". We show that assigned personas induce stable, measurable psychometric profiles, particularly in cognitive effort. This work provides a blueprint for a new class of dynamic, psychometrically grounded evaluation protocols.
arXiv Detail & Related papers (2025-10-01T07:10:28Z)
The Hunger Game Debate: On the Emergence of Over-Competition in Multi-Agent Systems [90.96738882568224]
This paper investigates over-competition in multi-agent debate, where agents under extreme pressure exhibit unreliable, harmful behaviors. To study this phenomenon, we propose HATE, a novel experimental framework that simulates debates under a zero-sum competition arena.
arXiv Detail & Related papers (2025-09-30T11:44:47Z)
Peacemaker or Troublemaker: How Sycophancy Shapes Multi-Agent Debate [30.66779902590191]
Large language models (LLMs) often display sycophancy, a tendency toward excessive agreeability. LLMs' inherent sycophancy can collapse debates into premature consensus.
arXiv Detail & Related papers (2025-09-27T02:27:13Z)
Disagreements in Reasoning: How a Model's Thinking Process Dictates Persuasion in Multi-Agent Systems [49.69773210844221]
This paper challenges the prevailing hypothesis that persuasive efficacy is primarily a function of model scale. Through a series of multi-agent persuasion experiments, we uncover a fundamental trade-off we term the Persuasion Duality. Our findings reveal that the reasoning process in LRMs exhibits significantly greater resistance to persuasion, maintaining their initial beliefs more robustly.
arXiv Detail & Related papers (2025-09-25T12:03:10Z)
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness [50.29739337771454]
Multi-agent debate (MAD) approaches offer improved reasoning, robustness, and diverse perspectives over monolithic models. This paper conceptualizes MAD as a test-time computational scaling technique, distinguished by collaborative refinement and diverse exploration capabilities. We conduct a comprehensive empirical investigation comparing MAD with strong self-agent test-time scaling baselines on mathematical reasoning and safety-related tasks.
arXiv Detail & Related papers (2025-05-29T01:02:55Z)
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society [32.849311155921264]
We propose AgentSociety, a large-scale social simulator that integrates a realistic societal environment. Based on the proposed simulator, we generate social lives for over 10k agents, simulating their 5 million interactions. We focus on four key social issues: polarization, the spread of inflammatory messages, the effects of universal basic income policies, and the impact of external shocks such as hurricanes.
arXiv Detail & Related papers (2025-02-12T15:27:07Z)
GenSim: A General Social Simulation Platform with Large Language Model based Agents [111.00666003559324]
We propose a novel large language model (LLM)-based simulation platform called GenSim. Our platform supports one hundred thousand agents to better simulate large-scale populations in real-world contexts. To our knowledge, GenSim represents an initial step toward a general, large-scale, and correctable social simulation platform.
arXiv Detail & Related papers (2024-10-06T05:02:23Z)
PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, a framework for better data construction and model tuning. For insufficient data usage, we incorporate strategies such as Chain-of-Thought prompting and anti-induction. For rigid behavior patterns, we design the tuning process and introduce automated DPO to enhance the specificity and dynamism of the models' personalities.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate [24.92465108034783]
Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually. The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models as agents. We evaluate the behavior of a network of models collaborating through debate under the influence of an adversary.
arXiv Detail & Related papers (2024-06-20T20:09:37Z)
AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios. However, their capability in handling complex, multi-character social interactions has yet to be fully explored. We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z)
Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System [16.830182915504555]
Multi-agent debate system (MAD) imitates the process of human discussion in pursuit of truth. It is challenging to make various agents perform right and highly consistent cognition due to their limited and different knowledge backgrounds. We propose a novel Multi-Agent Debate with Knowledge-Enhanced framework to promote the system to find the solution.
arXiv Detail & Related papers (2023-12-08T06:22:12Z)
CausalDialogue: Modeling Utterance-level Causality in Conversations [83.03604651485327]
We have compiled and expanded upon a new dataset called CausalDialogue through crowd-sourcing. This dataset includes multiple cause-effect pairs within a directed acyclic graph (DAG) structure. We propose a causality-enhanced method called Exponential Average Treatment Effect (ExMATE) to enhance the impact of causality at the utterance level in training neural conversation models.
arXiv Detail & Related papers (2022-12-20T18:31:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.