Exploring Big Five Personality and AI Capability Effects in LLM-Simulated Negotiation Dialogues
- URL: http://arxiv.org/abs/2506.15928v2
- Date: Wed, 25 Jun 2025 23:42:18 GMT
- Title: Exploring Big Five Personality and AI Capability Effects in LLM-Simulated Negotiation Dialogues
- Authors: Myke C. Cohen, Zhe Su, Hsien-Te Kao, Daniel Nguyen, Spencer Lynch, Maarten Sap, Svitlana Volkova
- Abstract summary: This paper presents an evaluation framework for agentic AI systems in mission-critical negotiation contexts. Using Sotopia as a simulation testbed, we present two experiments that systematically evaluated how personality traits and AI agent characteristics influence social negotiation outcomes.
- Score: 16.07828032939124
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents an evaluation framework for agentic AI systems in mission-critical negotiation contexts, addressing the need for AI agents that can adapt to diverse human operators and stakeholders. Using Sotopia as a simulation testbed, we present two experiments that systematically evaluated how personality traits and AI agent characteristics influence LLM-simulated social negotiation outcomes--a capability essential for a variety of applications involving cross-team coordination and civil-military interactions. Experiment 1 employs causal discovery methods to measure how personality traits impact price-bargaining negotiations, through which we found that Agreeableness and Extraversion significantly affect believability, goal achievement, and knowledge acquisition outcomes. Sociocognitive lexical measures extracted from team communications detected fine-grained differences in agents' empathic communication, moral foundations, and opinion patterns, providing actionable insights for agentic AI systems that must operate reliably in high-stakes operational scenarios. Experiment 2 evaluates human-AI job negotiations by manipulating both simulated human personality and AI system characteristics, specifically transparency, competence, and adaptability, demonstrating how AI agent trustworthiness impacts mission effectiveness. These findings establish a repeatable evaluation methodology for experimenting with AI agent reliability across diverse operator personalities and human-agent team dynamics, directly supporting operational requirements for reliable AI systems. Our work advances the evaluation of agentic AI workflows by moving beyond standard performance metrics to incorporate social dynamics essential for mission success in complex operations.
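As a concrete illustration of the Experiment 1 analysis described above, the sketch below applies a constraint-based causal discovery algorithm (the PC algorithm from the open-source causal-learn package) to hypothetical per-episode scores. The abstract does not name a specific algorithm, package, column set, or data shape; every identifier below is an assumption made only for illustration.

```python
# Illustrative sketch only: the paper reports using "causal discovery methods"
# but does not specify the algorithm or tooling. Here we run the PC algorithm
# from the causal-learn package on synthetic placeholder data.
import numpy as np
from causallearn.search.ConstraintBased.PC import pc

# Hypothetical columns: Big Five trait scores for a simulated negotiator,
# followed by Sotopia-style episode outcomes.
columns = [
    "openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism",
    "believability", "goal_achievement", "knowledge_acquisition",
]

rng = np.random.default_rng(0)
data = rng.normal(size=(500, len(columns)))  # placeholder for real per-episode scores

# Run PC with the default Fisher-z conditional independence test.
result = pc(data, alpha=0.05)

# result.G holds the recovered graph; directed edges from trait columns into
# outcome columns would indicate candidate effects such as
# Agreeableness -> goal_achievement.
print(result.G)
```

A corresponding sketch for Experiment 2 would add columns for the manipulated AI characteristics (transparency, competence, adaptability) and the job-negotiation outcomes used to assess trustworthiness effects.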
Related papers
- Robot-Gated Interactive Imitation Learning with Adaptive Intervention Mechanism [48.41735416075536]
Interactive Imitation Learning (IIL) allows agents to acquire desired behaviors through human interventions. We propose the Adaptive Intervention Mechanism (AIM), a novel robot-gated IIL algorithm that learns an adaptive criterion for requesting human demonstrations.
arXiv Detail & Related papers (2025-06-10T18:43:26Z) - (AI peers) are people learning from the same standpoint: Perception of AI characters in a Collaborative Science Investigation [0.0]
Scenario-based assessment (SBA) introduces simulated agents to provide an authentic social-interactional context. Recent advancements in multimodal AI, such as text-to-video technology, allow these agents to be enhanced into AI-generated characters. This study investigates how learners perceive AI characters taking the roles of mentor and teammates in an SBA mirroring the context of a collaborative science investigation.
arXiv Detail & Related papers (2025-06-06T15:29:11Z) - When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration [79.69935257008467]
We introduce Knowledge Integration and Transfer Evaluation (KITE), a conceptual and experimental framework for evaluating human-AI knowledge transfer capabilities. We conduct the first large-scale human study (N=118) explicitly designed to measure it. In our two-phase setup, humans first ideate with an AI on problem-solving strategies, then independently implement solutions, isolating the influence of model explanations on human understanding.
arXiv Detail & Related papers (2025-06-05T20:48:16Z) - When Trust Collides: Decoding Human-LLM Cooperation Dynamics through the Prisoner's Dilemma [10.143277649817096]
This study investigates human cooperative attitudes and behaviors toward large language model (LLM) agents. Results revealed significant effects of declared agent identity on most cooperation-related behaviors. These findings contribute to our understanding of human adaptation in competitive cooperation with autonomous agents.
arXiv Detail & Related papers (2025-03-10T13:37:36Z) - Human-AI Collaboration: Trade-offs Between Performance and Preferences [5.172575113585139]
We show that agents who are more considerate of human actions are preferred over purely performance-maximizing agents. We find evidence for inequality-aversion effects being a driver of human choices, suggesting that people prefer collaborative agents which allow them to meaningfully contribute to the team.
arXiv Detail & Related papers (2025-02-28T23:50:14Z) - Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration [51.452664740963066]
Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
arXiv Detail & Related papers (2024-12-20T09:21:15Z) - Persona Inconstancy in Multi-Agent LLM Collaboration: Conformity, Confabulation, and Impersonation [16.82101507069166]
Multi-agent AI systems can be used for simulating collective decision-making in scientific and practical applications.
We examine AI agent ensembles engaged in cross-national collaboration and debate by analyzing their private responses and chat transcripts.
Our findings suggest that multi-agent discussions can support collective AI decisions that more often reflect diverse perspectives.
arXiv Detail & Related papers (2024-05-06T21:20:35Z) - AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z) - ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate [57.71597869337909]
We build a multi-agent referee team called ChatEval to autonomously discuss and evaluate the quality of generated responses from different models.
Our analysis shows that ChatEval transcends mere textual scoring, offering a human-mimicking evaluation process for reliable assessments.
arXiv Detail & Related papers (2023-08-14T15:13:04Z) - Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration [116.28433607265573]
We introduce Watch-And-Help (WAH), a challenge for testing social intelligence in AI agents.
In WAH, an AI agent needs to help a human-like agent perform a complex household task efficiently.
We build VirtualHome-Social, a multi-agent household environment, and provide a benchmark including both planning and learning based baselines.
arXiv Detail & Related papers (2020-10-19T21:48:31Z)