Overcoming the Machine Penalty with Imperfectly Fair AI Agents
- URL: http://arxiv.org/abs/2410.03724v3
- Date: Wed, 28 May 2025 16:01:48 GMT
- Title: Overcoming the Machine Penalty with Imperfectly Fair AI Agents
- Authors: Zhen Wang, Ruiqi Song, Chen Shen, Shiya Yin, Zhao Song, Balaraju Battu, Lei Shi, Danyang Jia, Talal Rahwan, Shuyue Hu,
- Abstract summary: Humans tend to cooperate less with machines than with fellow humans, a phenomenon known as the machine penalty.<n>We show that AI agents powered by large language models can overcome this penalty in social dilemma games with communication.<n>Analysis reveals that fair agents, similar to human participants, occasionally break pre-game cooperation promises, but nonetheless effectively establish cooperation as a social norm.
- Score: 14.576971868730709
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite rapid technological progress, effective human-machine cooperation remains a significant challenge. Humans tend to cooperate less with machines than with fellow humans, a phenomenon known as the machine penalty. Here, we show that artificial intelligence (AI) agents powered by large language models can overcome this penalty in social dilemma games with communication. In a pre-registered experiment with 1,152 participants, we deploy AI agents exhibiting three distinct personas: selfish, cooperative, and fair. However, only fair agents elicit human cooperation at rates comparable to human-human interactions. Analysis reveals that fair agents, similar to human participants, occasionally break pre-game cooperation promises, but nonetheless effectively establish cooperation as a social norm. These results challenge the conventional wisdom of machines as altruistic assistants or rational actors. Instead, our study highlights the importance of AI agents reflecting the nuanced complexity of human social behaviors -- imperfect yet driven by deeper social cognitive processes.
Related papers
- Measurement of LLM's Philosophies of Human Nature [113.47929131143766]
We design the standardized psychological scale specifically targeting large language models (LLM)
We show that current LLMs exhibit a systemic lack of trust in humans.
We propose a mental loop learning framework, which enables LLM to continuously optimize its value system.
arXiv Detail & Related papers (2025-04-03T06:22:19Z) - Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents [11.080802144327176]
This study investigates human cooperative behavior by engaging 30 participants in repeated Prisoner's Dilemma games.
Findings show significant differences in cooperative behavior based on the agents' purported characteristics and the interaction effect of participants' genders and purported characteristics.
The study underscores the importance of understanding human biases toward AI agents and how observed behaviors can influence future human-AI cooperation dynamics.
arXiv Detail & Related papers (2025-03-10T13:37:36Z) - Emergence of human-like polarization among large language model agents [61.622596148368906]
We simulate a networked system involving thousands of large language model agents, discovering their social interactions, result in human-like polarization.
Similarities between humans and LLM agents raise concerns about their capacity to amplify societal polarization, but also hold the potential to serve as a valuable testbed for identifying plausible strategies to mitigate it.
arXiv Detail & Related papers (2025-01-09T11:45:05Z) - Aligning Generalisation Between Humans and Machines [74.120848518198]
AI technology can support humans in scientific discovery and forming decisions, but may also disrupt democracies and target individuals.<n>The responsible use of AI and its participation in human-AI teams increasingly shows the need for AI alignment.<n>A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalise.
arXiv Detail & Related papers (2024-11-23T18:36:07Z) - Let people fail! Exploring the influence of explainable virtual and robotic agents in learning-by-doing tasks [45.23431596135002]
This study compares the effects of classic vs. partner-aware explanations on human behavior and performance during a learning-by-doing task.
Results indicated that partner-aware explanations influenced participants differently based on the type of artificial agents involved.
arXiv Detail & Related papers (2024-11-15T13:22:04Z) - The Dynamics of Social Conventions in LLM populations: Spontaneous Emergence, Collective Biases and Tipping Points [0.0]
We investigate the dynamics of conventions within populations of Large Language Model (LLM) agents using simulated interactions.
We show that globally accepted social conventions can spontaneously arise from local interactions between communicating LLMs.
Minority groups of committed LLMs can drive social change by establishing new social conventions.
arXiv Detail & Related papers (2024-10-11T16:16:38Z) - Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a multimodal transcript''
arXiv Detail & Related papers (2024-09-13T18:28:12Z) - Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task [56.92961847155029]
Theory of Mind (ToM) significantly impacts human collaboration and communication as a crucial capability to understand others.
Mutual Theory of Mind (MToM) arises when AI agents with ToM capability collaborate with humans.
We find that the agent's ToM capability does not significantly impact team performance but enhances human understanding of the agent.
arXiv Detail & Related papers (2024-09-13T13:19:48Z) - Confidence-weighted integration of human and machine judgments for superior decision-making [2.4217853168915475]
Recent studies have shown that large language models (LLMs) can surpass humans in certain tasks.
We show that humans, despite performing worse than LLMs, can still add value when teamed with them.
A human and machine team can surpass each individual teammate when team members' confidence is well-calibrated.
arXiv Detail & Related papers (2024-08-15T11:16:21Z) - On the Utility of Accounting for Human Beliefs about AI Intention in Human-AI Collaboration [9.371527955300323]
We develop a model of human beliefs that captures how humans interpret and reason about their AI partner's intentions.<n>We create an AI agent that incorporates both human behavior and human beliefs when devising its strategy for interacting with humans.
arXiv Detail & Related papers (2024-06-10T06:39:37Z) - Are Large Language Models Aligned with People's Social Intuitions for Human-Robot Interactions? [7.308479353736709]
Large language models (LLMs) are increasingly used in robotics, especially for high-level action planning.
In this work, we test whether LLMs reproduce people's intuitions and communication in human-robot interaction scenarios.
We show that vision models fail to capture the essence of video stimuli and that LLMs tend to rate different communicative acts and behavior higher than people.
arXiv Detail & Related papers (2024-03-08T22:23:23Z) - Human-AI Collaboration in Real-World Complex Environment with
Reinforcement Learning [8.465957423148657]
We show that learning from humans is effective and that human-AI collaboration outperforms human-controlled and fully autonomous AI agents.
We develop a user interface to allow humans to assist AI agents effectively.
arXiv Detail & Related papers (2023-12-23T04:27:24Z) - SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents [107.4138224020773]
We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and humans.
In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals.
We find that GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills.
arXiv Detail & Related papers (2023-10-18T02:27:01Z) - Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View [60.80731090755224]
This paper probes the collaboration mechanisms among contemporary NLP systems by practical experiments with theoretical insights.
We fabricate four unique societies' comprised of LLM agents, where each agent is characterized by a specific trait' (easy-going or overconfident) and engages in collaboration with a distinct thinking pattern' (debate or reflection)
Our results further illustrate that LLM agents manifest human-like social behaviors, such as conformity and consensus reaching, mirroring social psychology theories.
arXiv Detail & Related papers (2023-10-03T15:05:52Z) - Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration [116.09561564489799]
Solo Performance Prompting transforms a single LLM into a cognitive synergist by engaging in multi-turn self-collaboration with multiple personas.
A cognitive synergist is an intelligent agent that collaboratively combines multiple minds' strengths and knowledge to enhance problem-solving in complex tasks.
Our in-depth analysis shows that assigning multiple fine-grained personas in LLMs improves problem-solving abilities compared to using a single or fixed number of personas.
arXiv Detail & Related papers (2023-07-11T14:45:19Z) - Discriminatory or Samaritan -- which AI is needed for humanity? An
Evolutionary Game Theory Analysis of Hybrid Human-AI populations [0.5308606035361203]
We study how different forms of AI influence the evolution of cooperation in a human population playing the one-shot Prisoner's Dilemma game.
We found that Samaritan AI agents that help everyone unconditionally, including defectors, can promote higher levels of cooperation in humans than Discriminatory AIs.
arXiv Detail & Related papers (2023-06-30T15:56:26Z) - On the Perception of Difficulty: Differences between Humans and AI [0.0]
Key challenge in the interaction of humans with AI is the estimation of difficulty for human and AI agents for single task instances.
Research in the field of human-AI interaction estimates the perceived difficulty of humans and AI independently from each other.
Research to date has not yet adequately examined the differences in the perceived difficulty of humans and AI.
arXiv Detail & Related papers (2023-04-19T16:42:54Z) - Investigating the Impact of Direct Punishment on the Emergence of Cooperation in Multi-Agent Reinforcement Learning Systems [2.4555276449137042]
Problems of cooperation are omnipresent within human society.
As the use of AI becomes more pervasive throughout society, the need for socially intelligent agents is becoming increasingly evident.
This paper presents a comprehensive analysis and evaluation of the behaviors and learning dynamics associated with direct punishment, third-party punishment, partner selection, and reputation.
arXiv Detail & Related papers (2023-01-19T19:33:54Z) - Doing Right by Not Doing Wrong in Human-Robot Collaboration [8.078753289996417]
We propose a novel approach to learning fair and sociable behavior, not by reproducing positive behavior, but rather by avoiding negative behavior.
In this study, we highlight the importance of incorporating sociability in robot manipulation, as well as the need to consider fairness in human-robot interactions.
arXiv Detail & Related papers (2022-02-05T23:05:10Z) - Adversarial Attacks in Cooperative AI [0.0]
Single-agent reinforcement learning algorithms in a multi-agent environment are inadequate for fostering cooperation.
Recent work in adversarial machine learning shows that models can be easily deceived into making incorrect decisions.
Cooperative AI might introduce new weaknesses not investigated in previous machine learning research.
arXiv Detail & Related papers (2021-11-29T07:34:12Z) - Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being.
For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z) - Show Me What You Can Do: Capability Calibration on Reachable Workspace
for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground-truth.
We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z) - Watch-And-Help: A Challenge for Social Perception and Human-AI
Collaboration [116.28433607265573]
We introduce Watch-And-Help (WAH), a challenge for testing social intelligence in AI agents.
In WAH, an AI agent needs to help a human-like agent perform a complex household task efficiently.
We build VirtualHome-Social, a multi-agent household environment, and provide a benchmark including both planning and learning based baselines.
arXiv Detail & Related papers (2020-10-19T21:48:31Z) - Joint Mind Modeling for Explanation Generation in Complex Human-Robot
Collaborative Tasks [83.37025218216888]
We propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations.
The robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communications.
Results show that the generated explanations of our approach significantly improves the collaboration performance and user perception of the robot.
arXiv Detail & Related papers (2020-07-24T23:35:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.