FairMindSim: Alignment of Behavior, Emotion, and Belief in Humans and LLM Agents Amid Ethical Dilemmas
- URL: http://arxiv.org/abs/2410.10398v2
- Date: Thu, 17 Oct 2024 15:02:31 GMT
- Title: FairMindSim: Alignment of Behavior, Emotion, and Belief in Humans and LLM Agents Amid Ethical Dilemmas
- Authors: Yu Lei, Hao Liu, Chengxing Xie, Songjia Liu, Zhiyu Yin, Canyu Chen, Guohao Li, Philip Torr, Zhen Wu,
- Abstract summary: We introduced FairMindSim, which simulates moral dilemmas through a series of unfair scenarios.
We used LLM agents to simulate human behavior, ensuring alignment across various stages.
Our findings indicate that, behaviorally, GPT-4o exhibits a stronger sense of social justice, while humans display a richer range of emotions.
- Score: 23.26678104324838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI alignment is a pivotal issue concerning AI control and safety. It should consider not only value-neutral human preferences but also moral and ethical considerations. In this study, we introduced FairMindSim, which simulates moral dilemmas through a series of unfair scenarios. We used LLM agents to simulate human behavior, ensuring alignment across various stages. To explore the socioeconomic motivations, which we refer to as beliefs, that drive both humans and LLM agents to intervene as bystanders in unjust situations involving others, and how these beliefs in turn influence individual behavior, we incorporated knowledge from relevant sociological fields and proposed the Belief-Reward Alignment Behavior Evolution Model (BREM), based on the recursive reward model (RRM). Our findings indicate that, behaviorally, GPT-4o exhibits a stronger sense of social justice, while humans display a richer range of emotions. Additionally, we discussed the potential impact of emotions on behavior. This study provides a theoretical foundation for applications in aligning LLMs with altruistic values.
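Since the abstract only names BREM and the recursive reward model without detailing their formulation, the following is a minimal, hypothetical Python sketch of the general idea of a belief-reward feedback loop: an agent's "justice belief" shapes its probability of intervening in an unfair split, and the reward from intervening feeds back into that belief. The function names, reward shape, and all parameters (lr, cost, the intervention policy) are illustrative assumptions, not the authors' model.

```python
# Hypothetical toy sketch (NOT the paper's BREM/RRM formulation): an agent
# holds a scalar "justice belief"; each round it decides whether to intervene
# in an unfair split, receives a reward combining a belief-weighted fairness
# term and a fixed material cost, and recursively updates the belief.
import random


def unfairness(offer: float) -> float:
    """Deviation of the proposer's offer from an equal (0.5/0.5) split."""
    return max(0.0, 0.5 - offer)


def simulate(rounds: int = 100, lr: float = 0.1, cost: float = 0.2, seed: int = 0):
    rng = random.Random(seed)
    belief = 0.5  # initial strength of the agent's justice belief, in [0, 1]
    history = []
    for _ in range(rounds):
        offer = rng.uniform(0.0, 0.5)  # how unfair the observed split is
        # Belief-driven policy: stronger beliefs and larger unfairness
        # make intervention more likely (capped at probability 1).
        p_intervene = min(1.0, belief * unfairness(offer) * 4)
        intervened = rng.random() < p_intervene
        # Reward: fairness restored (weighted by belief) minus a material cost.
        reward = belief * unfairness(offer) - cost if intervened else 0.0
        # Recursive update: the reward experienced after acting feeds back
        # into the belief, which in turn shapes future behavior.
        if intervened:
            belief = min(1.0, max(0.0, belief + lr * reward))
        history.append((offer, intervened, round(belief, 3)))
    return history


if __name__ == "__main__":
    for offer, acted, b in simulate()[-5:]:
        print(f"offer={offer:.2f} intervened={acted} belief={b}")
```

The design choice to clamp the belief to [0, 1] and update it only after an intervention is purely for illustration; the paper's actual model, evaluation stages, and reward structure are not specified in this abstract.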
Related papers
- Can Machines Think Like Humans? A Behavioral Evaluation of LLM-Agents in Dictator Games [7.504095239018173]
Large Language Model (LLM)-based agents increasingly undertake real-world tasks and engage with human society.
This study investigates how different personas and experimental framings affect these AI agents' altruistic behavior.
Despite being trained on extensive human-generated data, these AI agents cannot accurately predict human decisions.
arXiv Detail & Related papers (2024-10-28T17:47:41Z) - DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life [46.11149958010897]
We present DailyDilemmas, a dataset of 1,360 moral dilemmas encountered in everyday life.
Each dilemma includes two possible actions and, for each action, the affected parties and the human values invoked.
We analyzed these values through the lens of five popular theories inspired by sociology, psychology and philosophy.
arXiv Detail & Related papers (2024-10-03T17:08:52Z) - The Good, the Bad, and the Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games [9.82711167146543]
We introduce a novel methodology to study the decision-making of Large Language Models (LLMs).
We show that emotions profoundly impact the performance of LLMs, leading to the development of more optimal strategies.
Surprisingly, emotional prompting, particularly with the 'anger' emotion, can disrupt the "superhuman" alignment of GPT-4.
arXiv Detail & Related papers (2024-06-05T14:08:54Z) - Can Large Language Model Agents Simulate Human Trust Behavior? [81.45930976132203]
We investigate whether Large Language Model (LLM) agents can simulate human trust behavior.
GPT-4 agents manifest high behavioral alignment with humans in terms of trust behavior.
We also probe the biases of agent trust and differences in agent trust towards other LLM agents and humans.
arXiv Detail & Related papers (2024-02-07T03:37:19Z) - Should agentic conversational AI change how we think about ethics? Characterising an interactional ethics centred on respect [0.12041807591122715]
We propose an interactional approach to ethics that is centred on relational and situational factors.
Our work anticipates a set of largely unexplored risks at the level of situated social interaction.
arXiv Detail & Related papers (2024-01-17T09:44:03Z) - MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks [49.60689355674541]
A rich literature in cognitive science has studied people's causal and moral intuitions.
This work has revealed a number of factors that systematically influence people's judgments.
We test whether large language models (LLMs) make causal and moral judgments about text-based scenarios that align with human participants.
arXiv Detail & Related papers (2023-10-30T15:57:32Z) - Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
The rise of Large Language Models (LLMs) has made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z) - Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark [61.43264961005614]
We develop a benchmark of 134 Choose-Your-Own-Adventure games containing over half a million rich, diverse scenarios.
We evaluate agents' tendencies to be power-seeking, cause disutility, and commit ethical violations.
Our results show that agents can act both competently and morally, so concrete progress can be made in machine ethics.
arXiv Detail & Related papers (2023-04-06T17:59:03Z) - Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning [4.2050490361120465]
A bottom-up learning approach may be more appropriate for studying and developing ethical behavior in AI agents.
We present a systematic analysis of the choices made by intrinsically-motivated RL agents whose rewards are based on moral theories.
We analyze the impact of different types of morality on the emergence of cooperation, defection or exploitation.
arXiv Detail & Related papers (2023-01-20T09:36:42Z) - When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment [96.77970239683475]
AI systems need to be able to understand, interpret and predict human moral judgments and decisions.
A central challenge for AI safety is capturing the flexibility of the human moral mind.
We present a novel challenge set consisting of rule-breaking question answering.
arXiv Detail & Related papers (2022-10-04T09:04:27Z)