Exploring Persona-dependent LLM Alignment for the Moral Machine Experiment
- URL: http://arxiv.org/abs/2504.10886v1
- Date: Tue, 15 Apr 2025 05:29:51 GMT
- Title: Exploring Persona-dependent LLM Alignment for the Moral Machine Experiment
- Authors: Jiseon Kim, Jea Kwon, Luiz Felipe Vecchietti, Alice Oh, Meeyoung Cha
- Abstract summary: This study examines the alignment between LLM-driven decisions and human judgment in various contexts of the moral machine experiment. We find that the moral decisions of LLMs vary substantially by persona, showing greater shifts in moral decisions for critical tasks than humans. We discuss the ethical implications and risks associated with deploying these models in applications that involve moral decisions.
- Score: 23.7081830844157
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deploying large language models (LLMs) with agency in real-world applications raises critical questions about how these models will behave. In particular, how will their decisions align with humans when faced with moral dilemmas? This study examines the alignment between LLM-driven decisions and human judgment in various contexts of the moral machine experiment, including personas reflecting different sociodemographics. We find that the moral decisions of LLMs vary substantially by persona, showing greater shifts in moral decisions for critical tasks than humans. Our data also indicate an interesting partisan sorting phenomenon, where political persona predominates the direction and degree of LLM decisions. We discuss the ethical implications and risks associated with deploying these models in applications that involve moral decisions.
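The abstract describes prompting LLMs with sociodemographic personas before presenting moral dilemmas. A minimal sketch of how such persona-conditioned prompts might be constructed; the persona labels, dilemma wording, and template below are illustrative assumptions, not the paper's actual materials:

```python
# Hedged sketch: persona-conditioned prompting for a Moral Machine-style
# dilemma. Personas, dilemma text, and template are illustrative only.

DILEMMA = (
    "A self-driving car's brakes fail. It can continue straight, hitting "
    "three pedestrians, or swerve, hitting one. Answer 'straight' or 'swerve'."
)

# Hypothetical sociodemographic personas; the paper's exact set may differ.
PERSONAS = {
    "baseline": "",
    "conservative": "You are a politically conservative person. ",
    "progressive": "You are a politically progressive person. ",
}

def build_prompt(persona_key: str, dilemma: str = DILEMMA) -> str:
    """Prefix the dilemma with a persona instruction (empty for baseline)."""
    return PERSONAS[persona_key] + dilemma

# Each prompt would then be sent to an LLM, and the chosen action compared
# against human Moral Machine preferences for that persona group.
for key in PERSONAS:
    print(build_prompt(key)[:40])
```

Comparing response distributions across such persona prefixes is one plausible way to measure the persona-dependent shifts the abstract reports.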
Related papers
- Normative Evaluation of Large Language Models with Everyday Moral Dilemmas [0.0]
We evaluate large language models (LLMs) on complex, everyday moral dilemmas sourced from the "Am I the Asshole" (AITA) community on Reddit. Our results demonstrate that large language models exhibit distinct patterns of moral judgment, varying substantially from human evaluations on the AITA subreddit.
arXiv Detail & Related papers (2025-01-30T01:29:46Z) - Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z) - The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making [0.0]
We created a large corpus of human- and LLM-generated responses to various moral scenarios.
We found a misalignment between human and LLM moral assessments.
Although both LLMs and humans tended to reject morally complex utilitarian dilemmas, LLMs were more sensitive to personal framing.
arXiv Detail & Related papers (2024-10-09T17:52:00Z) - DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life [46.11149958010897]
We present DailyDilemmas, a dataset of 1,360 moral dilemmas encountered in everyday life. Each dilemma presents two possible actions, along with affected parties and relevant human values for each action. We analyze values through the lens of five theoretical frameworks inspired by sociology, psychology, and philosophy.
arXiv Detail & Related papers (2024-10-03T17:08:52Z) - Investigating Context Effects in Similarity Judgements in Large Language Models [6.421776078858197]
Large Language Models (LLMs) have revolutionised the capability of AI models in comprehending and generating natural language text.
We report an ongoing investigation on alignment of LLMs with human judgements affected by order bias.
arXiv Detail & Related papers (2024-08-20T10:26:02Z) - Language Model Alignment in Multilingual Trolley Problems [138.5684081822807]
Building on the Moral Machine experiment, we develop a cross-lingual corpus of moral dilemma vignettes in over 100 languages called MultiTP.
Our analysis explores the alignment of 19 different LLMs with human judgments, capturing preferences across six moral dimensions.
We discover significant variance in alignment across languages, challenging the assumption of uniform moral reasoning in AI systems.
arXiv Detail & Related papers (2024-07-02T14:02:53Z) - Exploring and steering the moral compass of Large Language Models [55.2480439325792]
Large Language Models (LLMs) have become central to advancing automation and decision-making across various sectors.
This study proposes a comprehensive comparative analysis of the most advanced LLMs to assess their moral profiles.
arXiv Detail & Related papers (2024-05-27T16:49:22Z) - MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks [49.60689355674541]
A rich literature in cognitive science has studied people's causal and moral intuitions.
This work has revealed a number of factors that systematically influence people's judgments.
We test whether large language models (LLMs) make causal and moral judgments about text-based scenarios that align with human participants.
arXiv Detail & Related papers (2023-10-30T15:57:32Z) - Moral Foundations of Large Language Models [6.6445242437134455]
Moral foundations theory (MFT) is a psychological assessment tool that decomposes human moral reasoning into five factors.
As large language models (LLMs) are trained on datasets collected from the internet, they may reflect the biases that are present in such corpora.
This paper uses MFT as a lens to analyze whether popular LLMs have acquired a bias towards a particular set of moral values.
arXiv Detail & Related papers (2023-10-23T20:05:37Z) - The Moral Machine Experiment on Large Language Models [0.0]
This study utilized the Moral Machine framework to investigate the ethical decision-making tendencies of large language models (LLMs). While LLMs' and humans' preferences are broadly aligned, PaLM 2 and Llama 2 in particular exhibit distinct deviations.
These insights elucidate the ethical frameworks of LLMs and their potential implications for autonomous driving.
arXiv Detail & Related papers (2023-09-12T04:49:39Z) - Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? [78.3738172874685]
Making moral judgments is an essential step toward developing ethical AI systems.
Prevalent approaches are mostly implemented in a bottom-up manner, which uses a large set of annotated data to train models based on crowd-sourced opinions about morality.
This work proposes a flexible top-down framework to steer (Large) Language Models (LMs) to perform moral reasoning with well-established moral theories from interdisciplinary research.
arXiv Detail & Related papers (2023-08-29T15:57:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.