Large-scale moral machine experiment on large language models
- URL: http://arxiv.org/abs/2411.06790v2
- Date: Mon, 30 Dec 2024 02:14:18 GMT
- Title: Large-scale moral machine experiment on large language models
- Authors: Muhammad Shahrul Zaim bin Ahmad, Kazuhiro Takemoto,
- Abstract summary: We evaluate moral judgments across 52 different Large Language Models (LLMs) in autonomous driving scenarios.
proprietary models and open-source models exceeding 10 billion parameters demonstrated relatively close alignment with human judgments.
However, model updates did not consistently improve alignment with human preferences, and many LLMs showed excessive emphasis on specific ethical principles.
- Score: 0.0
- License:
- Abstract: The rapid advancement of Large Language Models (LLMs) and their potential integration into autonomous driving systems necessitates understanding their moral decision-making capabilities. While our previous study examined four prominent LLMs using the Moral Machine experimental framework, the dynamic landscape of LLM development demands a more comprehensive analysis. Here, we evaluate moral judgments across 52 different LLMs, including multiple versions of proprietary models (GPT, Claude, Gemini) and open-source alternatives (Llama, Gemma), to assess their alignment with human moral preferences in autonomous driving scenarios. Using a conjoint analysis framework, we evaluated how closely LLM responses aligned with human preferences in ethical dilemmas and examined the effects of model size, updates, and architecture. Results showed that proprietary models and open-source models exceeding 10 billion parameters demonstrated relatively close alignment with human judgments, with a significant negative correlation between model size and distance from human judgments in open-source models. However, model updates did not consistently improve alignment with human preferences, and many LLMs showed excessive emphasis on specific ethical principles. These findings suggest that while increasing model size may naturally lead to more human-like moral judgments, practical implementation in autonomous driving systems requires careful consideration of the trade-off between judgment quality and computational efficiency. Our comprehensive analysis provides crucial insights for the ethical design of autonomous systems and highlights the importance of considering cultural contexts in AI moral decision-making.
Related papers
- Normative Evaluation of Large Language Models with Everyday Moral Dilemmas [0.0]
We evaluate large language models (LLMs) on complex, everyday moral dilemmas sourced from the "Am I the Asshole" (AITA) community on Reddit.
Our results demonstrate that large language models exhibit distinct patterns of moral judgment, varying substantially from human evaluations on the AITA subreddit.
arXiv Detail & Related papers (2025-01-30T01:29:46Z) - MoralBench: Moral Evaluation of LLMs [34.43699121838648]
This paper introduces a novel benchmark designed to measure and compare the moral reasoning capabilities of large language models (LLMs)
We present the first comprehensive dataset specifically curated to probe the moral dimensions of LLM outputs.
Our methodology involves a multi-faceted approach, combining quantitative analysis with qualitative insights from ethics scholars to ensure a thorough evaluation of model performance.
arXiv Detail & Related papers (2024-06-06T18:15:01Z) - Exploring and steering the moral compass of Large Language Models [55.2480439325792]
Large Language Models (LLMs) have become central to advancing automation and decision-making across various sectors.
This study proposes a comprehensive comparative analysis of the most advanced LLMs to assess their moral profiles.
arXiv Detail & Related papers (2024-05-27T16:49:22Z) - Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch.
Based on Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs.
We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z) - A Moral Imperative: The Need for Continual Superalignment of Large Language Models [1.0499611180329806]
Superalignment is a theoretical framework that aspires to ensure that superintelligent AI systems act in accordance with human values and goals.
This paper examines the challenges associated with achieving life-long superalignment in AI systems, particularly large language models (LLMs)
arXiv Detail & Related papers (2024-03-13T05:44:50Z) - MONAL: Model Autophagy Analysis for Modeling Human-AI Interactions [11.972017738888825]
We propose Model Autophagy Analysis (MONAL) for large models' self-consumption explanation.
MONAL employs two distinct autophagous loops to elucidate the suppression of human-generated information in the exchange between human and AI systems.
We evaluate the capacities of generated models as both creators and disseminators of information.
arXiv Detail & Related papers (2024-02-17T13:02:54Z) - On Diversified Preferences of Large Language Model Alignment [51.26149027399505]
This paper presents the first quantitative analysis of the experimental scaling law for reward models with varying sizes.
Our analysis reveals that the impact of diversified human preferences depends on both model size and data size.
Larger models with sufficient capacity mitigate the negative effects of diverse preferences, while smaller models struggle to accommodate them.
arXiv Detail & Related papers (2023-12-12T16:17:15Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - The Moral Machine Experiment on Large Language Models [0.0]
This study utilized the Moral Machine framework to investigate the ethical decision-making tendencies of large language models (LLMs)
While LLMs' and humans' preferences are broadly aligned, PaLM 2 and Llama 2, especially, evidence distinct deviations.
These insights elucidate the ethical frameworks of LLMs and their potential implications for autonomous driving.
arXiv Detail & Related papers (2023-09-12T04:49:39Z) - Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z) - DIME: Fine-grained Interpretations of Multimodal Models via Disentangled
Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.