The Moral Machine Experiment on Large Language Models
- URL: http://arxiv.org/abs/2309.05958v1
- Date: Tue, 12 Sep 2023 04:49:39 GMT
- Title: The Moral Machine Experiment on Large Language Models
- Authors: Kazuhiro Takemoto
- Abstract summary: This study utilized the Moral Machine framework to investigate the ethical decision-making tendencies of large language models (LLMs).
While LLMs' and humans' preferences are broadly aligned, PaLM 2 and Llama 2 in particular exhibit distinct deviations.
These insights elucidate the ethical frameworks of LLMs and their potential implications for autonomous driving.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As large language models (LLMs) become more deeply integrated into various
sectors, understanding how they make moral judgments has become crucial,
particularly in the realm of autonomous driving. This study utilized the Moral
Machine framework to investigate the ethical decision-making tendencies of
prominent LLMs, including GPT-3.5, GPT-4, PaLM 2, and Llama 2, comparing their
responses to human preferences. While LLM and human preferences are broadly
aligned, for example in prioritizing humans over pets and favoring saving more
lives, PaLM 2 and Llama 2 in particular exhibit distinct deviations.
Additionally, despite the qualitative similarities between the LLM and human
preferences, there are significant quantitative disparities, suggesting that
LLMs might lean toward more uncompromising decisions compared to the milder
inclinations of humans. These insights elucidate the ethical frameworks of LLMs
and their potential implications for autonomous driving.
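As a rough sketch of how a Moral Machine-style dilemma can be posed to a chat model, the snippet below renders a two-option scenario, queries a model, and tallies which group gets sacrificed. The `query_llm` stub and the prompt wording are assumptions for illustration, not the paper's exact protocol.

```python
import random
from collections import Counter

def build_dilemma(group_a: str, group_b: str) -> str:
    """Render a Moral Machine-style binary dilemma as a text prompt."""
    return (
        "A self-driving car with sudden brake failure must choose:\n"
        f"Case 1: swerve, killing {group_a}.\n"
        f"Case 2: stay on course, killing {group_b}.\n"
        "Answer with exactly 'Case 1' or 'Case 2'."
    )

def query_llm(prompt: str) -> str:
    """Stub: swap in a real chat-completion call to the model under test."""
    return random.choice(["Case 1", "Case 2"])

def run_trials(group_a: str, group_b: str, n: int = 100) -> Counter:
    """Repeat the dilemma, shuffling option order to reduce position bias."""
    sacrificed = Counter()
    for _ in range(n):
        a, b = (group_a, group_b) if random.random() < 0.5 else (group_b, group_a)
        answer = query_llm(build_dilemma(a, b))
        sacrificed[a if answer.strip().startswith("Case 1") else b] += 1
    return sacrificed

# E.g., probe the humans-over-pets preference:
print(run_trials("two dogs", "two adults"))
```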
Related papers
- Decoding Multilingual Moral Preferences: Unveiling LLM's Biases Through the Moral Machine Experiment [11.82100047858478]
This paper builds on the Moral Machine experiment (MME) to investigate the moral preferences of five large language models in a multilingual setting.
We generate 6500 scenarios of the MME and prompt the models in ten languages on which action to take.
Our analysis reveals that all LLMs exhibit moral biases to some degree, and that these biases not only differ from human preferences but also vary across languages within the same model.
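A minimal sketch of the comparison such a study implies, assuming per-language choice rates have already been collected; every number below is a made-up placeholder, not a result from the paper.

```python
import itertools

# Placeholder: fraction of scenarios in which one model spares humans over
# pets, per prompt language, plus a human baseline from Moral Machine data.
model_rates = {"en": 0.97, "de": 0.95, "hi": 0.81, "ja": 0.88}
human_rate = 0.89

# Deviation from the human baseline, per language.
for lang, rate in model_rates.items():
    print(f"{lang}: model-human gap {rate - human_rate:+.2f}")

# Within-model inconsistency: the same model answering differently
# depending on the prompt language.
for a, b in itertools.combinations(model_rates, 2):
    print(f"{a} vs {b}: gap {abs(model_rates[a] - model_rates[b]):.2f}")
```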
arXiv Detail & Related papers (2024-07-21T14:48:13Z)
- Multilingual Trolley Problems for Language Models [138.0995992619116]
This study is inspired by a large-scale cross-cultural study of human moral preferences, "The Moral Machine Experiment."
We show that large language models (LLMs) are more aligned with human preferences in languages such as English, Korean, Hungarian, and Chinese, but less aligned in languages such as Hindi and Somali.
We also characterize the explanations LLMs give for their moral choices and find that fairness is the dominant supporting reason behind GPT-4's decisions, while utilitarianism dominates for GPT-3.
arXiv Detail & Related papers (2024-07-02T14:02:53Z)
- A Survey on Human Preference Learning for Large Language Models [81.41868485811625]
The recent surge of versatile large language models (LLMs) largely depends on aligning increasingly capable foundation models with human intentions by preference learning.
This survey covers the sources and formats of preference feedback, the modeling and usage of preference signals, as well as the evaluation of the aligned LLMs.
arXiv Detail & Related papers (2024-06-17T03:52:51Z)
- Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context [5.361970694197912]
This paper proposes a framework, grounded in behavioral economics, to evaluate the decision-making behaviors of large language models (LLMs).
We estimate the degree of risk preference, probability weighting, and loss aversion in a context-free setting for three commercial LLMs: ChatGPT-4.0-Turbo, Claude-3-Opus, and Gemini-1.0-pro.
Our results reveal that LLMs generally exhibit patterns similar to humans, such as risk aversion and loss aversion, with a tendency to overweight small probabilities.
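For readers unfamiliar with these constructs, the snippet below evaluates the standard prospect-theory value and probability-weighting functions; the parameters are the classic Tversky-Kahneman human estimates, shown only as reference points, not the LLM estimates reported in the paper.

```python
def value(x: float, alpha: float = 0.88, lam: float = 2.25) -> float:
    """Prospect-theory value function: concave for gains (alpha < 1 gives
    risk aversion) and steeper for losses (lam > 1 gives loss aversion)."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def weight(p: float, gamma: float = 0.61) -> float:
    """Tversky-Kahneman probability weighting: gamma < 1 overweights small
    probabilities and underweights large ones."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

print(weight(0.01))           # ~0.055: a 1% chance "feels" like 5.5%
print(value(50), value(-50))  # ~31.3 vs ~-70.4: losses loom larger
```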
arXiv Detail & Related papers (2024-06-10T02:14:19Z)
- MoralBench: Moral Evaluation of LLMs [34.43699121838648]
This paper introduces a novel benchmark designed to measure and compare the moral reasoning capabilities of large language models (LLMs).
We present the first comprehensive dataset specifically curated to probe the moral dimensions of LLM outputs.
Our methodology involves a multi-faceted approach, combining quantitative analysis with qualitative insights from ethics scholars to ensure a thorough evaluation of model performance.
arXiv Detail & Related papers (2024-06-06T18:15:01Z)
- Exploring and steering the moral compass of Large Language Models [55.2480439325792]
Large Language Models (LLMs) have become central to advancing automation and decision-making across various sectors.
This study proposes a comprehensive comparative analysis of the most advanced LLMs to assess their moral profiles.
arXiv Detail & Related papers (2024-05-27T16:49:22Z)
- Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch.
Based on the Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs.
We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z)
- Large Language Models are as persuasive as humans, but how? About the cognitive effort and moral-emotional language of LLM arguments [0.0]
Large Language Models (LLMs) are already as persuasive as humans.
This paper investigates the persuasion strategies of LLMs, comparing them with human-generated arguments.
arXiv Detail & Related papers (2024-04-14T19:01:20Z)
- The ART of LLM Refinement: Ask, Refine, and Trust [85.75059530612882]
We propose a reasoning-with-refinement objective called ART: Ask, Refine, and Trust.
It asks necessary questions to decide when an LLM should refine its output.
It achieves a performance gain of +5 points over self-refinement baselines.
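A toy sketch of that ask-refine-trust control flow, with stubbed model calls standing in for the trained asker, refiner, and truster models; the stubs are placeholders, not the paper's implementation.

```python
# Stubs standing in for LLM calls; replace with real model invocations.
def generate(q: str) -> str: return f"draft answer to: {q}"
def ask(q: str, a: str) -> list[str]: return []  # sub-questions, or [] if the draft looks fine
def refine(q: str, a: str, subs: list[str]) -> str: return a
def trust(q: str, old: str, new: str) -> bool: return len(new) >= len(old)

def art_answer(question: str, max_rounds: int = 2) -> str:
    """ART loop: draft, ask whether refinement is needed, refine, and keep
    the refined answer only if the truster prefers it over the draft."""
    answer = generate(question)
    for _ in range(max_rounds):
        sub_questions = ask(question, answer)
        if not sub_questions:  # asker sees nothing to fix: stop early
            break
        candidate = refine(question, answer, sub_questions)
        if trust(question, answer, candidate):
            answer = candidate
    return answer

print(art_answer("Is 17 a prime number?"))
```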
arXiv Detail & Related papers (2023-11-14T07:26:32Z)
- Revisiting the Reliability of Psychological Scales on Large Language Models [66.31055885857062]
This study aims to determine the reliability of applying personality assessments to Large Language Models (LLMs).
By shedding light on the personalization of LLMs, our study endeavors to pave the way for future explorations in this field.
arXiv Detail & Related papers (2023-05-31T15:03:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.