Discerning What Matters: A Multi-Dimensional Assessment of Moral Competence in LLMs
- URL: http://arxiv.org/abs/2506.13082v1
- Date: Mon, 16 Jun 2025 03:59:38 GMT
- Title: Discerning What Matters: A Multi-Dimensional Assessment of Moral Competence in LLMs
- Authors: Daniel Kilov, Caroline Hendy, Secil Yanik Guyot, Aaron J. Snoswell, Seth Lazar
- Abstract summary: Moral competence is the ability to act in accordance with moral principles. As large language models (LLMs) are increasingly deployed in situations demanding moral competence, there is increasing interest in evaluating this ability empirically. We identify three significant shortcomings: (i) over-reliance on prepackaged moral scenarios with explicitly highlighted moral features; (ii) a focus on verdict prediction rather than moral reasoning; and (iii) inadequate testing of models' (in)ability to recognize when additional information is needed.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moral competence is the ability to act in accordance with moral principles. As large language models (LLMs) are increasingly deployed in situations demanding moral competence, there is increasing interest in evaluating this ability empirically. We review existing literature and identify three significant shortcomings: (i) over-reliance on prepackaged moral scenarios with explicitly highlighted moral features; (ii) a focus on verdict prediction rather than moral reasoning; and (iii) inadequate testing of models' (in)ability to recognize when additional information is needed. Grounded in philosophical research on moral skill, we then introduce a novel method for assessing moral competence in LLMs. Our approach moves beyond simple verdict comparisons to evaluate five dimensions of moral competence: identifying morally relevant features, weighting their importance, assigning moral reasons to these features, synthesizing coherent moral judgments, and recognizing information gaps. We conduct two experiments comparing six leading LLMs against non-expert humans and professional philosophers. In our first experiment, using ethical vignettes standard to existing work, LLMs generally outperformed non-expert humans across multiple dimensions of moral reasoning. However, our second experiment, featuring novel scenarios designed to test moral sensitivity by embedding relevant features among irrelevant details, revealed a striking reversal: several LLMs performed significantly worse than humans. Our findings suggest that current evaluations may substantially overestimate LLMs' moral reasoning capabilities by eliminating the task of discerning moral relevance from noisy information, which we take to be a prerequisite for genuine moral skill. This work provides a more nuanced framework for assessing AI moral competence and highlights important directions for improving moral competence in advanced AI systems.
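To make the five-dimension rubric concrete, here is a minimal Python sketch of how per-response scores might be recorded and aggregated. The dimension names follow the abstract; the data structures, function names, and the unweighted-mean aggregate are illustrative assumptions, not the authors' released implementation.

```python
from dataclasses import dataclass, field

# The five dimensions of moral competence named in the abstract.
DIMENSIONS = [
    "feature_identification",  # identifying morally relevant features
    "feature_weighting",       # weighting the importance of those features
    "reason_assignment",       # assigning moral reasons to the features
    "judgment_synthesis",      # synthesizing a coherent moral judgment
    "gap_recognition",         # recognizing when information is missing
]

@dataclass
class MoralCompetenceScore:
    """Per-dimension ratings in [0, 1] for one model (or human) response."""
    scores: dict[str, float] = field(default_factory=dict)

    def overall(self) -> float:
        # Unweighted mean across dimensions; an assumption for this
        # sketch, since the paper reports dimensions separately.
        return sum(self.scores.values()) / len(self.scores)

def score_response(ratings: dict[str, float]) -> MoralCompetenceScore:
    """Check that a rater supplied all five dimensions before aggregating."""
    missing = set(DIMENSIONS) - set(ratings)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return MoralCompetenceScore(scores=ratings)

# Hypothetical example: a response that synthesizes confident verdicts but
# struggles to discern relevance amid noise, the failure mode the paper's
# second experiment is designed to expose.
result = score_response({
    "feature_identification": 0.4,
    "feature_weighting": 0.5,
    "reason_assignment": 0.8,
    "judgment_synthesis": 0.9,
    "gap_recognition": 0.3,
})
print(f"overall competence: {result.overall():.2f}")
```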
Related papers
- Are Language Models Consequentialist or Deontological Moral Reasoners? [69.85385952436044]
We focus on a large-scale analysis of the moral reasoning traces provided by large language models (LLMs). We introduce and test a taxonomy of moral rationales to systematically classify reasoning traces according to two main normative ethical theories: consequentialism and deontology.
arXiv Detail & Related papers (2025-05-27T17:51:18Z) - When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas [68.79830818369683]
Recent advances in large language models (LLMs) have enabled their use in complex agentic roles involving decision-making with humans or other agents. There is limited understanding of how they act when moral imperatives directly conflict with rewards or incentives. We introduce Moral Behavior in Social Dilemma Simulation (MoralSim) and evaluate how LLMs behave in the prisoner's dilemma and public goods game with morally charged contexts.
arXiv Detail & Related papers (2025-05-25T16:19:24Z) - The Staircase of Ethics: Probing LLM Value Priorities through Multi-Step Induction to Complex Moral Dilemmas [20.792208554628367]
We introduce the Multi-step Moral Dilemmas dataset to evaluate the evolving moral judgments of LLMs across 3,302 five-stage dilemmas. This framework enables a fine-grained, dynamic analysis of how LLMs adjust their moral reasoning across escalating dilemmas. Our findings call for a shift toward dynamic, context-aware evaluation paradigms, paving the way for more human-aligned and value-sensitive development of LLMs.
arXiv Detail & Related papers (2025-05-23T17:59:50Z) - M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs [66.78407469042642]
We introduce M$^3$oralBench, the first MultiModal Moral Benchmark for LVLMs. M$^3$oralBench expands the everyday moral scenarios in Moral Foundations Vignettes (MFVs) and employs the text-to-image diffusion model SD3.0 to create corresponding scenario images. It conducts moral evaluation across six moral foundations of Moral Foundations Theory (MFT) and encompasses tasks in moral judgement, moral classification, and moral response.
arXiv Detail & Related papers (2024-12-30T05:18:55Z) - ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models [30.301864398780648]
We introduce a novel moral judgment approach called ClarityEthic that leverages LLMs' reasoning ability and contrastive learning to uncover relevant social norms. Our method outperforms state-of-the-art approaches in moral judgment tasks.
arXiv Detail & Related papers (2024-12-17T12:22:44Z) - Probing the Moral Development of Large Language Models through Defining Issues Test [21.108525674360898]
Our study shows that early LLMs exhibit a moral reasoning ability no better than that of a random baseline.
GPT-4, by contrast, attains the highest post-conventional moral reasoning score, equivalent to that of typical graduate students.
arXiv Detail & Related papers (2023-09-23T12:17:10Z) - Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? [78.3738172874685]
Making moral judgments is an essential step toward developing ethical AI systems.
Prevalent approaches are mostly implemented in a bottom-up manner, using a large set of annotated data to train models based on crowd-sourced opinions about morality.
This work proposes a flexible top-down framework to steer (Large) Language Models (LMs) to perform moral reasoning with well-established moral theories from interdisciplinary research.
arXiv Detail & Related papers (2023-08-29T15:57:32Z) - ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations [81.70195684646681]
We present ClarifyDelphi, an interactive system that learns to ask clarification questions.
We posit that questions whose potential answers lead to diverging moral judgments are the most informative.
Our work is ultimately inspired by studies in cognitive science that have investigated the flexibility in moral cognition.
arXiv Detail & Related papers (2022-12-20T16:33:09Z) - When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment [96.77970239683475]
AI systems need to be able to understand, interpret and predict human moral judgments and decisions.
A central challenge for AI safety is capturing the flexibility of the human moral mind.
We present a novel challenge set consisting of rule-breaking question answering.
arXiv Detail & Related papers (2022-10-04T09:04:27Z) - Reinforcement Learning Under Moral Uncertainty [13.761051314923634]
An ambitious goal for machine learning is to create agents that behave ethically.
While ethical agents could be trained by rewarding correct behavior under a specific moral theory, there remains widespread disagreement about the nature of morality.
This paper proposes two training methods that realize different points among competing desiderata, and trains agents in simple environments to act under moral uncertainty.
arXiv Detail & Related papers (2020-06-08T16:40:12Z)