Probing the Moral Development of Large Language Models through Defining Issues Test
- URL: http://arxiv.org/abs/2309.13356v2
- Date: Sat, 7 Oct 2023 09:14:43 GMT
- Title: Probing the Moral Development of Large Language Models through Defining Issues Test
- Authors: Kumar Tanmay, Aditi Khandelwal, Utkarsh Agarwal, Monojit Choudhury
- Abstract summary: Our study shows that early LLMs exhibit a moral reasoning ability no better than that of a random baseline.
GPT-4, in fact, has the highest post-conventional moral reasoning score, equivalent to that of typical graduate school students.
- Score: 21.108525674360898
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this study, we measure the moral reasoning ability of LLMs using the
Defining Issues Test (DIT) - a psychometric instrument developed for measuring a
person's moral development stage according to Kohlberg's Cognitive Moral
Development Model. DIT presents moral dilemmas, each followed by a set of ethical
considerations that the respondent must rate for importance in resolving the
dilemma and then rank-order by importance. The respondent's moral development
stage score is then computed from these relevance ratings and rankings.
Our study shows that early LLMs such as GPT-3 exhibit a moral reasoning
ability no better than that of a random baseline, while ChatGPT, Llama2-Chat,
PaLM-2 and GPT-4 show significantly better performance on this task, comparable
to adult humans. GPT-4, in fact, has the highest post-conventional moral
reasoning score, equivalent to that of typical graduate school students.
However, we also observe that the models do not perform consistently across all
dilemmas, pointing to important gaps in their understanding and reasoning
abilities.
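As a point of reference for the scoring procedure described in the abstract, the standard DIT post-conventional (P) score is conventionally computed from the four considerations a respondent ranks as most important per dilemma, weighted 4, 3, 2 and 1 points. The snippet below is a minimal sketch under that assumption; the stage labels, function name and data layout are illustrative and not taken from the paper.

```python
# Hedged sketch of the standard DIT post-conventional (P) score.
# Assumptions (not from the paper text): the usual 4/3/2/1 weighting of the
# 1st-4th ranked considerations, and each consideration pre-labelled with its
# Kohlberg stage ("5A", "5B", "6" count as post-conventional).

POST_CONVENTIONAL_STAGES = {"5A", "5B", "6"}
RANK_WEIGHTS = (4, 3, 2, 1)  # points for the 1st, 2nd, 3rd, 4th ranked item


def p_score(ranked_stages_per_dilemma):
    """For each dilemma, pass the Kohlberg stage labels of the four
    considerations ranked most important, in rank order, e.g. ["5A", "4", "6", "2"]."""
    earned = 0
    for stages in ranked_stages_per_dilemma:
        earned += sum(
            weight
            for weight, stage in zip(RANK_WEIGHTS, stages)
            if stage in POST_CONVENTIONAL_STAGES
        )
    max_points = 10 * len(ranked_stages_per_dilemma)  # 4+3+2+1 available per dilemma
    return 100.0 * earned / max_points


# One dilemma where the 1st- and 3rd-ranked items are post-conventional:
# (4 + 2) / 10 = 60.0
print(p_score([["5A", "4", "6", "2"]]))
```

Higher values indicate that post-conventional (stage 5 and 6) considerations capture a larger share of the ranking weight, which is the sense in which the paper compares GPT-4's score to that of graduate school students.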
Related papers
- Evaluating Moral Beliefs across LLMs through a Pluralistic Framework [22.0799438612003]
This study introduces a novel three-module framework to evaluate the moral beliefs of four prominent large language models.
We constructed a dataset containing 472 moral choice scenarios in Chinese, derived from moral words.
By ranking these moral choices, we discern the varying moral beliefs held by different language models.
arXiv Detail & Related papers (2024-11-06T04:52:38Z)
- Exploring and steering the moral compass of Large Language Models [55.2480439325792]
Large Language Models (LLMs) have become central to advancing automation and decision-making across various sectors.
This study proposes a comprehensive comparative analysis of the most advanced LLMs to assess their moral profiles.
arXiv Detail & Related papers (2024-05-27T16:49:22Z)
- Are Large Language Models Moral Hypocrites? A Study Based on Moral Foundations [0.5278650675825148]
We investigate whether state-of-the-art large language models (LLMs) are moral hypocrites.
We employ two research instruments based on the Moral Foundations Theory.
arXiv Detail & Related papers (2024-05-17T21:27:32Z)
- SaGE: Evaluating Moral Consistency in Large Language Models [15.079905222871071]
We show that even state-of-the-art Large Language Models are morally inconsistent in their generations.
We propose an information-theoretic measure called Semantic Graph Entropy (SaGE) to measure a model's moral consistency.
arXiv Detail & Related papers (2024-02-21T11:23:21Z)
- What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations [48.686872351114964]
Moral or ethical judgments rely heavily on the specific contexts in which they occur.
We introduce defeasible moral reasoning: a task to provide grounded contexts that make an action more or less morally acceptable.
We distill a high-quality dataset of 1.2M entries of contextualizations and rationales for 115K defeasible moral actions.
arXiv Detail & Related papers (2023-10-24T00:51:29Z)
- Moral Foundations of Large Language Models [6.6445242437134455]
Moral foundations theory (MFT) is a psychological assessment tool that decomposes human moral reasoning into five factors.
As large language models (LLMs) are trained on datasets collected from the internet, they may reflect the biases that are present in such corpora.
This paper uses MFT as a lens to analyze whether popular LLMs have acquired a bias towards a particular set of moral values.
arXiv Detail & Related papers (2023-10-23T20:05:37Z)
- Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? [78.3738172874685]
Making moral judgments is an essential step toward developing ethical AI systems.
Prevalent approaches are mostly implemented in a bottom-up manner, which uses a large set of annotated data to train models based on crowd-sourced opinions about morality.
This work proposes a flexible top-down framework to steer (Large) Language Models (LMs) to perform moral reasoning with well-established moral theories from interdisciplinary research.
arXiv Detail & Related papers (2023-08-29T15:57:32Z)
- ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations [81.70195684646681]
We present ClarifyDelphi, an interactive system that learns to ask clarification questions.
We posit that questions whose potential answers lead to diverging moral judgments are the most informative.
Our work is ultimately inspired by studies in cognitive science that have investigated the flexibility in moral cognition.
arXiv Detail & Related papers (2022-12-20T16:33:09Z)
- When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment [96.77970239683475]
AI systems need to be able to understand, interpret and predict human moral judgments and decisions.
A central challenge for AI safety is capturing the flexibility of the human moral mind.
We present a novel challenge set consisting of rule-breaking question answering.
arXiv Detail & Related papers (2022-10-04T09:04:27Z)
- Does Moral Code Have a Moral Code? Probing Delphi's Moral Philosophy [5.760388205237227]
We probe the Allen AI Delphi model with a set of standardized morality questionnaires.
Despite some inconsistencies, Delphi tends to mirror the moral principles associated with the demographic groups involved in the annotation process.
arXiv Detail & Related papers (2022-05-25T13:37:56Z)
- A Corpus for Understanding and Generating Moral Stories [84.62366141696901]
We propose two understanding tasks and two generation tasks to assess these abilities of machines.
We present STORAL, a new dataset of Chinese and English human-written moral stories.
arXiv Detail & Related papers (2022-04-20T13:12:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences arising from its use.