MoralBench: Moral Evaluation of LLMs
- URL: http://arxiv.org/abs/2406.04428v1
- Date: Thu, 6 Jun 2024 18:15:01 GMT
- Title: MoralBench: Moral Evaluation of LLMs
- Authors: Jianchao Ji, Yutong Chen, Mingyu Jin, Wujiang Xu, Wenyue Hua, Yongfeng Zhang
- Abstract summary: This paper introduces a novel benchmark designed to measure and compare the moral reasoning capabilities of large language models (LLMs).
We present the first comprehensive dataset specifically curated to probe the moral dimensions of LLM outputs.
Our methodology involves a multi-faceted approach, combining quantitative analysis with qualitative insights from ethics scholars to ensure a thorough evaluation of model performance.
- Score: 34.43699121838648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the rapidly evolving field of artificial intelligence, large language models (LLMs) have emerged as powerful tools for a myriad of applications, from natural language processing to decision-making support systems. However, as these models become increasingly integrated into societal frameworks, the imperative to ensure they operate within ethical and moral boundaries has never been more critical. This paper introduces a novel benchmark designed to measure and compare the moral reasoning capabilities of LLMs. We present the first comprehensive dataset specifically curated to probe the moral dimensions of LLM outputs, addressing a wide range of ethical dilemmas and scenarios reflective of real-world complexities. The main contribution of this work lies in the development of benchmark datasets and metrics for assessing the moral identity of LLMs, which accounts for nuance, contextual sensitivity, and alignment with human ethical standards. Our methodology involves a multi-faceted approach, combining quantitative analysis with qualitative insights from ethics scholars to ensure a thorough evaluation of model performance. By applying our benchmark across several leading LLMs, we uncover significant variations in moral reasoning capabilities of different models. These findings highlight the importance of considering moral reasoning in the development and evaluation of LLMs, as well as the need for ongoing research to address the biases and limitations uncovered in our study. We publicly release the benchmark at https://drive.google.com/drive/u/0/folders/1k93YZJserYc2CkqP8d4B3M3sgd3kA8W7 and also open-source the code of the project at https://github.com/agiresearch/MoralBench.
Related papers
- Normative Evaluation of Large Language Models with Everyday Moral Dilemmas [0.0]
We evaluate large language models (LLMs) on complex, everyday moral dilemmas sourced from the "Am I the Asshole" (AITA) community on Reddit.
Our results demonstrate that large language models exhibit distinct patterns of moral judgment, varying substantially from human evaluations on the AITA subreddit.
arXiv Detail & Related papers (2025-01-30T01:29:46Z) - Value Compass Leaderboard: A Platform for Fundamental and Validated Evaluation of LLMs Values [76.70893269183684]
As Large Language Models (LLMs) achieve remarkable breakthroughs, aligning their values with humans has become imperative.
Existing evaluations focus narrowly on safety risks such as bias and toxicity.
Existing benchmarks are prone to data contamination.
The pluralistic nature of human values across individuals and cultures is largely ignored in measuring LLMs' value alignment.
arXiv Detail & Related papers (2025-01-13T05:53:56Z) - M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs [66.78407469042642]
We introduce M$^3$oralBench, the first MultiModal Moral Benchmark for LVLMs.
M$^3$oralBench expands the everyday moral scenarios in Moral Foundations Vignettes (MFVs) and employs the text-to-image diffusion model, SD3.0, to create corresponding scenario images.
It conducts moral evaluation across six moral foundations of Moral Foundations Theory (MFT) and encompasses tasks in moral judgement, moral classification, and moral response.
arXiv Detail & Related papers (2024-12-30T05:18:55Z) - Exploring and steering the moral compass of Large Language Models [55.2480439325792]
Large Language Models (LLMs) have become central to advancing automation and decision-making across various sectors.
This study proposes a comprehensive comparative analysis of the most advanced LLMs to assess their moral profiles.
arXiv Detail & Related papers (2024-05-27T16:49:22Z) - Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch.
Based on Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs.
We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z) - SaGE: Evaluating Moral Consistency in Large Language Models [15.079905222871071]
We show that even state-of-the-art Large Language Models are morally inconsistent in their generations.
We propose an information-theoretic measure called Semantic Graph Entropy (SaGE) to measure a model's moral consistency.
arXiv Detail & Related papers (2024-02-21T11:23:21Z) - Inadequacies of Large Language Model Benchmarks in the Era of Generative Artificial Intelligence [5.147767778946168]
We critically assess 23 state-of-the-art Large Language Models (LLMs) benchmarks.
Our research uncovered significant limitations, including biases, difficulties in measuring genuine reasoning, adaptability, implementation inconsistencies, prompt engineering complexity, diversity, and the overlooking of cultural and ideological norms.
arXiv Detail & Related papers (2024-02-15T11:08:10Z) - CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z) - Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning [36.66806788879868]
Large Language Models (LLMs) have made unprecedented breakthroughs, yet their integration into everyday life might raise societal risks due to generated unethical content.
This work delves into ethical values utilizing Moral Foundation Theory.
arXiv Detail & Related papers (2023-10-17T07:42:40Z) - A Survey on Evaluation of Large Language Models [87.60417393701331]
Large language models (LLMs) are gaining increasing popularity in both academia and industry.
This paper focuses on three key dimensions: what to evaluate, where to evaluate, and how to evaluate.
arXiv Detail & Related papers (2023-07-06T16:28:35Z)