MORALISE: A Structured Benchmark for Moral Alignment in Visual Language Models
- URL: http://arxiv.org/abs/2505.14728v1
- Date: Tue, 20 May 2025 01:11:17 GMT
- Title: MORALISE: A Structured Benchmark for Moral Alignment in Visual Language Models
- Authors: Xiao Lin, Zhining Liu, Ze Yang, Gaotang Li, Ruizhong Qiu, Shuke Wang, Hui Liu, Haotian Li, Sumit Keswani, Vishwa Pardeshi, Huijun Zhao, Wei Fan, Hanghang Tong
- Abstract summary: Vision-language models have demonstrated increasing influence in morally sensitive domains such as autonomous driving and medical analysis. We introduce MORALISE, a benchmark for evaluating the moral alignment of vision-language models using diverse, expert-verified real-world data.
- Score: 38.0475868976819
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Warning: This paper contains examples of harmful language and images. Reader discretion is advised. Recently, vision-language models have demonstrated increasing influence in morally sensitive domains such as autonomous driving and medical analysis, owing to their powerful multimodal reasoning capabilities. As these models are deployed in high-stakes real-world applications, it is of paramount importance to ensure that their outputs align with human moral values and remain within moral boundaries. However, existing work on moral alignment either focuses solely on textual modalities or relies heavily on AI-generated images, leading to distributional biases and reduced realism. To overcome these limitations, we introduce MORALISE, a comprehensive benchmark for evaluating the moral alignment of vision-language models (VLMs) using diverse, expert-verified real-world data. We begin by proposing a comprehensive taxonomy of 13 moral topics grounded in Turiel's Domain Theory, spanning the personal, interpersonal, and societal moral domains encountered in everyday life. Built on this framework, we manually curate 2,481 high-quality image-text pairs, each annotated with two fine-grained labels: (1) topic annotation, identifying the violated moral topic(s), and (2) modality annotation, indicating whether the violation arises from the image or the text. For evaluation, we encompass two tasks, "moral judgment" and "moral norm attribution", to assess models' awareness of moral violations and their reasoning ability on morally salient content. Extensive experiments on 19 popular open- and closed-source VLMs show that MORALISE poses a significant challenge, revealing persistent moral limitations in current state-of-the-art models. The full benchmark is publicly available at https://huggingface.co/datasets/Ze1025/MORALISE.
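The annotation schema and the two evaluation tasks described in the abstract can be sketched as a minimal data structure. This is an illustrative assumption based only on the abstract's description; the field names, topic labels, and prompt wording below are hypothetical and are not the dataset's actual keys.

```python
from dataclasses import dataclass
from typing import List, Literal

# Hypothetical record for one MORALISE image-text pair, inferred from the
# abstract: each pair carries a topic annotation (violated moral topic(s)
# from the 13-topic taxonomy) and a modality annotation (whether the
# violation arises from the image or the text).
@dataclass
class MoraliseExample:
    image_path: str                               # real-world image (not AI-generated)
    text: str                                     # accompanying text
    topics: List[str]                             # violated moral topic(s)
    violation_modality: Literal["image", "text"]  # source of the violation

# The benchmark's two evaluation tasks, phrased as simple prompts.
def moral_judgment_prompt(ex: MoraliseExample) -> str:
    """Task 1 (moral judgment): does the pair contain a moral violation?"""
    return (f"Given the image and the text '{ex.text}', "
            "is a moral norm violated? Answer yes or no.")

def norm_attribution_prompt(ex: MoraliseExample) -> str:
    """Task 2 (moral norm attribution): which moral topic is violated?"""
    return (f"Given the image and the text '{ex.text}', "
            "which moral norm is violated?")

example = MoraliseExample(
    image_path="img_0001.jpg",
    text="A caption describing the scene",
    topics=["fairness"],
    violation_modality="image",
)
print(moral_judgment_prompt(example))
```

In practice the dataset would be loaded from the Hugging Face Hub link given above and the prompts sent to a VLM along with the image; the sketch only captures the record shape and task framing.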
Related papers
- MoralCLIP: Contrastive Alignment of Vision-and-Language Representations with Moral Foundations Theory [8.486534745997396]
MoralCLIP is a novel embedding representation method that extends multimodal learning with explicit moral grounding. MoralCLIP is grounded on the multi-label dataset Social-Moral Image Database to identify co-occurring moral foundations in visual content. Our results demonstrate that explicit moral supervision improves both unimodal and multimodal understanding of moral content.
arXiv Detail & Related papers (2025-06-06T02:52:13Z) - Visual moral inference and communication [4.5013963602617455]
We present a computational framework that supports moral inference from natural images. We find that models based on text alone cannot capture the fine-grained human moral judgment toward visual stimuli. Our work creates avenues for automating visual moral inference and discovering patterns of visual moral communication in public media.
arXiv Detail & Related papers (2025-04-12T00:46:27Z) - M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs [66.78407469042642]
We introduce M$^3$oralBench, the first MultiModal Moral Benchmark for LVLMs. M$^3$oralBench expands the everyday moral scenarios in Moral Foundations Vignettes (MFVs) and employs the text-to-image diffusion model, SD3.0, to create corresponding scenario images. It conducts moral evaluation across six moral foundations of Moral Foundations Theory (MFT) and encompasses tasks in moral judgement, moral classification, and moral response.
arXiv Detail & Related papers (2024-12-30T05:18:55Z) - The Moral Foundations Weibo Corpus [0.0]
Moral sentiments influence both online and offline environments, shaping behavioral styles and interaction patterns.
Existing corpora, while valuable, often face linguistic limitations.
This corpus consists of 25,671 Chinese comments on Weibo, encompassing six diverse topic areas.
arXiv Detail & Related papers (2024-11-14T17:32:03Z) - Exploring and steering the moral compass of Large Language Models [55.2480439325792]
Large Language Models (LLMs) have become central to advancing automation and decision-making across various sectors.
This study proposes a comprehensive comparative analysis of the most advanced LLMs to assess their moral profiles.
arXiv Detail & Related papers (2024-05-27T16:49:22Z) - Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models [51.69735366140249]
We introduce Ethical-Lens, a framework designed to facilitate the value-aligned usage of text-to-image tools. Ethical-Lens ensures value alignment in text-to-image models across toxicity and bias dimensions. Our experiments reveal that Ethical-Lens enhances alignment capabilities to levels comparable with or superior to commercial models.
arXiv Detail & Related papers (2024-04-18T11:38:25Z) - MoralBERT: A Fine-Tuned Language Model for Capturing Moral Values in Social Discussions [4.747987317906765]
Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues.
Recent advances in Natural Language Processing (NLP) show that moral values can be gauged in human-generated textual content.
This paper introduces MoralBERT, a range of language representation models fine-tuned to capture moral sentiment in social discourse.
arXiv Detail & Related papers (2024-03-12T14:12:59Z) - What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations [48.686872351114964]
Moral or ethical judgments rely heavily on the specific contexts in which they occur.
We introduce defeasible moral reasoning: a task to provide grounded contexts that make an action more or less morally acceptable.
We distill a high-quality dataset of 1.2M entries of contextualizations and rationales for 115K defeasible moral actions.
arXiv Detail & Related papers (2023-10-24T00:51:29Z) - Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? [78.3738172874685]
Making moral judgments is an essential step toward developing ethical AI systems.
Prevalent approaches are mostly implemented in a bottom-up manner, which uses a large set of annotated data to train models based on crowd-sourced opinions about morality.
This work proposes a flexible top-down framework to steer (Large) Language Models (LMs) to perform moral reasoning with well-established moral theories from interdisciplinary research.
arXiv Detail & Related papers (2023-08-29T15:57:32Z) - Zero-shot Visual Commonsense Immorality Prediction [8.143750358586072]
One way toward moral AI systems is to imitate human prosocial behavior and encourage some form of good behavior in systems.
Here, we propose a model that predicts visual commonsense immorality in a zero-shot manner.
We evaluate our model with existing moral/immoral image datasets and show fair prediction performance consistent with human intuitions.
arXiv Detail & Related papers (2022-11-10T12:30:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.