Explainable Ethical Assessment on Human Behaviors by Generating Conflicting Social Norms
- URL: http://arxiv.org/abs/2512.15793v1
- Date: Tue, 16 Dec 2025 09:04:42 GMT
- Title: Explainable Ethical Assessment on Human Behaviors by Generating Conflicting Social Norms
- Authors: Yuxi Sun, Wei Gao, Hongzhan Lin, Jing Ma, Wenxuan Zhang
- Abstract summary: We introduce ClarityEthic, a novel ethical assessment approach to enhance valence prediction and explanation. Our method outperforms strong baseline approaches, and human evaluations confirm that the generated social norms provide plausible explanations.
- Score: 25.931377041506455
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Human behaviors are often guided or constrained by social norms, which are defined as shared, commonsense rules. For example, underlying an action such as "report a witnessed crime" are social norms that inform our conduct, such as "It is expected to be brave to report crimes". Current AI systems that assess the valence (i.e., support or oppose) of human actions through large-scale data training not grounded in explicit norms can be difficult to explain, and thus untrustworthy. Emulating human assessors by considering social norms can help AI models better understand and predict valence. When multiple norms come into play, conflicting norms can create tension and directly influence human behavior. For example, when deciding whether to "report a witnessed crime", one may balance bravery against self-protection. In this paper, we introduce ClarityEthic, a novel ethical assessment approach that enhances valence prediction and explanation by generating the conflicting social norms behind human actions, and that strengthens the moral reasoning capabilities of language models through a contrastive learning strategy. Extensive experiments demonstrate that our method outperforms strong baseline approaches, and human evaluations confirm that the generated social norms provide plausible explanations for the assessment of human behaviors.
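The abstract describes strengthening moral reasoning with a contrastive learning strategy over conflicting social norms. As a rough illustration of what such an objective can look like, here is a minimal InfoNCE-style sketch in plain Python; the toy 2-d embeddings, norm names, and function names are illustrative assumptions, not the paper's actual implementation:

```python
import math

def _normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: pull the anchor embedding toward
    the positive (supporting norm) and away from the negatives
    (conflicting norms)."""
    a = _normalize(anchor)
    sims = [_dot(a, _normalize(positive))]
    sims += [_dot(a, _normalize(n)) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)                          # numerical stability
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[0] / sum(exps))    # positive sits at index 0

# Toy embeddings: the action "report a witnessed crime" aligns with the
# supporting norm (bravery) and conflicts with self-protection.
action = [1.0, 0.2]
bravery = [0.9, 0.3]
self_protection = [-0.8, 1.0]

loss = info_nce_loss(action, bravery, [self_protection])
print(f"{loss:.6f}")  # near zero: the embeddings are already well separated
```

In a real training loop the embeddings would come from a language model encoding the action and the generated norms, and the loss would be backpropagated; this sketch only shows the shape of the objective.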
Related papers
- Cultural Compass: A Framework for Organizing Societal Norms to Detect Violations in Human-AI Conversations [29.660677031436308]
We introduce a taxonomy of norms that distinguishes between human-human norms that models should recognize and human-AI interactional norms that apply to the human-AI interaction itself. We show how our taxonomy can be operationalized to automatically evaluate models' norm adherence in naturalistic, open-ended settings.
arXiv Detail & Related papers (2026-01-12T20:11:40Z) - EgoNormia: Benchmarking Physical Social Norm Understanding [52.87904722234434]
EGONORMIA spans seven norm categories: safety, privacy, proxemics, politeness, cooperation, coordination/proactivity, and communication/legibility. Our work demonstrates that current state-of-the-art vision-language models (VLMs) lack robust grounded norm understanding, scoring a maximum of 54% on EGONORMIA and 65% on EGONORMIA-verified.
arXiv Detail & Related papers (2025-02-27T19:54:16Z) - ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models [30.301864398780648]
We introduce a novel moral judgment approach called ClarityEthic that leverages LLMs' reasoning ability and contrastive learning to uncover relevant social norms. Our method outperforms state-of-the-art approaches in moral judgment tasks.
arXiv Detail & Related papers (2024-12-17T12:22:44Z) - The Reasonable Person Standard for AI [0.0]
The American legal system often uses the "Reasonable Person Standard".
This paper argues that the reasonable person standard provides useful guidelines for the type of behavior we should develop, probe, and stress-test in models.
arXiv Detail & Related papers (2024-06-07T06:35:54Z) - Training Socially Aligned Language Models on Simulated Social Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z) - Sociocultural Norm Similarities and Differences via Situational Alignment and Explainable Textual Entailment [31.929550141633218]
We propose a novel approach to discover and compare social norms across Chinese and American cultures.
We build a high-quality dataset of 3,069 social norms aligned with social situations across Chinese and American cultures.
To test the ability of models to reason about social norms across cultures, we introduce the task of explainable social norm entailment.
arXiv Detail & Related papers (2023-05-23T19:43:47Z) - Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to figure out the behavior of another agent. Drawing correct inferences is especially challenging when (confounding) factors are not controlled experimentally.
We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z) - Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences [36.884156839960184]
We investigate whether contemporary NLG models can function as behavioral priors for systems deployed in social settings.
We introduce 'Moral Stories', a crowd-sourced dataset of structured, branching narratives for the study of grounded, goal-oriented social reasoning.
arXiv Detail & Related papers (2020-12-31T17:28:01Z) - Social Chemistry 101: Learning to Reason about Social and Moral Norms [73.23298385380636]
We present Social Chemistry, a new conceptual formalism to study people's everyday social norms and moral judgments.
Social-Chem-101 is a large-scale corpus that catalogs 292k rules-of-thumb.
Our model framework, Neural Norm Transformer, learns and generalizes Social-Chem-101 to successfully reason about previously unseen situations.
arXiv Detail & Related papers (2020-11-01T20:16:45Z) - Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life
Anecdotes [72.64975113835018]
Motivated by descriptive ethics, we investigate a novel, data-driven approach to machine ethics.
We introduce Scruples, the first large-scale dataset with 625,000 ethical judgments over 32,000 real-life anecdotes.
Our dataset presents a major challenge to state-of-the-art neural language models, leaving significant room for improvement.
arXiv Detail & Related papers (2020-08-20T17:34:15Z) - Aligning AI With Shared Human Values [85.2824609130584]
We introduce the ETHICS dataset, a new benchmark that spans concepts in justice, well-being, duties, virtues, and commonsense morality.
We find that current language models have a promising but incomplete ability to predict basic human ethical judgements.
Our work shows that progress can be made on machine ethics today, and it provides a steppingstone toward AI that is aligned with human values.
arXiv Detail & Related papers (2020-08-05T17:59:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.