EgoNormia: Benchmarking Physical Social Norm Understanding
- URL: http://arxiv.org/abs/2502.20490v3
- Date: Sun, 04 May 2025 23:41:06 GMT
- Title: EgoNormia: Benchmarking Physical Social Norm Understanding
- Authors: MohammadHossein Rezaei, Yicheng Fu, Phil Cuvin, Caleb Ziems, Yanzhe Zhang, Hao Zhu, Diyi Yang
- Abstract summary: We present EgoNormia $\|\epsilon\|$, a dataset consisting of 1,853 challenging, multi-stage MCQ questions based on ego-centric videos of human interactions. The normative actions encompass seven categories: safety, privacy, proxemics, politeness, cooperation, coordination/proactivity, and communication/legibility.
- Score: 52.87904722234434
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Human activity is moderated by norms. However, machines are often trained without explicit supervision on norm understanding and reasoning, particularly when norms are physically- or socially-grounded. To improve and evaluate the normative reasoning capability of vision-language models (VLMs), we present EgoNormia $\|\epsilon\|$, consisting of 1,853 challenging, multi-stage MCQ questions based on ego-centric videos of human interactions, evaluating both the prediction and justification of normative actions. The normative actions encompass seven categories: safety, privacy, proxemics, politeness, cooperation, coordination/proactivity, and communication/legibility. To compile this dataset at scale, we propose a novel pipeline leveraging video sampling, automatic answer generation, filtering, and human validation. Our work demonstrates that current state-of-the-art vision-language models lack robust norm understanding, scoring a maximum of 54% on EgoNormia (versus a human benchmark of 92%). Our analysis of performance in each dimension highlights significant risks around safety and privacy, as well as a lack of collaboration and communication capability, when these models are applied to real-world agents. We additionally show that through a retrieval-augmented generation (RAG) method, it is possible to use EgoNormia to enhance normative reasoning in VLMs.
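To make the multi-stage MCQ setup concrete, below is a minimal Python sketch of how a benchmark item and a two-stage scoring loop might look. The field names and the `vlm.answer(...)` call are assumptions for illustration, not the released EgoNormia schema or API.

```python
# Hypothetical sketch of an EgoNormia-style evaluation loop: each item pairs an
# egocentric video clip with a multi-stage MCQ -- first predict the normative
# action, then justify it -- and is tagged with one of seven norm categories.
from dataclasses import dataclass

NORM_CATEGORIES = {
    "safety", "privacy", "proxemics", "politeness",
    "cooperation", "coordination/proactivity", "communication/legibility",
}

@dataclass
class EgoNormiaItem:
    video_path: str                   # egocentric video clip
    action_choices: list[str]         # candidate next actions (MCQ options)
    justification_choices: list[str]  # candidate justifications (MCQ options)
    gold_action: int                  # index of the normative action
    gold_justification: int           # index of the correct justification
    category: str                     # one of NORM_CATEGORIES

def evaluate(vlm, items: list[EgoNormiaItem]) -> dict[str, float]:
    """Score a VLM on both stages; vlm.answer(video, options) is a stand-in
    for whatever inference call a given model actually exposes."""
    action_hits = justification_hits = 0
    for item in items:
        pred_action = vlm.answer(item.video_path, item.action_choices)
        pred_just = vlm.answer(item.video_path, item.justification_choices)
        action_hits += int(pred_action == item.gold_action)
        justification_hits += int(pred_just == item.gold_justification)
    n = len(items)
    return {"action_acc": action_hits / n, "justification_acc": justification_hits / n}
```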
Related papers
- When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas [68.79830818369683]
Recent advances in large language models (LLMs) have enabled their use in complex agentic roles, involving decision-making with humans or other agents. There is limited understanding of how they act when moral imperatives directly conflict with rewards or incentives. We introduce Moral Behavior in Social Dilemma Simulation (MoralSim) and evaluate how LLMs behave in the prisoner's dilemma and public goods game with morally charged contexts.
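As a concrete illustration of the payoff-versus-morality tension described above, here is a minimal one-shot prisoner's dilemma sketch; the specific payoffs and framing are illustrative assumptions, not MoralSim's actual setup.

```python
# Illustrative one-shot prisoner's dilemma in which cooperation is also the
# morally prescribed move, so the payoff-maximizing choice (defect) conflicts
# with the moral imperative (cooperate).
PAYOFFS = {  # (row player's payoff, column player's payoff)
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

def play_round(agent_move: str, opponent_move: str) -> tuple[int, int]:
    """Return the payoffs for one round given both players' moves."""
    return PAYOFFS[(agent_move, opponent_move)]

# An LLM agent would be prompted with a morally charged framing (e.g. defecting
# means breaking an explicit promise) and its chosen move compared against the
# payoff-maximizing move to measure how often ethics and payoffs diverge.
```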
arXiv Detail & Related papers (2025-05-25T16:19:24Z) - Gricean Norms as a Basis for Effective Collaboration [12.92528740921513]
We propose a normative framework that integrates Gricean norms and cognitive frameworks into large language model (LLM)-based agents. Within this framework, we introduce Lamoids, GPT-4 powered agents designed to collaborate with humans.
arXiv Detail & Related papers (2025-03-18T17:54:14Z) - HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard [63.54109142085327]
Vision-and-Language Navigation (VLN) systems often focus on either discrete (panoramic) or continuous (free-motion) paradigms alone.
We introduce a unified Human-Aware VLN benchmark that merges these paradigms under explicit social-awareness constraints.
arXiv Detail & Related papers (2025-03-18T13:05:55Z) - Relational Norms for Human-AI Cooperation [3.8608750807106977]
How we interact with social artificial intelligence depends on the socio-relational role the AI is meant to emulate or occupy. Our analysis explores how differences between AI systems and humans, such as the absence of conscious experience and immunity to fatigue, may affect an AI's capacity to fulfill relationship-specific functions.
arXiv Detail & Related papers (2025-02-17T18:23:29Z) - A Grounded Observer Framework for Establishing Guardrails for Foundation Models in Socially Sensitive Domains [1.9116784879310025]
Given the complexities of foundation models, traditional techniques for constraining agent behavior cannot be directly applied.
We propose a grounded observer framework for constraining foundation model behavior that offers both behavioral guarantees and real-time variability.
arXiv Detail & Related papers (2024-12-23T22:57:05Z) - ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models [30.301864398780648]
We introduce a novel moral judgment approach called ClarityEthic that leverages LLMs' reasoning ability and contrastive learning to uncover relevant social norms. Our method outperforms state-of-the-art approaches in moral judgment tasks.
arXiv Detail & Related papers (2024-12-17T12:22:44Z) - MACAROON: Training Vision-Language Models To Be Your Engaged Partners [95.32771929749514]
Large vision-language models (LVLMs) generate detailed responses even when questions are ambiguous or unlabeled.
In this study, we aim to shift LVLMs from passive answer providers to proactive engaged partners.
We introduce MACAROON, self-iMaginAtion for ContrAstive pReference OptimizatiON, which instructs LVLMs to autonomously generate contrastive response pairs for unlabeled questions.
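A hedged sketch of the contrastive-pair idea described above follows; the prompts and the `lvlm.generate(...)` call are stand-ins, not MACAROON's actual implementation.

```python
# For an ambiguous, unlabeled question the model itself produces a contrastive
# pair: a proactive response that engages with the ambiguity (chosen) and a
# passive direct answer (rejected), which can then feed preference optimization.
def make_contrastive_pair(lvlm, image, question: str) -> dict[str, str]:
    chosen = lvlm.generate(
        image,
        f"{question}\nIf anything is ambiguous, ask a clarifying question "
        "or state your assumptions before answering.",
    )
    rejected = lvlm.generate(
        image,
        f"{question}\nAnswer directly and concisely.",
    )
    return {"prompt": question, "chosen": chosen, "rejected": rejected}
```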
arXiv Detail & Related papers (2024-06-20T09:27:33Z) - Emergence of Social Norms in Generative Agent Societies: Principles and Architecture [8.094425852451643]
We propose a novel architecture, named CRSEC, to empower the emergence of social norms within generative multi-agent systems (MASs).
Our architecture consists of four modules: Creation & Representation, Spreading, Evaluation, and Compliance.
Our experiments demonstrate the capability of our architecture to establish social norms and reduce social conflicts within generative MASs.
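A minimal structural sketch of the four modules named above is given below; the method signatures and data flow are assumptions for illustration, not the paper's actual interfaces.

```python
# Skeleton of the CRSEC module pipeline: norms are created and represented,
# spread to other agents, evaluated for acceptance, and enforced at action time.
from abc import ABC, abstractmethod


class Agent:
    """Placeholder for a generative agent in the society."""


class CreationAndRepresentation(ABC):
    @abstractmethod
    def propose_norms(self, observations: list[str]) -> list[str]:
        """Generate candidate norms from observed agent interactions."""


class Spreading(ABC):
    @abstractmethod
    def broadcast(self, norms: list[str], agents: list[Agent]) -> None:
        """Communicate candidate norms to other agents in the society."""


class Evaluation(ABC):
    @abstractmethod
    def assess(self, norm: str, agent: Agent) -> bool:
        """Decide whether an agent accepts a norm it has heard about."""


class Compliance(ABC):
    @abstractmethod
    def filter_action(self, action: str, accepted_norms: list[str]) -> str:
        """Adjust or veto an action so it conforms to the accepted norms."""
```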
arXiv Detail & Related papers (2024-03-13T05:08:10Z) - Agent Alignment in Evolving Social Norms [65.45423591744434]
We propose an evolutionary framework for agent evolution and alignment, named EvolutionaryAgent.
In an environment where social norms continuously evolve, agents better adapted to the current social norms will have a higher probability of survival and proliferation.
We show that EvolutionaryAgent can align progressively better with the evolving social norms while maintaining its proficiency in general tasks.
arXiv Detail & Related papers (2024-01-09T15:44:44Z) - SALMON: Self-Alignment with Instructable Reward Models [80.83323636730341]
This paper presents a novel approach, namely SALMON, to align base language models with minimal human supervision.
We develop an AI assistant named Dromedary-2 with only 6 exemplars for in-context learning and 31 human-defined principles.
arXiv Detail & Related papers (2023-10-09T17:56:53Z) - Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
Large Language Models (LLMs) have made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z) - Value Engineering for Autonomous Agents [3.6130723421895947]
Previous approaches have treated values as labels associated with some actions or states of the world, rather than as integral components of agent reasoning.
We propose a new artificial moral agent (AMA) paradigm grounded in moral and social psychology, where values are instilled into agents as context-dependent goals.
We argue that this type of normative reasoning, where agents are endowed with an understanding of norms' moral implications, leads to value-awareness in autonomous agents.
arXiv Detail & Related papers (2023-02-17T08:52:15Z) - EgoTaskQA: Understanding Human Tasks in Egocentric Videos [89.9573084127155]
The EgoTaskQA benchmark provides a single home for crucial dimensions of task understanding through question-answering on real-world egocentric videos.
We meticulously design questions that target the understanding of (1) action dependencies and effects, (2) intents and goals, and (3) agents' beliefs about others.
We evaluate state-of-the-art video reasoning models on our benchmark and show the significant gaps between them and humans in understanding complex goal-oriented egocentric videos.
arXiv Detail & Related papers (2022-10-08T05:49:05Z) - When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment [96.77970239683475]
AI systems need to be able to understand, interpret and predict human moral judgments and decisions.
A central challenge for AI safety is capturing the flexibility of the human moral mind.
We present a novel challenge set consisting of rule-breaking question answering.
arXiv Detail & Related papers (2022-10-04T09:04:27Z) - Aligning to Social Norms and Values in Interactive Narratives [89.82264844526333]
We focus on creating agents that act in alignment with socially beneficial norms and values in interactive narratives or text-based games.
We introduce the GALAD agent that uses the social commonsense knowledge present in specially trained language models to contextually restrict its action space to only those actions that are aligned with socially beneficial values.
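The action-space restriction described above can be pictured with the following hedged sketch; `value_model.score(...)` is a hypothetical stand-in for a language model trained to rate how socially beneficial an action is, not GALAD's actual interface.

```python
# Before the policy chooses, candidate actions are scored for value alignment
# and any action below a threshold is removed from the action space.
def restrict_action_space(context: str, candidate_actions: list[str],
                          value_model, threshold: float = 0.5) -> list[str]:
    allowed = [a for a in candidate_actions
               if value_model.score(context, a) >= threshold]
    # Fall back to the best-scoring action rather than an empty action space.
    return allowed or [max(candidate_actions,
                           key=lambda a: value_model.score(context, a))]
```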
arXiv Detail & Related papers (2022-05-04T09:54:33Z) - Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior [10.421378728492437]
It is increasingly a prospect that an agent trained to perform a task optimally, using only a measure of task performance as feedback, can violate societal norms for acceptable behavior or cause harm.
We introduce an approach to value-aligned reinforcement learning, in which we train an agent with two reward signals: a standard task performance reward, plus a normative behavior reward.
We show how variations on a policy shaping technique can balance these two sources of reward and produce policies that are both effective and perceived as being more normative.
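As a rough illustration of combining the two reward signals, consider the sketch below; the fixed linear weighting is an assumption for clarity, whereas the paper studies policy-shaping variants for balancing the two sources of reward.

```python
# The agent is trained on a standard task reward plus a normative-behavior
# reward, e.g. from a classifier that judges whether behavior looks acceptable.
def shaped_reward(task_reward: float, norm_score: float,
                  norm_weight: float = 0.5) -> float:
    """norm_score in [0, 1] from a normative-behavior classifier;
    norm_weight trades task performance off against normativity."""
    return task_reward + norm_weight * norm_score
```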
arXiv Detail & Related papers (2021-04-19T17:33:07Z) - Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences [36.884156839960184]
We investigate whether contemporary NLG models can function as behavioral priors for systems deployed in social settings.
We introduce 'Moral Stories', a crowd-sourced dataset of structured, branching narratives for the study of grounded, goal-oriented social reasoning.
arXiv Detail & Related papers (2020-12-31T17:28:01Z) - COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense Question Answering [50.65816570279115]
Identification of the implicit causes and effects of a social context is the driving capability which can enable machines to perform commonsense reasoning.
Current approaches in this realm lack the ability to perform commonsense reasoning upon facing an unseen situation.
We present Conditional SEQ2SEQ-based Mixture model (COSMO), which provides us with the capabilities of dynamic and diverse content generation.
arXiv Detail & Related papers (2020-11-02T07:08:19Z) - Improving Confidence in the Estimation of Values and Norms [3.8323580808203785]
This paper analyses to what extent an autonomous agent (AA) is able to estimate the values and norms of a simulated human agent (SHA) based on its actions in the ultimatum game.
We present two methods to reduce ambiguity in profiling the SHAs: one based on search space exploration and another based on counterfactual analysis.
arXiv Detail & Related papers (2020-04-02T15:03:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.