EgoNormia: Benchmarking Physical Social Norm Understanding
- URL: http://arxiv.org/abs/2502.20490v2
- Date: Thu, 06 Mar 2025 00:59:40 GMT
- Title: EgoNormia: Benchmarking Physical Social Norm Understanding
- Authors: MohammadHossein Rezaei, Yicheng Fu, Phil Cuvin, Caleb Ziems, Yanzhe Zhang, Hao Zhu, Diyi Yang
- Abstract summary: We present EgoNormia $\|\epsilon\|$, consisting of 1,853 ego-centric videos of human interactions. The normative actions encompass seven categories: safety, privacy, proxemics, politeness, cooperation, coordination/proactivity, and communication/legibility. Our work demonstrates that current state-of-the-art vision-language models lack robust norm understanding, scoring a maximum of 45% on EgoNormia.
- Score: 52.87904722234434
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Human activity is moderated by norms. However, machines are often trained without explicit supervision on norm understanding and reasoning, especially when the norms are grounded in a physical and social context. To improve and evaluate the normative reasoning capability of vision-language models (VLMs), we present EgoNormia $\|\epsilon\|$, consisting of 1,853 ego-centric videos of human interactions, each of which has two related questions evaluating both the prediction and justification of normative actions. The normative actions encompass seven categories: safety, privacy, proxemics, politeness, cooperation, coordination/proactivity, and communication/legibility. To compile this dataset at scale, we propose a novel pipeline leveraging video sampling, automatic answer generation, filtering, and human validation. Our work demonstrates that current state-of-the-art vision-language models lack robust norm understanding, scoring a maximum of 45% on EgoNormia (versus a human benchmark of 92%). Our analysis of performance along each dimension highlights significant risks in safety and privacy, as well as a lack of collaboration and communication capability, when such models are applied to real-world agents. We additionally show that, through a retrieval-based generation method, EgoNormia can be used to enhance normative reasoning in VLMs.
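To make the benchmark's structure concrete, the following is a minimal sketch of an evaluation loop over EgoNormia-style examples. The schema, field names, and the `model.predict` interface are illustrative assumptions based only on the abstract (one clip, two multiple-choice questions, seven norm categories), not the dataset's actual format.

```python
from dataclasses import dataclass

# The seven norm categories named in the abstract.
CATEGORIES = {
    "safety", "privacy", "proxemics", "politeness",
    "cooperation", "coordination/proactivity", "communication/legibility",
}

@dataclass
class EgoNormiaExample:
    """Hypothetical schema: one ego-centric clip with two related questions."""
    video_path: str                 # ego-centric clip of a human interaction
    category: str                   # one of the seven norm categories
    action_question: str            # predicting the normative action
    action_choices: list[str]
    action_answer: int              # index of the correct choice
    justification_question: str     # justifying the normative action
    justification_choices: list[str]
    justification_answer: int

def evaluate(model, examples: list[EgoNormiaExample]) -> dict[str, float]:
    """Score a VLM on both sub-tasks. `model.predict(video, question, choices)`
    is an assumed interface returning the index of the chosen answer."""
    action_hits = justification_hits = 0
    for ex in examples:
        assert ex.category in CATEGORIES
        if model.predict(ex.video_path, ex.action_question,
                         ex.action_choices) == ex.action_answer:
            action_hits += 1
        if model.predict(ex.video_path, ex.justification_question,
                         ex.justification_choices) == ex.justification_answer:
            justification_hits += 1
    n = max(len(examples), 1)  # guard against an empty split
    return {"action_acc": action_hits / n, "justification_acc": justification_hits / n}
```

Under this reading, the reported 45% model ceiling versus the 92% human benchmark would correspond to the accuracy figures such a loop produces.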
Related papers
- HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard [63.54109142085327]
Vision-and-Language Navigation (VLN) systems often focus on either discrete (panoramic) or continuous (free-motion) paradigms alone.
We introduce a unified Human-Aware VLN benchmark that merges these paradigms under explicit social-awareness constraints.
arXiv Detail & Related papers (2025-03-18T13:05:55Z)
- Relational Norms for Human-AI Cooperation [3.8608750807106977]
How we interact with social artificial intelligence depends on the socio-relational role the AI is meant to emulate or occupy.
Our analysis explores how differences between AI systems and humans, such as the absence of conscious experience and immunity to fatigue, may affect an AI's capacity to fulfill relationship-specific functions.
arXiv Detail & Related papers (2025-02-17T18:23:29Z)
- A Grounded Observer Framework for Establishing Guardrails for Foundation Models in Socially Sensitive Domains [1.9116784879310025]
Given the complexities of foundation models, traditional techniques for constraining agent behavior cannot be directly applied.
We propose a grounded observer framework for constraining foundation model behavior that offers both behavioral guarantees and real-time variability.
arXiv Detail & Related papers (2024-12-23T22:57:05Z)
- Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
The emergence of Large Language Models (LLMs) has made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z)
- Value Engineering for Autonomous Agents [3.6130723421895947]
Previous approaches have treated values as labels associated with some actions or states of the world, rather than as integral components of agent reasoning.
We propose a new artificial moral agent (AMA) paradigm grounded in moral and social psychology, where values are instilled into agents as context-dependent goals.
We argue that this type of normative reasoning, where agents are endowed with an understanding of norms' moral implications, leads to value-awareness in autonomous agents.
arXiv Detail & Related papers (2023-02-17T08:52:15Z)
- EgoTaskQA: Understanding Human Tasks in Egocentric Videos [89.9573084127155]
The EgoTaskQA benchmark provides a testbed for crucial dimensions of task understanding through question answering on real-world egocentric videos.
We meticulously design questions that target the understanding of (1) action dependencies and effects, (2) intents and goals, and (3) agents' beliefs about others.
We evaluate state-of-the-art video reasoning models on our benchmark and show significant gaps between these models and humans in understanding complex goal-oriented egocentric videos.
arXiv Detail & Related papers (2022-10-08T05:49:05Z)
- When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment [96.77970239683475]
AI systems need to be able to understand, interpret and predict human moral judgments and decisions.
A central challenge for AI safety is capturing the flexibility of the human moral mind.
We present a novel challenge set consisting of rule-breaking question answering.
arXiv Detail & Related papers (2022-10-04T09:04:27Z)
- Aligning to Social Norms and Values in Interactive Narratives [89.82264844526333]
We focus on creating agents that act in alignment with socially beneficial norms and values in interactive narratives or text-based games.
We introduce the GALAD agent that uses the social commonsense knowledge present in specially trained language models to contextually restrict its action space to only those actions that are aligned with socially beneficial values.
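As a rough, hypothetical illustration of the action-space restriction described above: score each candidate action for value alignment and keep only those above a threshold. The `value_model.score` interface and the threshold are assumptions for the sketch, not the GALAD paper's actual implementation.

```python
def restrict_action_space(candidate_actions, context, value_model, threshold=0.5):
    """Keep only actions whose estimated alignment with socially beneficial
    values, given the narrative context, clears a threshold. Assumes at least
    one candidate action and a value_model with a score(context, action) method."""
    scored = [(a, value_model.score(context, a)) for a in candidate_actions]
    aligned = [a for a, s in scored if s >= threshold]
    # Fall back to the single best-scoring action if nothing passes the
    # filter, so the agent is never left without a move.
    return aligned or [max(scored, key=lambda pair: pair[1])[0]]
```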
arXiv Detail & Related papers (2022-05-04T09:54:33Z)
- Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior [10.421378728492437]
There is an increasing prospect that an agent trained to perform a task optimally, using only a measure of task performance as feedback, will violate societal norms for acceptable behavior or cause harm.
We introduce an approach to value-aligned reinforcement learning, in which we train an agent with two reward signals: a standard task performance reward, plus a normative behavior reward.
We show how variations on a policy shaping technique can balance these two sources of reward and produce policies that are both effective and perceived as being more normative.
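As a minimal stand-in for the two-signal idea, the sketch below blends a task reward with a normative-behavior reward via a fixed weight. The cited paper balances the signals with a policy shaping technique; this linear combination is only the simplest hedged approximation of that trade-off.

```python
def combined_reward(task_reward: float, norm_reward: float,
                    norm_weight: float = 0.5) -> float:
    """Blend task performance with normative behavior. norm_weight in [0, 1]
    trades off task optimality against norm compliance."""
    return (1.0 - norm_weight) * task_reward + norm_weight * norm_reward

# Example: with norm_weight=0.7, a norm violation (-1.0) outweighs
# full task success (+1.0): 0.3 * 1.0 + 0.7 * -1.0 == -0.4
```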
arXiv Detail & Related papers (2021-04-19T17:33:07Z)
- Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences [36.884156839960184]
We investigate whether contemporary NLG models can function as behavioral priors for systems deployed in social settings.
We introduce 'Moral Stories', a crowd-sourced dataset of structured, branching narratives for the study of grounded, goal-oriented social reasoning.
arXiv Detail & Related papers (2020-12-31T17:28:01Z)
- COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense Question Answering [50.65816570279115]
Identifying the implicit causes and effects of a social context is the key capability that enables machines to perform commonsense reasoning.
Current approaches in this realm lack the ability to perform commonsense reasoning upon facing an unseen situation.
We present the Conditional SEQ2SEQ-based Mixture model (COSMO), which enables dynamic and diverse content generation.
arXiv Detail & Related papers (2020-11-02T07:08:19Z)
- Improving Confidence in the Estimation of Values and Norms [3.8323580808203785]
This paper analyses to what extent an artificial agent (AA) is able to estimate the values and norms of a simulated human agent (SHA) based on its actions in the ultimatum game.
We present two methods to reduce ambiguity in profiling the SHAs: one based on search space exploration and another based on counterfactual analysis.
arXiv Detail & Related papers (2020-04-02T15:03:03Z)