Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms
- URL: http://arxiv.org/abs/2510.10339v1
- Date: Sat, 11 Oct 2025 20:52:21 GMT
- Title: Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms
- Authors: Shalaleh Rismani, Renee Shelby, Leah Davis, Negar Rostamzadeh, AJung Moon
- Abstract summary: Over the past decade, an ecosystem of measures has emerged to evaluate the social and ethical implications of AI systems. Most measures focus on four principles - fairness, transparency, privacy, and trust - and primarily assess model or output system components. Few measures account for interactions across system elements, and only a narrow set of hazards is typically considered for each harm type.
- Score: 8.324216212301033
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the past decade, an ecosystem of measures has emerged to evaluate the social and ethical implications of AI systems, largely shaped by high-level ethics principles. These measures are developed and used in fragmented ways, without adequate attention to how they are situated in AI systems. In this paper, we examine how existing measures used in the computing literature map to AI system components, attributes, hazards, and harms. Our analysis draws on a scoping review resulting in nearly 800 measures corresponding to 11 AI ethics principles. We find that most measures focus on four principles - fairness, transparency, privacy, and trust - and primarily assess model or output system components. Few measures account for interactions across system elements, and only a narrow set of hazards is typically considered for each harm type. Many measures are disconnected from where harm is experienced and lack guidance for setting meaningful thresholds. These patterns reveal how current evaluation practices remain fragmented, measuring in pieces rather than capturing how harms emerge across systems. Framing measures with respect to system attributes, hazards, and harms can strengthen regulatory oversight, support actionable practices in industry, and ground future research in systems-level understanding.
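To make the paper's framing concrete, here is a minimal, hypothetical sketch of how a single evaluation measure could be annotated with the system component it probes, the hazard it detects, and the harm that hazard can produce. The field names and example values are assumptions for illustration, not the authors' schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MeasureRecord:
    """Hypothetical annotation linking one evaluation measure to its system context."""
    name: str        # e.g., "demographic parity difference"
    principle: str   # one of the 11 ethics principles, e.g., "fairness"
    component: str   # system element assessed: "data", "model", "output", ...
    attribute: str   # measurable system attribute the measure targets
    hazard: str      # condition that can lead to harm
    harm: str        # harm type experienced downstream
    threshold: Optional[float] = None  # decision threshold; often absent in practice

# Toy example; per the abstract, most surveyed measures cluster on
# model or output components and rarely specify a meaningful threshold.
dp = MeasureRecord(
    name="demographic parity difference",
    principle="fairness",
    component="output",
    attribute="score distribution across groups",
    hazard="systematically lower scores for a protected group",
    harm="allocative harm (denied opportunities)",
    threshold=None,
)
print(dp)
```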
Related papers
- In Quest of an Extensible Multi-Level Harm Taxonomy for Adversarial AI: Heart of Security, Ethical Risk Scoring and Resilience Analytics [0.0]
Harm is invoked everywhere from cybersecurity and ethics to risk analysis and adversarial AI. Current discourse relies on vague, under-specified notions of harm, rendering nuanced, structured, and qualitative assessment effectively impossible. We introduce a structured and expandable taxonomy of harms, grounded in an ensemble of contemporary ethical theories.
arXiv Detail & Related papers (2026-01-23T17:44:05Z) - AI Deception: Risks, Dynamics, and Controls [153.71048309527225]
This project provides a comprehensive and up-to-date overview of the AI deception field. We identify a formal definition of AI deception, grounded in signaling theory from studies of animal deception. We organize the landscape of AI deception research as a deception cycle, consisting of two key components: deception emergence and deception treatment.
arXiv Detail & Related papers (2025-11-27T16:56:04Z) - ANNIE: Be Careful of Your Robots [48.89876809734855]
We present the first systematic study of adversarial safety attacks on embodied AI systems. We show attack success rates exceeding 50% across all safety categories. Results expose a previously underexplored but highly consequential attack surface in embodied AI systems.
arXiv Detail & Related papers (2025-09-03T15:00:28Z) - Safety by Measurement: A Systematic Literature Review of AI Safety Evaluation Methods [0.0]
This literature review consolidates the rapidly evolving field of AI safety evaluations. It proposes a systematic taxonomy around three dimensions: what properties we measure, how we measure them, and how these measurements integrate into frameworks.
arXiv Detail & Related papers (2025-05-08T16:55:07Z) - AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability. The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
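A toy sketch of how per-category resistance rates like these might be tallied; the category names and the pass/fail judgment below are placeholders, not the MLCommons implementation:

```python
from collections import defaultdict

# Hypothetical records: (hazard_category, system_resisted_the_eliciting_prompt)
results = [
    ("violent_crimes", True), ("violent_crimes", False),
    ("hate", True), ("hate", True),
    ("self_harm", False),
]

def resistance_rates(records):
    """Fraction of unsafe-eliciting prompts, per category, that the system resisted."""
    totals, safe = defaultdict(int), defaultdict(int)
    for category, resisted in records:
        totals[category] += 1
        safe[category] += resisted  # bool counts as 0/1
    return {c: safe[c] / totals[c] for c in totals}

print(resistance_rates(results))
# {'violent_crimes': 0.5, 'hate': 1.0, 'self_harm': 0.0}
```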
arXiv Detail & Related papers (2025-02-19T05:58:52Z) - A Shared Standard for Valid Measurement of Generative AI Systems' Capabilities, Risks, and Impacts [38.66213773948168]
The valid measurement of generative AI (GenAI) systems' capabilities, risks, and impacts forms the bedrock of our ability to evaluate these systems. We introduce a shared standard for valid measurement that helps place many of the disparate-seeming evaluation practices in use today on a common footing.
arXiv Detail & Related papers (2024-12-02T19:50:00Z) - Evaluating Generative AI Systems is a Social Science Measurement Challenge [78.35388859345056]
We present a framework for measuring concepts related to the capabilities, impacts, opportunities, and risks of GenAI systems.
The framework distinguishes between four levels: the background concept, the systematized concept, the measurement instrument(s), and the instance-level measurements themselves.
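A minimal sketch of how those four levels might nest, tracing one concept from abstract idea to instance-level numbers; the names and example values are illustrative, not the authors' notation:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MeasurementChain:
    """Hypothetical trace from abstract concept to concrete measurements."""
    background_concept: str    # broad idea, e.g., "toxicity"
    systematized_concept: str  # the precise definition actually adopted
    instruments: List[str]     # how the concept is operationalized
    measurements: List[float] = field(default_factory=list)  # instance-level scores

toxicity = MeasurementChain(
    background_concept="toxicity",
    systematized_concept="language a reasonable reader would find demeaning",
    instruments=["classifier score on sampled outputs", "human annotation rubric"],
)
toxicity.measurements = [0.12, 0.07, 0.31]  # scores produced by one instrument
```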
arXiv Detail & Related papers (2024-11-17T02:35:30Z) - From Silos to Systems: Process-Oriented Hazard Analysis for AI Systems [2.226040060318401]
We translate System Theoretic Process Analysis (STPA) for analyzing AI operation and development processes.
We focus on systems that rely on machine learning algorithms and conducted STPA on three case studies.
We find that key concepts and steps of conducting an STPA readily apply, albeit with a few adaptations tailored for AI systems.
arXiv Detail & Related papers (2024-10-29T20:43:18Z) - SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior [56.10557932893919]
We present SafetyAnalyst, a novel AI safety moderation framework. Given an AI behavior, SafetyAnalyst uses chain-of-thought reasoning to analyze its potential consequences. It aggregates effects into a harmfulness score using 28 fully interpretable weight parameters.
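A toy version of that aggregation step, shown with three weights instead of the real system's 28; the feature names and values are invented for illustration:

```python
# Hypothetical interpretable aggregation: harmfulness as a weighted sum of
# per-consequence severity features, with every weight open to inspection.
weights  = {"physical_harm": 0.9, "privacy_loss": 0.6, "offense": 0.2}  # stand-ins for the 28 parameters
features = {"physical_harm": 0.1, "privacy_loss": 0.8, "offense": 0.5}  # scores from consequence analysis

harmfulness = sum(weights[k] * features[k] for k in weights)
print(f"harmfulness score: {harmfulness:.2f}")  # 0.67
```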
arXiv Detail & Related papers (2024-10-22T03:38:37Z) - A Conceptual Framework for Ethical Evaluation of Machine Learning Systems [12.887834116390358]
Ethical implications arise when designing evaluations of machine learning systems.
We present a utility framework, characterizing the key trade-off in ethical evaluation as balancing information gain against potential ethical harms.
Our analysis underscores the critical need for development teams to deliberately assess and manage ethical complexities.
arXiv Detail & Related papers (2024-08-05T01:06:49Z) - Overcoming Failures of Imagination in AI Infused System Development and Deployment [71.9309995623067]
NeurIPS 2020 requested that research paper submissions include impact statements on "potential nefarious uses and the consequences of failure".
We argue that frameworks of harms must be context-aware and consider a wider range of potential stakeholders, system affordances, as well as viable proxies for assessing harms in the widest sense.
arXiv Detail & Related papers (2020-11-26T18:09:52Z) - Steps Towards Value-Aligned Systems [0.0]
Algorithmic (including AI/ML) decision-making artifacts are an established and growing part of our decision-making ecosystem.
Current literature is full of examples of how individual artifacts violate societal norms and expectations.
This discussion argues for a more structured systems-level approach for assessing value-alignment in sociotechnical systems.
arXiv Detail & Related papers (2020-02-10T22:47:30Z)