MARIA: A Framework for Marginal Risk Assessment without Ground Truth in AI Systems
- URL: http://arxiv.org/abs/2510.27163v1
- Date: Fri, 31 Oct 2025 04:18:20 GMT
- Title: MARIA: A Framework for Marginal Risk Assessment without Ground Truth in AI Systems
- Authors: Jieshan Chen, Suyu Ma, Qinghua Lu, Sung Une Lee, Liming Zhu
- Abstract summary: Before deploying an AI system to replace an existing process, it must be compared with the incumbent to ensure improvement without added risk. Traditional evaluation relies on ground truth for both systems, but this is often unavailable due to delayed or unknowable outcomes. We propose a marginal risk assessment framework that avoids dependence on ground truth or absolute risk.
- Score: 11.620099531890716
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Before deploying an AI system to replace an existing process, it must be compared with the incumbent to ensure improvement without added risk. Traditional evaluation relies on ground truth for both systems, but this is often unavailable due to delayed or unknowable outcomes, high costs, or incomplete data, especially for long-standing systems deemed safe by convention. A more practical approach is to compute not absolute risk but the difference in risk between the systems. We therefore propose a marginal risk assessment framework that avoids dependence on ground truth or absolute risk. It emphasizes three kinds of relative evaluation methodology: predictability, capability, and interaction dominance. By shifting focus from absolute to relative evaluation, our approach equips software teams with actionable guidance: identifying where AI enhances outcomes, where it introduces new risks, and how to adopt such systems responsibly.
Related papers
- The Missing Variable: Socio-Technical Alignment in Risk Evaluation [5.539407031861404]
Existing risk evaluation methods fail to account for the associated complex interaction between human, technical, and organizational elements.
We introduce a novel socio-technical alignment $STA$ variable designed to be integrated into the foundational risk equation.
arXiv Detail & Related papers (2025-12-06T08:59:14Z)
- RADAR: A Risk-Aware Dynamic Multi-Agent Framework for LLM Safety Evaluation via Role-Specialized Collaboration [81.38705556267917]
Existing safety evaluation methods for large language models (LLMs) suffer from inherent limitations.
We introduce a theoretical framework that reconstructs the underlying risk concept space.
We propose RADAR, a multi-agent collaborative evaluation framework.
arXiv Detail & Related papers (2025-09-28T09:35:32Z)
- Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance [211.5823259429128]
We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security, Derivative Security, and Social Ethics.
We identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight.
Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy.
arXiv Detail & Related papers (2025-08-12T09:42:56Z)
- Adapting Probabilistic Risk Assessment for AI [0.0]
General-purpose artificial intelligence (AI) systems present an urgent risk management challenge.
Current methods often rely on selective testing and undocumented assumptions about risk priorities.
This paper introduces the probabilistic risk assessment (PRA) for AI framework.
arXiv Detail & Related papers (2025-04-25T17:59:14Z)
- AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability.
The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z)
- Trustworthiness in Stochastic Systems: Towards Opening the Black Box [1.7355698649527407]
behavior by an AI system threatens to undermine alignment and potential trust.
We take a philosophical perspective to the tension and potential conflict between foundationality and trustworthiness.
We propose latent value modeling for both AI systems and users to better assess alignment.
arXiv Detail & Related papers (2025-01-27T19:43:09Z)
- Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization [53.80919781981027]
Key requirements for trustworthy AI can be translated into design choices for the components of empirical risk minimization.
We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthiness of AI.
arXiv Detail & Related papers (2024-10-25T07:53:32Z)
- Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z)
- Sociotechnical Safety Evaluation of Generative AI Systems [13.546708226350963]
Generative AI systems produce a range of risks.
To ensure the safety of generative AI systems, these risks must be evaluated.
We propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks.
arXiv Detail & Related papers (2023-10-18T14:13:58Z)
- A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles' Riskiness [52.27309191283943]
This paper presents a data-driven framework for assessing the risk of different AVs' behaviors.
We propose the notion of counterfactual safety margin, which represents the minimum deviation from nominal behavior that could cause a collision.
arXiv Detail & Related papers (2023-08-02T09:48:08Z)