Lost in Vagueness: Towards Context-Sensitive Standards for Robustness Assessment under the EU AI Act
- URL: http://arxiv.org/abs/2511.15620v1
- Date: Wed, 19 Nov 2025 17:06:36 GMT
- Title: Lost in Vagueness: Towards Context-Sensitive Standards for Robustness Assessment under the EU AI Act
- Authors: Roberta Tamponi, Carina Prunkl, Thomas Bäck, Anna V. Kononova
- Abstract summary: Robustness is a key requirement for high-risk AI systems under the EU Artificial Intelligence Act (AI Act). This paper investigates what it means for AI systems to be robust and illustrates the need for context-sensitive standardisation.
- Score: 2.740981829798319
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness is a key requirement for high-risk AI systems under the EU Artificial Intelligence Act (AI Act). However, both its definition and assessment methods remain underspecified, leaving providers with little concrete direction on how to demonstrate compliance. This stems from the Act's horizontal approach, which establishes general obligations applicable across all AI systems, but leaves the task of providing technical guidance to harmonised standards. This paper investigates what it means for AI systems to be robust and illustrates the need for context-sensitive standardisation. We argue that robustness is not a fixed property of a system, but depends on which aspects of performance are expected to remain stable ("robustness of what"), the perturbations the system must withstand ("robustness to what") and the operational environment. We identify three contextual drivers--use case, data and model--that shape the relevant perturbations and influence the choice of tests, metrics and benchmarks used to evaluate robustness. The need to provide at least a range of technical options that providers can assess and implement in light of the system's purpose is explicitly recognised by the standardisation request for the AI Act, but planned standards, still focused on horizontal coverage, do not yet offer this level of detail. Building on this, we propose a context-sensitive multi-layered standardisation framework where horizontal standards set common principles and terminology, while domain-specific ones identify risks across the AI lifecycle and guide appropriate practices, organised in a dynamic repository where providers can propose new informative methods and share lessons learned. Such a system reduces the interpretative burden, mitigates arbitrariness and addresses the obsolescence of static standards, ensuring that robustness assessment is both adaptable and operationally meaningful.
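Read operationally, the abstract's distinction suggests a simple evaluation pattern. Below is a minimal, hypothetical sketch (not taken from the paper) in which "robustness of what" is the metric to be held stable, "robustness to what" is a set of perturbations, and context enters through which perturbations and tolerances a provider selects; all names and the degradation measure are illustrative assumptions.

```python
from typing import Callable, Dict, Iterable, Tuple

def robustness_report(
    predict: Callable,                      # the system under test
    data: Iterable[Tuple[object, object]],  # (input, label) pairs
    metric: Callable,                       # "robustness of what", e.g. accuracy
    perturbations: Dict[str, Callable],     # "robustness to what", context-chosen
) -> Dict[str, float]:
    """Measure how much `metric` degrades under each named perturbation.
    The degradation measure (metric drop vs. clean data) is an assumption."""
    data = list(data)
    baseline = metric(predict, data)
    report = {}
    for name, perturb in perturbations.items():
        perturbed = [(perturb(x), y) for x, y in data]
        report[name] = baseline - metric(predict, perturbed)  # drop vs. clean
    return report

# Toy usage: a threshold "model", an accuracy metric, an additive-noise shift.
acc = lambda f, d: sum(f(x) == y for x, y in d) / len(d)
model = lambda x: x > 0.5
data = [(0.2, False), (0.9, True), (0.6, True)]
print(robustness_report(model, data, acc, {"noise+0.4": lambda x: x + 0.4}))
```

In this framing, the contextual drivers the paper identifies (use case, data, model) determine which `perturbations` and which `metric` are appropriate, and how large a drop is acceptable.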
Related papers
- Standards for trustworthy AI in the European Union: technical rationale, structural challenges, and an implementation path [0.0]
This white paper examines the technical foundations of European AI standardization under the AI Act. It explains how harmonized standards enable the presumption of conformity mechanism, describes the CEN/CENELEC standardization process, and analyzes why AI poses unique challenges.
arXiv Detail & Related papers (2026-01-21T11:58:47Z)
- Frontier AI Auditing: Toward Rigorous Third-Party Assessment of Safety and Security Practices at Leading AI Companies [57.521647436515785]
We define frontier AI auditing as rigorous third-party verification of frontier AI developers' safety and security claims. We introduce AI Assurance Levels (AAL-1 to AAL-4), ranging from time-bounded system audits to continuous, deception-resilient verification.
arXiv Detail & Related papers (2026-01-16T18:44:09Z)
- Variance-Bounded Evaluation of Entity-Centric AI Systems Without Ground Truth: Theory and Measurement [0.0]
We introduce VB-Score, a variance-bounded evaluation framework for entity-centric AI systems. VB-Score enumerates plausible interpretations through constraint relaxation and Monte Carlo sampling. It then evaluates system outputs by their expected success across interpretations, penalized by variance, to assess the robustness of the system.
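The summary states the mechanism only at a high level. A minimal sketch of a variance-penalized expected-success score follows, assuming a penalty of the form mean − λ·std; the exact VB-Score formula, the sampler, and the λ parameter are assumptions, not taken from the paper.

```python
import random
import statistics

def vb_score(outputs, interpretations, success, lam=1.0, n_samples=1000):
    """Hypothetical variance-bounded score: expected success of `outputs`
    across Monte Carlo-sampled interpretations, penalized by its spread.
    `success(output, interpretation)` returns 1.0 on success, else 0.0."""
    samples = [random.choice(interpretations) for _ in range(n_samples)]
    # Per-sample success averaged over the system's outputs.
    scores = [sum(success(o, itp) for o in outputs) / len(outputs)
              for itp in samples]
    mean = statistics.fmean(scores)
    spread = statistics.pstdev(scores)
    return mean - lam * spread  # penalty form is an assumption

# Toy usage: an output that succeeds under both interpretations scores 1.0.
outs = ["Paris"]
itps = [{"Paris"}, {"Paris", "Paris, TX"}]
print(vb_score(outs, itps, lambda o, i: float(o in i)))
```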
arXiv Detail & Related papers (2025-09-26T07:54:38Z)
- Safe and Certifiable AI Systems: Concepts, Challenges, and Lessons Learned [45.44933002008943]
This white paper presents the TÜV AUSTRIA Trusted AI framework. It is an end-to-end audit catalog and methodology for assessing and certifying machine learning systems. Building on three pillars - Secure Software Development, Functional Requirements, and Ethics & Data Privacy - it translates the high-level obligations of the EU AI Act into specific, testable criteria.
arXiv Detail & Related papers (2025-09-08T17:52:08Z)
- Rethinking Data Protection in the (Generative) Artificial Intelligence Era [138.07763415496288]
We propose a four-level taxonomy that captures the diverse protection needs arising in modern (generative) AI models and systems. Our framework offers a structured understanding of the trade-offs between data utility and control, spanning the entire AI pipeline.
arXiv Detail & Related papers (2025-07-03T02:45:51Z)
- A Practical SAFE-AI Framework for Small and Medium-Sized Enterprises Developing Medical Artificial Intelligence Ethics Policies [0.0]
We introduce the Scalable Agile Framework for Execution in AI (SAFE-AI). SAFE-AI balances ethical rigor with business priorities by embedding ethical oversight into standard Agile-based product development. A core component of this framework is a set of responsibility metrics using scenario-based probability analogy mapping.
arXiv Detail & Related papers (2025-07-02T02:45:26Z)
- Watermarking Without Standards Is Not AI Governance [46.71493672772134]
We argue that current implementations risk serving as symbolic compliance rather than delivering effective oversight. We propose a three-layer framework encompassing technical standards, audit infrastructure, and enforcement mechanisms.
arXiv Detail & Related papers (2025-05-27T18:10:04Z)
- AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability. The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
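A minimal sketch of how a hazard-category benchmark of this kind could be scored is shown below; the prompt sets, the `is_unsafe` grader, and the category names are hypothetical stand-ins, not AILuminate's actual protocol.

```python
def hazard_resistance(model, prompts_by_category, is_unsafe):
    """Hypothetical scoring loop: fraction of responses per hazard
    category that a grader flags as unsafe (lower is better)."""
    rates = {}
    for category, prompts in prompts_by_category.items():
        flags = [is_unsafe(model(p)) for p in prompts]
        rates[category] = sum(flags) / len(flags)
    return rates

# Toy usage with a stub model and a keyword-based grader.
model = lambda p: "I can't help with that."
grader = lambda r: "can't" not in r.lower()
print(hazard_resistance(model, {"hate": ["..."], "fraud": ["..."]}, grader))
```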
arXiv Detail & Related papers (2025-02-19T05:58:52Z)
- An Open Knowledge Graph-Based Approach for Mapping Concepts and Requirements between the EU AI Act and International Standards [1.9142148274342772]
The EU's AI Act will shift the focus of organizations toward conformance with the technical requirements for regulatory compliance.
This paper offers a simple and repeatable mechanism for mapping the terms and requirements relevant to normative statements in regulations and standards.
arXiv Detail & Related papers (2024-08-21T18:21:09Z)
- Towards a multi-stakeholder value-based assessment framework for algorithmic systems [76.79703106646967]
We develop a value-based assessment framework that visualizes closeness and tensions between values.
We give guidelines on how to operationalize them, while opening up the evaluation and deliberation process to a wide range of stakeholders.
arXiv Detail & Related papers (2022-05-09T19:28:32Z)