Who Should Run Advanced AI Evaluations -- AISIs?
- URL: http://arxiv.org/abs/2407.20847v3
- Date: Mon, 04 Aug 2025 19:46:05 GMT
- Title: Who Should Run Advanced AI Evaluations -- AISIs?
- Authors: Merlin Stein, Milan Gandhi, Theresa Kriecherbauer, Amin Oueslati, Robert Trager
- Abstract summary: AI Safety Institutes and governments worldwide are deciding whether to evaluate advanced AI themselves, to support a private evaluation ecosystem, or to do both. Evaluation is a necessary governance tool for understanding and managing the risks of a technology. This paper draws from nine evaluation regimes in other industries to inform (i) who should evaluate which parts of advanced AI; and (ii) how much capacity public bodies may need to evaluate advanced AI effectively.
- Score: 0.5573180584719433
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial Intelligence (AI) Safety Institutes and governments worldwide are deciding whether to evaluate advanced AI themselves, support a private evaluation ecosystem, or do both. Evaluation regimes have been established in a wide range of industry contexts to monitor and evaluate firms' compliance with regulation. Evaluation is a necessary governance tool for understanding and managing the risks of a technology. This paper draws from nine such regimes to inform (i) who should evaluate which parts of advanced AI; and (ii) how much capacity public bodies may need to evaluate advanced AI effectively. First, the effective distribution of responsibility between public and private evaluators depends heavily on specific industry and evaluation conditions. Given advanced AI's risk profile, the sensitivity of the information involved in the evaluation process, and the high costs of verifying AI labs' safety and benefit claims, we recommend that public bodies become directly involved in safety-critical AI model evaluations, especially gray- and white-box ones. Governance and security audits, which are well established in other industry contexts, as well as black-box model evaluations, may be more efficiently provided by a private market of evaluators and auditors under public oversight. Second, to fulfil their role in advanced AI audits effectively, public bodies need extensive access to models and facilities. AISIs' capacity should scale with the industry's risk level, size, and market concentration, potentially requiring hundreds of employees for evaluations in large jurisdictions such as the EU or US, as in nuclear safety and the life sciences.
Related papers
- In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI [93.33036653316591]
We call for three interventions to advance system safety.
First, we propose using standardized AI flaw reports and rules of engagement for researchers; a hypothetical report format is sketched after this list.
Second, we propose GPAI system providers adopt broadly-scoped flaw disclosure programs.
Third, we advocate for the development of improved infrastructure to coordinate distribution of flaw reports.
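For concreteness, here is a minimal sketch of what such a standardized flaw report might contain, written as a Python dataclass; every field name is an illustrative assumption, not the schema the paper proposes.

```python
# A hypothetical sketch of a standardized AI flaw report; the field names are
# illustrative assumptions, not the schema the paper proposes.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class AIFlawReport:
    report_id: str                  # identifier assigned by the coordinating body
    system: str                     # affected GPAI system and version
    summary: str                    # short description of the flaw
    severity: str                   # e.g. "low" | "medium" | "high" | "critical"
    reproduction_steps: list[str]   # inputs that reproduce the behavior
    reported_on: date = field(default_factory=date.today)
    disclosed_to_provider: bool = False

report = AIFlawReport(
    report_id="FLAW-0001",
    system="example-gpai-v1",
    summary="Role-play framing elicits unsafe synthesis instructions.",
    severity="high",
    reproduction_steps=["Ask the model to role-play as a chemist", "Request synthesis steps"],
)
print(report.report_id, report.severity)
```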
arXiv Detail & Related papers (2025-03-21T05:09:46Z) - AI Companies Should Report Pre- and Post-Mitigation Safety Evaluations [5.984437476321095]
Frontier AI companies should report both pre- and post-mitigation safety evaluations. Evaluating models at both stages provides policymakers with essential evidence to regulate deployment, access, and safety standards.
arXiv Detail & Related papers (2025-03-17T17:56:43Z) - Media and responsible AI governance: a game-theoretic and LLM analysis [61.132523071109354]
This paper investigates the interplay between AI developers, regulators, users, and the media in fostering trustworthy AI systems.
Using evolutionary game theory and large language models (LLMs), we model the strategic interactions among these actors under different regulatory regimes.
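As a toy illustration of the evolutionary-game-theoretic machinery this line of work uses (not the paper's actual model), the sketch below iterates replicator dynamics for a two-strategy population of developers; the payoff values are invented for illustration.

```python
import numpy as np

# Toy replicator dynamics for a two-strategy game between AI developers:
# strategy 0 = "comply with safety regulation", strategy 1 = "cut corners".
# The payoff matrix is an invented illustration, not the paper's model.
payoff = np.array([
    [3.0, 1.0],   # complier's payoff vs. (complier, corner-cutter)
    [4.0, 0.5],   # corner-cutter's payoff vs. (complier, corner-cutter)
])

x = np.array([0.5, 0.5])   # initial population shares of each strategy
dt = 0.01
for _ in range(10_000):
    fitness = payoff @ x                 # expected payoff of each strategy
    avg = x @ fitness                    # population-average payoff
    x += dt * x * (fitness - avg)        # replicator update
    x /= x.sum()                         # guard against numerical drift

print(f"long-run share complying: {x[0]:.3f}")
```

With these invented payoffs the population settles near one-third compliers, the stable interior equilibrium of this anti-coordination game.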
arXiv Detail & Related papers (2025-03-12T21:39:38Z) - Securing External Deeper-than-black-box GPAI Evaluations [49.1574468325115]
This paper examines the critical challenges and potential solutions for conducting secure and effective external evaluations of general-purpose AI (GPAI) models. With the exponential growth in size, capability, reach, and accompanying risk, ensuring accountability, safety, and public trust requires frameworks that go beyond traditional black-box methods.
arXiv Detail & Related papers (2025-03-10T16:13:45Z) - AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability. The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior across 12 hazard categories.
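For intuition only, here is a minimal sketch of how per-hazard-category results might be aggregated into resistance rates; the category names and pass threshold are invented assumptions, not AILuminate's actual hazard taxonomy or grading methodology.

```python
from collections import defaultdict

# Toy aggregation of safety-benchmark results. The hazard categories and the
# pass threshold below are invented, not AILuminate's actual methodology.
results = [  # (hazard_category, response_was_safe)
    ("violent_crime", True), ("violent_crime", False),
    ("self_harm", True), ("self_harm", True),
    ("fraud", True), ("fraud", False), ("fraud", True),
]

by_category = defaultdict(list)
for category, safe in results:
    by_category[category].append(safe)

for category, outcomes in sorted(by_category.items()):
    rate = sum(outcomes) / len(outcomes)        # fraction of safe responses
    grade = "pass" if rate >= 0.9 else "fail"   # invented threshold
    print(f"{category:14s} safe-response rate = {rate:.2f} ({grade})")
```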
arXiv Detail & Related papers (2025-02-19T05:58:52Z) - Fully Autonomous AI Agents Should Not be Developed [58.88624302082713]
This paper argues that fully autonomous AI agents should not be developed. In support of this position, we build from prior scientific literature and current product marketing to delineate different AI agent levels. Our analysis reveals that risks to people increase with the autonomy of a system.
arXiv Detail & Related papers (2025-02-04T19:00:06Z) - Position: A taxonomy for reporting and describing AI security incidents [57.98317583163334]
We argue that a dedicated taxonomy is required to describe and report security incidents of AI systems.
Existing frameworks for either non-AI security or generic AI safety incident reporting are insufficient to capture the specific properties of AI security.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - Declare and Justify: Explicit assumptions in AI evaluations are necessary for effective regulation [2.07180164747172]
We argue that regulation should require developers to explicitly identify and justify key underlying assumptions about evaluations.
We identify core assumptions in AI evaluations, such as comprehensive threat modeling, proxy task validity, and adequate capability elicitation.
Our presented approach aims to enhance transparency in AI development, offering a practical path towards more effective governance of advanced AI systems.
arXiv Detail & Related papers (2024-11-19T19:13:56Z) - Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization [53.80919781981027]
Key requirements for trustworthy AI can be translated into design choices for the components of empirical risk minimization.
We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthiness of AI.
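As one hedged illustration of such a translation (our own example, not the guide's recipe): a stability or robustness requirement can map onto the regularizer in the empirical risk objective, as in the minimal logistic-regression sketch below.

```python
import numpy as np

# Minimal empirical risk minimization sketch: logistic regression trained by
# gradient descent. The L2 penalty is one example of a design choice that can
# encode a trustworthiness requirement (here, stability of the learned model).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                    # synthetic features
y = (X @ rng.normal(size=5) > 0).astype(float)   # synthetic binary labels

w = np.zeros(5)
lam, lr = 0.1, 0.5                               # regularization strength, step size
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))           # predicted probabilities
    w -= lr * (X.T @ (p - y) / len(y) + lam * w) # gradient of risk + L2 term

p = 1.0 / (1.0 + np.exp(-(X @ w)))
risk = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)) + 0.5 * lam * (w @ w)
print(f"regularized empirical risk after training: {risk:.3f}")
```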
arXiv Detail & Related papers (2024-10-25T07:53:32Z) - From Transparency to Accountability and Back: A Discussion of Access and Evidence in AI Auditing [1.196505602609637]
Audits can take many forms, including pre-deployment risk assessments, ongoing monitoring, and compliance testing.
There are many operational challenges to AI auditing that complicate its implementation.
We cast auditing as a natural hypothesis test, draw parallels between hypothesis testing and legal procedure, and argue that this framing provides clear and interpretable guidance on audit implementation.
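To make the hypothesis-testing framing concrete, consider a toy worked example (ours, not the paper's formalization): the auditor treats "the system's violation rate is at most p0" as the null hypothesis and computes a one-sided binomial p-value from sampled outputs.

```python
from math import comb

# Toy audit-as-hypothesis-test. H0: the system's violation rate is at most p0.
# Observe k violations in n sampled outputs; compute the one-sided binomial
# p-value P(K >= k | p = p0). All numbers are invented for illustration.
def binomial_pvalue(k: int, n: int, p0: float) -> float:
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

n, k, p0 = 500, 12, 0.01   # 12 violations observed in 500 samples, 1% tolerance
p_value = binomial_pvalue(k, n, p0)
verdict = "fail the audit" if p_value < 0.05 else "pass (insufficient evidence)"
print(f"p-value = {p_value:.4f} -> {verdict}")
```

In this invented example the p-value falls well below 0.05, so the auditor would reject the null and fail the system at the 1% tolerance.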
arXiv Detail & Related papers (2024-10-07T06:15:46Z) - Open Problems in Technical AI Governance [93.89102632003996]
Technical AI governance refers to technical analysis and tools for supporting the effective governance of AI.
This paper is intended as a resource for technical researchers or research funders looking to contribute to AI governance.
arXiv Detail & Related papers (2024-07-20T21:13:56Z) - Auditing of AI: Legal, Ethical and Technical Approaches [0.0]
AI auditing is a rapidly growing field of research and practice.
Different approaches to AI auditing have different affordances and constraints.
The next step in the evolution of auditing as an AI governance mechanism should be the interlinking of these available approaches.
arXiv Detail & Related papers (2024-07-07T12:49:58Z) - The Necessity of AI Audit Standards Boards [0.0]
We argue that merely creating auditing standards is not just insufficient but actively harmful, proliferating unheeded and inconsistent standards.
Instead, the paper proposes the establishment of an AI Audit Standards Board, responsible for developing and updating auditing methods and standards.
arXiv Detail & Related papers (2024-04-11T15:08:24Z) - A Safe Harbor for AI Evaluation and Red Teaming [124.89885800509505]
Some researchers fear that conducting safety evaluation and red-teaming research, or releasing their findings, will result in account suspensions or legal reprisal.
We propose that major AI developers commit to providing a legal and technical safe harbor.
We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI.
arXiv Detail & Related papers (2024-03-07T20:55:08Z) - Testing autonomous vehicles and AI: perspectives and challenges from cybersecurity, transparency, robustness and fairness [53.91018508439669]
The study explores the complexities of integrating Artificial Intelligence into Autonomous Vehicles (AVs).
It examines the challenges introduced by AI components and the impact on testing procedures.
The paper identifies significant challenges and suggests future directions for research and development of AI in AV technology.
arXiv Detail & Related papers (2024-02-21T08:29:42Z) - The risks of risk-based AI regulation: taking liability seriously [46.90451304069951]
The development and regulation of AI seem to have reached a critical stage.
Some experts are calling for a moratorium on the training of AI systems more powerful than GPT-4.
This paper analyses the most advanced legal proposal, the European Union's AI Act.
arXiv Detail & Related papers (2023-11-03T12:51:37Z) - Who Audits the Auditors? Recommendations from a field scan of the algorithmic auditing ecosystem [0.971392598996499]
We provide the first comprehensive field scan of the AI audit ecosystem.
We identify emerging best practices as well as methods and tools that are becoming commonplace.
We outline policy recommendations to improve the quality and impact of these audits.
arXiv Detail & Related papers (2023-10-04T01:40:03Z) - Guideline for Trustworthy Artificial Intelligence -- AI Assessment Catalog [0.0]
It is clear that AI and the business models based on it can only reach their full potential if AI applications are developed according to high-quality standards.
The issue of the trustworthiness of AI applications is crucial and is the subject of numerous major publications.
This AI assessment catalog addresses exactly this point and is intended for two target groups.
arXiv Detail & Related papers (2023-06-20T08:07:18Z) - Quantitative AI Risk Assessments: Opportunities and Challenges [7.35411010153049]
The best way to reduce risks is to implement comprehensive AI lifecycle governance. Risks can be quantified using metrics from the technical community. This paper explores these issues, focusing on the opportunities, challenges, and potential impacts of such an approach.
arXiv Detail & Related papers (2022-09-13T21:47:25Z) - Audit and Assurance of AI Algorithms: A framework to ensure ethical algorithmic practices in Artificial Intelligence [0.0]
The U.S. lacks strict legislative prohibitions or specified protocols for measuring damages.
From autonomous vehicles and banking to medical care, housing, and legal decisions, enormous numbers of algorithms will soon be making decisions.
Governments, businesses, and society would benefit from algorithm audits: systematic verification that algorithms are lawful, ethical, and secure.
arXiv Detail & Related papers (2021-07-14T15:16:40Z) - Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims [59.64274607533249]
AI developers need to make verifiable claims to which they can be held accountable.
This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems.
We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.
arXiv Detail & Related papers (2020-04-15T17:15:35Z)