Verifying International Agreements on AI: Six Layers of Verification for Rules on Large-Scale AI Development and Deployment
- URL: http://arxiv.org/abs/2507.15916v2
- Date: Fri, 25 Jul 2025 17:45:17 GMT
- Title: Verifying International Agreements on AI: Six Layers of Verification for Rules on Large-Scale AI Development and Deployment
- Authors: Mauricio Baker, Gabriel Kulp, Oliver Marks, Miles Brundage, Lennart Heim,
- Abstract summary: This report provides an in-depth overview of AI verification, intended for both policy professionals and technical researchers.<n>We present novel conceptual frameworks, detailed implementation options, and key R&D challenges.<n>We find that states could eventually verify compliance by using six largely independent verification approaches.
- Score: 0.7364983833280243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The risks of frontier AI may require international cooperation, which in turn may require verification: checking that all parties follow agreed-on rules. For instance, states might need to verify that powerful AI models are widely deployed only after their risks to international security have been evaluated and deemed manageable. However, research on AI verification could benefit from greater clarity and detail. To address this, this report provides an in-depth overview of AI verification, intended for both policy professionals and technical researchers. We present novel conceptual frameworks, detailed implementation options, and key R&D challenges. These draw on existing literature, expert interviews, and original analysis, all within the scope of confidentially overseeing AI development and deployment that uses thousands of high-end AI chips. We find that states could eventually verify compliance by using six largely independent verification approaches with substantial redundancy: (1) built-in security features in AI chips; (2-3) separate monitoring devices attached to AI chips; and (4-6) personnel-based mechanisms, such as whistleblower programs. While promising, these approaches require guardrails to protect against abuse and power concentration, and many of these technologies have yet to be built or stress-tested. To enable states to confidently verify compliance with rules on large-scale AI development and deployment, the R&D challenges we list need significant progress.
Related papers
- Frontier AI Auditing: Toward Rigorous Third-Party Assessment of Safety and Security Practices at Leading AI Companies [57.521647436515785]
We define frontier AI auditing as rigorous third-party verification of frontier AI developers' safety and security claims.<n>We introduce AI Assurance Levels (AAL-1 to AAL-4), ranging from time-bounded system audits to continuous, deception-resilient verification.
arXiv Detail & Related papers (2026-01-16T18:44:09Z) - International AI Safety Report 2025: Second Key Update: Technical Safeguards and Risk Management [115.92752850425272]
Second update to the 2025 International AI Safety Report assesses new developments in general-purpose AI risk management over the past year.<n> examines how researchers, public institutions, and AI developers are approaching risk management for general-purpose AI.
arXiv Detail & Related papers (2025-11-25T03:12:56Z) - International AI Safety Report 2025: First Key Update: Capabilities and Risk Implications [118.49965571969089]
This update examines how AI capabilities have improved since the first AI Safety Report.<n>It focuses on key risk areas where substantial new evidence warrants updated assessments.
arXiv Detail & Related papers (2025-10-15T15:13:49Z) - Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance [211.5823259429128]
We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security, Derivative Security, and Social Ethics.<n>We identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight.<n>Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy.
arXiv Detail & Related papers (2025-08-12T09:42:56Z) - The Singapore Consensus on Global AI Safety Research Priorities [128.58674892183657]
"2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety" aimed to support research in this space.<n>Report builds on the International AI Safety Report chaired by Yoshua Bengio and backed by 33 governments.<n>Report organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment) and challenges with monitoring and intervening after deployment (Control)
arXiv Detail & Related papers (2025-06-25T17:59:50Z) - Mechanisms to Verify International Agreements About AI Development [0.0]
Report aims to demonstrate how countries could practically verify claims about each other's AI development and deployment.<n>The focus is on international agreements and state-involved AI development, but these approaches could also be applied to domestic regulation of companies.
arXiv Detail & Related papers (2025-06-18T20:28:54Z) - Compliance of AI Systems [0.0]
This paper systematically examines the compliance of AI systems with relevant legislation, focusing on the EU's AI Act.<n>The analysis highlighted many challenges associated with edge devices, which are increasingly being used to deploy AI applications closer and closer to the data sources.<n>The importance of data set compliance is highlighted as a cornerstone for ensuring the trustworthiness, transparency, and explainability of AI systems.
arXiv Detail & Related papers (2025-03-07T16:53:36Z) - Position: Mind the Gap-the Growing Disconnect Between Established Vulnerability Disclosure and AI Security [56.219994752894294]
We argue that adapting existing processes for AI security reporting is doomed to fail due to fundamental shortcomings for the distinctive characteristics of AI systems.<n>Based on our proposal to address these shortcomings, we discuss an approach to AI security reporting and how the new AI paradigm, AI agents, will further reinforce the need for specialized AI security incident reporting advancements.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks [55.2480439325792]
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act)
Uses insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence.
As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z) - How Could Generative AI Support Compliance with the EU AI Act? A Review for Safe Automated Driving Perception [4.075971633195745]
Deep Neural Networks (DNNs) have become central for the perception functions of autonomous vehicles.
The European Union (EU) Artificial Intelligence (AI) Act aims to address these challenges by establishing stringent norms and standards for AI systems.
This review paper summarizes the requirements arising from the EU AI Act regarding DNN-based perception systems and systematically categorizes existing generative AI applications in AD.
arXiv Detail & Related papers (2024-08-30T12:01:06Z) - Testing autonomous vehicles and AI: perspectives and challenges from cybersecurity, transparency, robustness and fairness [53.91018508439669]
The study explores the complexities of integrating Artificial Intelligence into Autonomous Vehicles (AVs)
It examines the challenges introduced by AI components and the impact on testing procedures.
The paper identifies significant challenges and suggests future directions for research and development of AI in AV technology.
arXiv Detail & Related papers (2024-02-21T08:29:42Z) - Nuclear Arms Control Verification and Lessons for AI Treaties [0.0]
Security risks from AI have motivated international agreements that the technology can be used.
The study suggests that the foreseeable case would be reduced to levels that were successfully managed in nuclear arms control.
arXiv Detail & Related papers (2023-04-08T23:05:24Z) - Proceedings of the Artificial Intelligence for Cyber Security (AICS)
Workshop at AAAI 2022 [55.573187938617636]
The workshop will focus on the application of AI to problems in cyber security.
Cyber systems generate large volumes of data, utilizing this effectively is beyond human capabilities.
arXiv Detail & Related papers (2022-02-28T18:27:41Z) - Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable
Claims [59.64274607533249]
AI developers need to make verifiable claims to which they can be held accountable.
This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems.
We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.
arXiv Detail & Related papers (2020-04-15T17:15:35Z) - Vulnerabilities of Connectionist AI Applications: Evaluation and Defence [0.0]
This article deals with the IT security of connectionist artificial intelligence (AI) applications, focusing on threats to integrity.
A comprehensive list of threats and possible mitigations is presented by reviewing the state-of-the-art literature.
The discussion of mitigations is likewise not restricted to the level of the AI system itself but rather advocates viewing AI systems in the context of their supply chains.
arXiv Detail & Related papers (2020-03-18T12:33:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.