Verification methods for international AI agreements
- URL: http://arxiv.org/abs/2408.16074v2
- Date: Mon, 4 Nov 2024 20:46:07 GMT
- Title: Verification methods for international AI agreements
- Authors: Akash R. Wasil, Tom Reed, Jack William Miller, Peter Barnett,
- Abstract summary: We examine 10 verification methods that could detect two types of potential violations.
For each verification method, we provide a description, historical precedents, and possible evasion techniques.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: What techniques can be used to verify compliance with international agreements about advanced AI development? In this paper, we examine 10 verification methods that could detect two types of potential violations: unauthorized AI training (e.g., training runs above a certain FLOP threshold) and unauthorized data centers. We divide the verification methods into three categories: (a) national technical means (methods requiring minimal or no access from suspected non-compliant nations), (b) access-dependent methods (methods that require approval from the nation suspected of unauthorized activities), and (c) hardware-dependent methods (methods that require rules around advanced hardware). For each verification method, we provide a description, historical precedents, and possible evasion techniques. We conclude by offering recommendations for future work related to the verification and enforcement of international AI governance agreements.
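The "FLOP threshold" trigger mentioned in the abstract can be made concrete with a short sketch. This is an illustrative calculation only, not a method from the paper: it uses the common 6 × parameters × tokens heuristic for estimating dense-transformer training compute, and the threshold value is hypothetical.

```python
# Illustrative sketch: flagging a training run that exceeds a compute threshold.
# The 6*N*D FLOP estimate is a standard heuristic for dense transformer training;
# the treaty threshold below is a hypothetical placeholder.

FLOP_THRESHOLD = 1e26  # hypothetical threshold, in total training FLOPs


def estimated_training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute via the common 6 * parameters * tokens rule."""
    return 6.0 * n_params * n_tokens


def exceeds_threshold(n_params: float, n_tokens: float,
                      threshold: float = FLOP_THRESHOLD) -> bool:
    """Return True if the estimated training compute crosses the threshold."""
    return estimated_training_flops(n_params, n_tokens) > threshold


# Example: a 70B-parameter model trained on 15T tokens
flops = estimated_training_flops(70e9, 15e12)  # ~6.3e24 FLOPs
print(exceeds_threshold(70e9, 15e12))          # False: below 1e26
```

In practice a verifier would not know N and D directly; the verification methods surveyed in the paper (e.g., hardware monitoring, data-center detection) are ways of bounding such quantities without the developer's cooperation.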
Related papers
- Verifying International Agreements on AI: Six Layers of Verification for Rules on Large-Scale AI Development and Deployment [0.7364983833280243]
This report provides an in-depth overview of AI verification, intended for both policy professionals and technical researchers. We present novel conceptual frameworks, detailed implementation options, and key R&D challenges. We find that states could eventually verify compliance by using six largely independent verification approaches.
arXiv Detail & Related papers (2025-07-21T17:45:15Z) - Mechanisms to Verify International Agreements About AI Development [0.0]
The report aims to demonstrate how countries could practically verify claims about each other's AI development and deployment. The focus is on international agreements and state-involved AI development, but these approaches could also be applied to domestic regulation of companies.
arXiv Detail & Related papers (2025-06-18T20:28:54Z) - Explainable AI Systems Must Be Contestable: Here's How to Make It Happen [2.5875936082584623]
This paper presents the first rigorous formal definition of contestability in explainable AI. We introduce a modular framework of by-design and post-hoc mechanisms spanning human-centered interfaces, technical processes, and organizational architectures. Our work equips practitioners with the tools to embed genuine recourse and accountability into AI systems.
arXiv Detail & Related papers (2025-06-02T13:32:05Z) - Mapping the Regulatory Learning Space for the EU AI Act [0.8987776881291145]
The EU's AI Act represents the world's first transnational AI regulation with concrete enforcement measures.
It builds upon existing EU mechanisms for product health and safety regulation, but extends them to protect fundamental rights.
These extensions introduce uncertainties about how the technical state of the art will be applied to AI system certification and enforcement actions.
We argue that these uncertainties, coupled with the fast-changing nature of AI and the relative immaturity of the state of the art in fundamental rights risk management, require the implementation of the AI Act to place a strong emphasis on comprehensive and rapid regulatory learning.
arXiv Detail & Related papers (2025-02-27T12:46:30Z) - Practical Application and Limitations of AI Certification Catalogues in the Light of the AI Act [0.1433758865948252]
This work focuses on the practical application and limitations of existing certification catalogues in the light of the AI Act.
We use the AI Assessment Catalogue as a comprehensive tool to systematically assess an AI model's compliance with certification standards.
We examine the limitations of certifying an AI system that no longer has an active development team, and highlight the importance of complete system documentation.
arXiv Detail & Related papers (2025-01-20T15:54:57Z) - Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols? [50.62012690460685]
This paper investigates how well AI systems can generate and act on their own strategies for subverting control protocols.
An AI system may need to reliably generate optimal plans in each context, take actions with well-calibrated probabilities, and coordinate plans with other instances of itself without communicating.
arXiv Detail & Related papers (2024-12-17T02:33:45Z) - Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks [55.2480439325792]
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act) using insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence.
As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z) - An FDA for AI? Pitfalls and Plausibility of Approval Regulation for Frontier Artificial Intelligence [0.0]
We explore the applicability of approval regulation -- that is, regulation of a product that combines experimental minima with government licensure conditioned partially or fully upon that experimentation -- to the regulation of frontier AI.
There are a number of reasons to believe that approval regulation, simplistically applied, would be inapposite for frontier AI risks.
We conclude by highlighting the role of policy learning and experimentation in regulatory development.
arXiv Detail & Related papers (2024-08-01T17:54:57Z) - An evidence-based methodology for human rights impact assessment (HRIA) in the development of AI data-intensive systems [49.1574468325115]
We show that human rights already underpin decisions in the field of data use.
This work presents a methodology and a model for a Human Rights Impact Assessment (HRIA).
The proposed methodology is tested in concrete case-studies to prove its feasibility and effectiveness.
arXiv Detail & Related papers (2024-07-30T16:27:52Z) - Open Problems in Technical AI Governance [93.89102632003996]
Technical AI governance refers to technical analysis and tools for supporting the effective governance of AI.
This paper is intended as a resource for technical researchers or research funders looking to contribute to AI governance.
arXiv Detail & Related papers (2024-07-20T21:13:56Z) - Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z) - Position Paper: Technical Research and Talent is Needed for Effective AI Governance [0.0]
We survey policy documents published by public-sector institutions in the EU, US, and China.
We highlight specific areas of disconnect between the technical requirements necessary for enacting proposed policy actions, and the current technical state of the art.
Our analysis motivates a call for tighter integration of the AI/ML research community within AI governance.
arXiv Detail & Related papers (2024-06-11T06:32:28Z) - How VADER is your AI? Towards a definition of artificial intelligence systems appropriate for regulation [41.94295877935867]
Recent AI regulation proposals adopt AI definitions that also sweep in ICT techniques, approaches, and systems that are not AI.
We propose a framework to score how validated as appropriately-defined for regulation (VADER) an AI definition is.
arXiv Detail & Related papers (2024-02-07T17:41:15Z) - Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation [79.22678026708134]
In this paper, we propose an inherently interpretable method named Transferable Prototype Learning (TCPL).
To achieve this goal, we design a hierarchically prototypical module that transfers categorical basic concepts from the source domain to the target domain and learns domain-shared prototypes for explaining the underlying reasoning process.
Comprehensive experiments show that the proposed method not only provides effective and intuitive explanations but also outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2023-10-12T06:36:41Z) - Nuclear Arms Control Verification and Lessons for AI Treaties [0.0]
Security risks from AI have motivated calls for international agreements governing how the technology can be used.
The study suggests that, in foreseeable cases, verification challenges could be reduced to levels that were successfully managed in nuclear arms control.
arXiv Detail & Related papers (2023-04-08T23:05:24Z) - Correct-by-Construction Runtime Enforcement in AI -- A Survey [3.509295509987626]
Enforcement refers to the theories, techniques, and tools for enforcing correct behavior with respect to a formal specification of systems at runtime.
We discuss how safety is traditionally handled in the field of AI and how more formal guarantees on the safety of a self-learning agent can be given by integrating a runtime enforcer.
arXiv Detail & Related papers (2022-08-30T17:45:38Z) - A Methodology for Creating AI FactSheets [67.65802440158753]
This paper describes a methodology for creating the form of AI documentation we call FactSheets.
Within each step of the methodology, we describe the issues to consider and the questions to explore.
This methodology will accelerate the broader adoption of transparent AI documentation.
arXiv Detail & Related papers (2020-06-24T15:08:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.