Coordinated Disclosure of Dual-Use Capabilities: An Early Warning System for Advanced AI
- URL: http://arxiv.org/abs/2407.01420v3
- Date: Fri, 04 Oct 2024 19:06:02 GMT
- Title: Coordinated Disclosure of Dual-Use Capabilities: An Early Warning System for Advanced AI
- Authors: Joe O'Brien, Shaun Ee, Jam Kraprayoon, Bill Anderson-Samways, Oscar Delaney, Zoe Williams,
- Abstract summary: We propose Coordinated Disclosure of Dual-Use Capabilities (CDDC) as a process to guide early information-sharing between advanced AI developers, U.S. government agencies, and other private sector actors.
This aims to provide the U.S. government, dual-use foundation model developers, and other actors with an overview of AI capabilities that could significantly impact public safety and security, as well as maximal time to respond.
- Score: 0.0
- License:
- Abstract: Advanced AI systems may be developed which exhibit capabilities that present significant risks to public safety or security. They may also exhibit capabilities that may be applied defensively in a wide set of domains, including (but not limited to) developing societal resilience against AI threats. We propose Coordinated Disclosure of Dual-Use Capabilities (CDDC) as a process to guide early information-sharing between advanced AI developers, US government agencies, and other private sector actors about these capabilities. The process centers around an information clearinghouse (the "coordinator") which receives evidence of dual-use capabilities from finders via mandatory and/or voluntary reporting pathways, and passes noteworthy reports to defenders for follow-up (i.e., further analysis and response). This aims to provide the US government, dual-use foundation model developers, and other actors with an overview of AI capabilities that could significantly impact public safety and security, as well as maximal time to respond.
Related papers
- Open Problems in Machine Unlearning for AI Safety [61.43515658834902]
Machine unlearning -- the ability to selectively forget or suppress specific types of knowledge -- has shown promise for privacy and data removal tasks.
In this paper, we identify key limitations that prevent unlearning from serving as a comprehensive solution for AI safety.
arXiv Detail & Related papers (2025-01-09T03:59:10Z) - Considerations Influencing Offense-Defense Dynamics From Artificial Intelligence [0.0]
AI can enhance defensive capabilities but also presents avenues for malicious exploitation and large-scale societal harm.
This paper proposes a taxonomy to map and examine the key factors that influence whether AI systems predominantly pose threats or offer protective benefits to society.
arXiv Detail & Related papers (2024-12-05T10:05:53Z) - Security Threats in Agentic AI System [0.0]
The complexity of AI systems combined with their ability to process and analyze large volumes of data increases the chances of data leaks or breaches.
As AI agents evolve with greater autonomy, their capacity to bypass or exploit security measures becomes a growing concern.
arXiv Detail & Related papers (2024-10-16T06:40:02Z) - AI Emergency Preparedness: Examining the federal government's ability to detect and respond to AI-related national security threats [0.2008854179910039]
Emergency preparedness can improve the government's ability to monitor and predict AI progress.
We focus on three plausible risk scenarios: (1) loss of control (threats from a powerful AI system that becomes capable of escaping human control), (2) cybersecurity threats from malicious actors, and (3) biological weapons proliferation.
arXiv Detail & Related papers (2024-07-03T17:54:01Z) - A Safe Harbor for AI Evaluation and Red Teaming [124.89885800509505]
Some researchers fear that conducting such research or releasing their findings will result in account suspensions or legal reprisal.
We propose that major AI developers commit to providing a legal and technical safe harbor.
We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI.
arXiv Detail & Related papers (2024-03-07T20:55:08Z) - Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z) - AI Potentiality and Awareness: A Position Paper from the Perspective of
Human-AI Teaming in Cybersecurity [18.324118502535775]
We argue that human-AI teaming is worthwhile in cybersecurity.
We emphasize the importance of a balanced approach that incorporates AI's computational power with human expertise.
arXiv Detail & Related papers (2023-09-28T01:20:44Z) - Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z) - Trustworthy AI Inference Systems: An Industry Research View [58.000323504158054]
We provide an industry research view for approaching the design, deployment, and operation of trustworthy AI inference systems.
We highlight opportunities and challenges in AI systems using trusted execution environments.
We outline areas of further development that require the global collective attention of industry, academia, and government researchers.
arXiv Detail & Related papers (2020-08-10T23:05:55Z) - Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable
Claims [59.64274607533249]
AI developers need to make verifiable claims to which they can be held accountable.
This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems.
We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.
arXiv Detail & Related papers (2020-04-15T17:15:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.