Safety Co-Option and Compromised National Security: The Self-Fulfilling Prophecy of Weakened AI Risk Thresholds
- URL: http://arxiv.org/abs/2504.15088v1
- Date: Mon, 21 Apr 2025 13:20:56 GMT
- Title: Safety Co-Option and Compromised National Security: The Self-Fulfilling Prophecy of Weakened AI Risk Thresholds
- Authors: Heidy Khlaaf, Sarah Myers West
- Abstract summary: We show how skewed risk tolerances have allowed AI technologists to engage in "safety revisionism," substituting traditional safety methods and terminology with ill-defined alternatives. We explore how the current trajectory for AI risk determination and evaluation for foundation model use within national security is poised for a race to the bottom.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Risk thresholds provide a measure of the level of risk exposure that a society or individual is willing to withstand, ultimately shaping how we determine the safety of technological systems. Against the backdrop of the Cold War, the first risk analyses, such as those devised for nuclear systems, cemented societally accepted risk thresholds against which safety-critical and defense systems are now evaluated. But today, the appropriate risk tolerances for AI systems have yet to be agreed on by global governing efforts, despite the need for democratic deliberation regarding the acceptable levels of harm to human life. Absent such AI risk thresholds, AI technologists (primarily industry labs, as well as "AI safety" focused organizations) have instead advocated for risk tolerances skewed by a purported AI arms race and speculative "existential" risks, taking over the arbitration of risk determinations with life-or-death consequences, subverting democratic processes. In this paper, we demonstrate how such approaches have allowed AI technologists to engage in "safety revisionism," substituting traditional safety methods and terminology with ill-defined alternatives that vie for the accelerated adoption of military AI uses at the cost of lowered safety and security thresholds. We explore how the current trajectory for AI risk determination and evaluation for foundation model use within national security is poised for a race to the bottom, to the detriment of the US's national security interests. Safety-critical and defense systems must comply with assurance frameworks that are aligned with established risk thresholds, and foundation models are no exception. As such, development of evaluation frameworks for AI-based military systems must preserve the safety and security of US critical and defense infrastructure, and remain in alignment with international humanitarian law.
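The abstract's central object, a quantitative risk threshold, can be made concrete with a minimal sketch. The Python snippet below compares an estimated catastrophic-failure rate against a societally accepted tolerance of the kind used for traditional safety-critical systems; the threshold value, system names, and failure-rate estimates are illustrative assumptions and do not come from the paper.

```python
# Minimal sketch of a quantitative risk-threshold check, in the spirit of the
# probability-of-failure-per-hour tolerances used for safety-critical systems.
# All numbers and system names below are illustrative assumptions, not figures
# taken from the paper.

from dataclasses import dataclass


@dataclass
class RiskAssessment:
    system: str
    estimated_failures_per_hour: float  # estimated rate of catastrophic failure


# Hypothetical, societally agreed tolerance for catastrophic failure, comparable
# in form to the per-operating-hour bands used in aviation and functional-safety
# standards (illustrative only).
ACCEPTED_THRESHOLD_PER_HOUR = 1e-9


def within_threshold(assessment: RiskAssessment,
                     threshold: float = ACCEPTED_THRESHOLD_PER_HOUR) -> bool:
    """Return True if the estimated risk stays below the accepted threshold."""
    return assessment.estimated_failures_per_hour < threshold


if __name__ == "__main__":
    candidates = [
        RiskAssessment("certified flight-control software", 5e-10),
        RiskAssessment("foundation-model targeting aid", 2e-4),  # hypothetical estimate
    ]
    for c in candidates:
        verdict = "acceptable" if within_threshold(c) else "exceeds accepted risk threshold"
        print(f"{c.system}: {c.estimated_failures_per_hour:.1e}/hour -> {verdict}")
```

The paper's argument is that no analogous, democratically agreed threshold yet exists for foundation models, so there is currently nothing against which such a check could be run.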
Related papers
- Frontier AI's Impact on the Cybersecurity Landscape [42.771086928042315]
This paper presents an in-depth analysis of frontier AI's impact on cybersecurity. We first define and categorize the marginal risks of frontier AI in cybersecurity. We then systematically analyze the current and future impacts of frontier AI in cybersecurity.
arXiv Detail & Related papers (2025-04-07T18:25:18Z)
- An Approach to Technical AGI Safety and Security [72.83728459135101]
We develop an approach to address the risk of harms consequential enough to significantly harm humanity.
We focus on technical approaches to misuse and misalignment.
We briefly outline how these ingredients could be combined to produce safety cases for AGI systems.
arXiv Detail & Related papers (2025-04-02T15:59:31Z)
- AI threats to national security can be countered through an incident regime [55.2480439325792]
We propose a legally mandated post-deployment AI incident regime that aims to counter potential national security threats from AI systems. Our proposed AI incident regime is split into three phases. The first phase revolves around a novel operationalization of what counts as an 'AI incident'. The second and third phases spell out that AI providers should notify a government agency about incidents, and that the government agency should be involved in amending AI providers' security and safety procedures.
arXiv Detail & Related papers (2025-03-25T17:51:50Z)
- AI Safety for Everyone [3.440579243843689]
Recent discussions and research in AI safety have increasingly emphasized the deep connection between AI safety and existential risk from advanced AI systems. This framing may exclude researchers and practitioners who are committed to AI safety but approach the field from different angles. We find a vast array of concrete safety work that addresses immediate and practical concerns with current AI systems.
arXiv Detail & Related papers (2025-02-13T13:04:59Z)
- AI Safety is Stuck in Technical Terms -- A System Safety Response to the International AI Safety Report [0.0]
Safety has become the central value around which dominant AI governance efforts are being shaped.
The report focuses on the safety risks of general-purpose AI and available technical mitigation approaches.
The system safety discipline has dealt with the safety risks of software-based systems for many decades.
arXiv Detail & Related papers (2025-02-05T22:37:53Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z)
- AI Risk Management Should Incorporate Both Safety and Security [185.68738503122114]
We argue that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security.
We introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security.
arXiv Detail & Related papers (2024-05-29T21:00:47Z)
- Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We introduce and define a family of approaches to AI safety, which we refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating the core components of GS AI, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z)
- Affirmative safety: An approach to risk management for high-risk AI [6.133009503054252]
We argue that entities developing or deploying high-risk AI systems should be required to present evidence of affirmative safety.
We propose a risk management approach for advanced AI in which model developers must provide evidence that their activities keep certain risks below regulator-set thresholds.
arXiv Detail & Related papers (2024-04-14T20:48:55Z)
- Taking control: Policies to address extinction risks from AI [0.0]
We argue that voluntary commitments from AI companies would be an inappropriate and insufficient response.
We describe three policy proposals that would meaningfully address the threats from advanced AI.
arXiv Detail & Related papers (2023-10-31T15:53:14Z)