Adapting cybersecurity frameworks to manage frontier AI risks: A defense-in-depth approach
- URL: http://arxiv.org/abs/2408.07933v1
- Date: Thu, 15 Aug 2024 05:06:03 GMT
- Title: Adapting cybersecurity frameworks to manage frontier AI risks: A defense-in-depth approach
- Authors: Shaun Ee, Joe O'Brien, Zoe Williams, Amanda El-Dakhakhni, Michael Aird, Alex Lintz,
- Abstract summary: We outline three approaches that can help identify gaps in the management of AI-related risks.
First, a functional approach identifies essential categories of activities that a risk management approach should cover.
Second, a lifecycle approach assigns safety and security activities across the model development lifecycle.
Third, a threat-based approach identifies tactics, techniques, and procedures used by malicious actors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The complex and evolving threat landscape of frontier AI development requires a multi-layered approach to risk management ("defense-in-depth"). By reviewing cybersecurity and AI frameworks, we outline three approaches that can help identify gaps in the management of AI-related risks. First, a functional approach identifies essential categories of activities ("functions") that a risk management approach should cover, as in the NIST Cybersecurity Framework (CSF) and AI Risk Management Framework (AI RMF). Second, a lifecycle approach instead assigns safety and security activities across the model development lifecycle, as in DevSecOps and the OECD AI lifecycle framework. Third, a threat-based approach identifies tactics, techniques, and procedures (TTPs) used by malicious actors, as in the MITRE ATT&CK and MITRE ATLAS databases. We recommend that frontier AI developers and policymakers begin by adopting the functional approach, given the existence of the NIST AI RMF and other supplementary guides, but also establish a detailed frontier AI lifecycle model and threat-based TTP databases for future use.
Related papers
- SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach [58.93030774141753]
Multimodal foundation models (MFMs) represent a significant advancement in artificial intelligence.
This paper conceptualizes cybersafety and cybersecurity in the context of multimodal learning.
We present a comprehensive Systematization of Knowledge (SoK) to unify these concepts in MFMs, identifying key threats.
arXiv Detail & Related papers (2024-11-17T23:06:20Z) - Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems [2.3266896180922187]
We compile an extensive catalog of risk sources and risk management measures for general-purpose AI systems.
This work involves identifying technical, operational, and societal risks across model development, training, and deployment stages.
The catalog is released under a public domain license for ease of direct use by stakeholders in AI governance and standards.
arXiv Detail & Related papers (2024-10-30T21:32:56Z) - A Formal Framework for Assessing and Mitigating Emergent Security Risks in Generative AI Models: Bridging Theory and Dynamic Risk Mitigation [0.3413711585591077]
As generative AI systems, including large language models (LLMs) and diffusion models, advance rapidly, their growing adoption has led to new and complex security risks.
This paper introduces a novel formal framework for categorizing and mitigating these emergent security risks.
We identify previously under-explored risks, including latent space exploitation, multi-modal cross-attack vectors, and feedback-loop-induced model degradation.
arXiv Detail & Related papers (2024-10-15T02:51:32Z) - Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI [52.138044013005]
generative AI, particularly large language models (LLMs), become increasingly integrated into production applications.
New attack surfaces and vulnerabilities emerge and put a focus on adversarial threats in natural language and multi-modal systems.
Red-teaming has gained importance in proactively identifying weaknesses in these systems, while blue-teaming works to protect against such adversarial attacks.
This work aims to bridge the gap between academic insights and practical security measures for the protection of generative AI systems.
arXiv Detail & Related papers (2024-09-23T10:18:10Z) - EAIRiskBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [47.69642609574771]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EAIRiskBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - AI Risk Management Should Incorporate Both Safety and Security [185.68738503122114]
We argue that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security.
We introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security.
arXiv Detail & Related papers (2024-05-29T21:00:47Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z) - Application of the NIST AI Risk Management Framework to Surveillance Technology [1.5442389863546546]
This study offers an in-depth analysis of the application and implications of the National Institute of Standards and Technology's AI Risk Management Framework (NIST AI RMF)
Given the inherently high-risk and consequential nature of facial recognition systems, our research emphasizes the critical need for a structured approach to risk management in this sector.
arXiv Detail & Related papers (2024-03-22T23:07:11Z) - Asset-centric Threat Modeling for AI-based Systems [7.696807063718328]
This paper presents ThreatFinderAI, an approach and tool to model AI-related assets, threats, countermeasures, and quantify residual risks.
To evaluate the practicality of the approach, participants were tasked to recreate a threat model developed by cybersecurity experts of an AI-based healthcare platform.
Overall, the solution's usability was well-perceived and effectively supports threat identification and risk discussion.
arXiv Detail & Related papers (2024-03-11T08:40:01Z) - AI Hazard Management: A framework for the systematic management of root
causes for AI risks [0.0]
This paper introduces the AI Hazard Management (AIHM) framework.
It provides a structured process to systematically identify, assess, and treat AI hazards.
It builds upon an AI hazard list from a comprehensive state-of-the-art analysis.
arXiv Detail & Related papers (2023-10-25T15:55:50Z) - Towards Safer Generative Language Models: A Survey on Safety Risks,
Evaluations, and Improvements [76.80453043969209]
This survey presents a framework for safety research pertaining to large models.
We begin by introducing safety issues of wide concern, then delve into safety evaluation methods for large models.
We explore the strategies for enhancing large model safety from training to deployment.
arXiv Detail & Related papers (2023-02-18T09:32:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.