AI Risk Management Should Incorporate Both Safety and Security
- URL: http://arxiv.org/abs/2405.19524v1
- Date: Wed, 29 May 2024 21:00:47 GMT
- Title: AI Risk Management Should Incorporate Both Safety and Security
- Authors: Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal,
- Abstract summary: We argue that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security.
We introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security.
- Score: 185.68738503122114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this paper, we advocate that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security, and unambiguously take into account the perspectives of both disciplines in order to devise mostly effective and holistic risk mitigation approaches. Unfortunately, this vision is often obfuscated, as the definitions of the basic concepts of "safety" and "security" themselves are often inconsistent and lack consensus across communities. With AI risk management being increasingly cross-disciplinary, this issue is particularly salient. In light of this conceptual challenge, we introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security, aiming to facilitate a shared understanding and effective collaboration across communities.
Related papers
- AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies [88.32153122712478]
We identify 314 unique risk categories organized into a four-tiered taxonomy.
At the highest level, this taxonomy encompasses System & Operational Risks, Content Safety Risks, Societal Risks, and Legal & Rights Risks.
We aim to advance AI safety through information sharing across sectors and the promotion of best practices in risk mitigation for generative AI models and systems.
arXiv Detail & Related papers (2024-06-25T18:13:05Z) - Cross-Modality Safety Alignment [73.8765529028288]
We introduce a novel safety alignment challenge called Safe Inputs but Unsafe Output (SIUO) to evaluate cross-modality safety alignment.
To empirically investigate this problem, we developed the SIUO, a cross-modality benchmark encompassing 9 critical safety domains, such as self-harm, illegal activities, and privacy violations.
Our findings reveal substantial safety vulnerabilities in both closed- and open-source LVLMs, underscoring the inadequacy of current models to reliably interpret and respond to complex, real-world scenarios.
arXiv Detail & Related papers (2024-06-21T16:14:15Z) - AI Safety: A Climb To Armageddon? [0.0]
The paper examines three response strategies: Optimism, Mitigation, and Holism.
The surprising robustness of the argument forces a re-examination of core assumptions around AI safety.
arXiv Detail & Related papers (2024-05-30T08:41:54Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z) - Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z) - Dual Governance: The intersection of centralized regulation and
crowdsourced safety mechanisms for Generative AI [1.2691047660244335]
Generative Artificial Intelligence (AI) has seen mainstream adoption lately, especially in the form of consumer-facing, open-ended, text and image generating models.
The potential for generative AI to displace human creativity and livelihoods has also been under intense scrutiny.
Existing and proposed centralized regulations by governments to rein in AI face criticisms such as not having sufficient clarity or uniformity.
Decentralized protections via crowdsourced safety tools and mechanisms are a potential alternative.
arXiv Detail & Related papers (2023-08-02T23:25:21Z) - A System for Automated Open-Source Threat Intelligence Gathering and
Management [53.65687495231605]
SecurityKG is a system for automated OSCTI gathering and management.
It uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors.
arXiv Detail & Related papers (2021-01-19T18:31:35Z) - Transdisciplinary AI Observatory -- Retrospective Analyses and
Future-Oriented Contradistinctions [22.968817032490996]
This paper motivates the need for an inherently transdisciplinary AI observatory approach.
Building on these AI observatory tools, we present near-term transdisciplinary guidelines for AI safety.
arXiv Detail & Related papers (2020-11-26T16:01:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.