AI Risk Management Should Incorporate Both Safety and Security
- URL: http://arxiv.org/abs/2405.19524v1
- Date: Wed, 29 May 2024 21:00:47 GMT
- Title: AI Risk Management Should Incorporate Both Safety and Security
- Authors: Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal
- Abstract summary: We argue that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security.
We introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security.
- Score: 185.68738503122114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this paper, we advocate that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security, and unambiguously take into account the perspectives of both disciplines in order to devise the most effective and holistic risk mitigation approaches. Unfortunately, this vision is often obfuscated, as the definitions of the basic concepts of "safety" and "security" themselves are often inconsistent and lack consensus across communities. With AI risk management being increasingly cross-disciplinary, this issue is particularly salient. In light of this conceptual challenge, we introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security, aiming to facilitate a shared understanding and effective collaboration across communities.
Related papers
- Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks [22.154001025679896]
Embodied AI systems, including robots and autonomous vehicles, are increasingly integrated into real-world applications.
However, these systems are exposed to vulnerabilities that manifest through sensor spoofing, adversarial attacks, and failures in task and motion planning.
arXiv Detail & Related papers (2025-02-18T03:38:07Z) - AI Safety for Everyone [3.440579243843689]
Recent discussions and research in AI safety have increasingly emphasized the deep connection between AI safety and existential risk from advanced AI systems.
This framing may exclude researchers and practitioners who are committed to AI safety but approach the field from different angles.
We find a vast array of concrete safety work that addresses immediate and practical concerns with current AI systems.
arXiv Detail & Related papers (2025-02-13T13:04:59Z) - Open Problems in Machine Unlearning for AI Safety [61.43515658834902]
Machine unlearning -- the ability to selectively forget or suppress specific types of knowledge -- has shown promise for privacy and data removal tasks.
In this paper, we identify key limitations that prevent unlearning from serving as a comprehensive solution for AI safety.
arXiv Detail & Related papers (2025-01-09T03:59:10Z) - Towards Assuring EU AI Act Compliance and Adversarial Robustness of LLMs [1.368472250332885]
Large language models are prone to misuse and vulnerable to security threats.
The European Union's Artificial Intelligence Act seeks to enforce AI robustness in certain contexts.
arXiv Detail & Related papers (2024-10-04T18:38:49Z) - Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI [52.138044013005]
As generative AI, particularly large language models (LLMs), becomes increasingly integrated into production applications, new attack surfaces and vulnerabilities emerge, placing a focus on adversarial threats in natural language and multi-modal systems.
Red-teaming has gained importance in proactively identifying weaknesses in these systems, while blue-teaming works to protect against such adversarial attacks.
This work aims to bridge the gap between academic insights and practical security measures for the protection of generative AI systems.
arXiv Detail & Related papers (2024-09-23T10:18:10Z) - Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z) - Safe Inputs but Unsafe Output: Benchmarking Cross-modality Safety Alignment of Large Vision-Language Model [73.8765529028288]
We introduce a novel safety alignment challenge called Safe Inputs but Unsafe Output (SIUO) to evaluate cross-modality safety alignment.
To empirically investigate this problem, we developed SIUO, a cross-modality benchmark encompassing 9 critical safety domains, such as self-harm, illegal activities, and privacy violations.
Our findings reveal substantial safety vulnerabilities in both closed- and open-source LVLMs, underscoring the inadequacy of current models to reliably interpret and respond to complex, real-world scenarios.
arXiv Detail & Related papers (2024-06-21T16:14:15Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of the framework's three core components (a world model, a safety specification, and a verifier), describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z) - Transdisciplinary AI Observatory -- Retrospective Analyses and Future-Oriented Contradistinctions [22.968817032490996]
This paper motivates the need for an inherently transdisciplinary AI observatory approach.
Building on these AI observatory tools, we present near-term transdisciplinary guidelines for AI safety.
arXiv Detail & Related papers (2020-11-26T16:01:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.