Security-First AI: Foundations for Robust and Trustworthy Systems
- URL: http://arxiv.org/abs/2504.16110v1
- Date: Thu, 17 Apr 2025 22:53:01 GMT
- Title: Security-First AI: Foundations for Robust and Trustworthy Systems
- Authors: Krti Tallam,
- Abstract summary: This manuscript posits that AI security must be prioritized as a foundational layer.<n>We argue for a security-first approach to enable trustworthy and resilient AI systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The conversation around artificial intelligence (AI) often focuses on safety, transparency, accountability, alignment, and responsibility. However, AI security (i.e., the safeguarding of data, models, and pipelines from adversarial manipulation) underpins all of these efforts. This manuscript posits that AI security must be prioritized as a foundational layer. We present a hierarchical view of AI challenges, distinguishing security from safety, and argue for a security-first approach to enable trustworthy and resilient AI systems. We discuss core threat models, key attack vectors, and emerging defense mechanisms, concluding that a metric-driven approach to AI security is essential for robust AI safety, transparency, and accountability.
Related papers
- A Framework for the Assurance of AI-Enabled Systems [0.0]
This paper proposes a claims-based framework for risk management and assurance of AI systems.
The paper's contributions are a framework process for AI assurance, a set of relevant definitions, and a discussion of important considerations in AI assurance.
arXiv Detail & Related papers (2025-04-03T13:44:01Z) - AI threats to national security can be countered through an incident regime [55.2480439325792]
We propose a legally mandated post-deployment AI incident regime that aims to counter potential national security threats from AI systems.
Our proposed AI incident regime is split into three phases. The first phase revolves around a novel operationalization of what counts as an 'AI incident'
The second and third phases spell out that AI providers should notify a government agency about incidents, and that the government agency should be involved in amending AI providers' security and safety procedures.
arXiv Detail & Related papers (2025-03-25T17:51:50Z) - AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement [73.0700818105842]
We introduce AISafetyLab, a unified framework and toolkit that integrates representative attack, defense, and evaluation methodologies for AI safety.<n> AISafetyLab features an intuitive interface that enables developers to seamlessly apply various techniques.<n>We conduct empirical studies on Vicuna, analyzing different attack and defense strategies to provide valuable insights into their comparative effectiveness.
arXiv Detail & Related papers (2025-02-24T02:11:52Z) - AI Safety for Everyone [3.440579243843689]
Recent discussions and research in AI safety have increasingly emphasized the deep connection between AI safety and existential risk from advanced AI systems.
This framing may exclude researchers and practitioners who are committed to AI safety but approach the field from different angles.
We find a vast array of concrete safety work that addresses immediate and practical concerns with current AI systems.
arXiv Detail & Related papers (2025-02-13T13:04:59Z) - Position: A taxonomy for reporting and describing AI security incidents [57.98317583163334]
We argue that specific are required to describe and report security incidents of AI systems.<n>Existing frameworks for either non-AI security or generic AI safety incident reporting are insufficient to capture the specific properties of AI security.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - The Game-Theoretic Symbiosis of Trust and AI in Networked Systems [13.343937277604892]
This chapter explores the symbiotic relationship between Artificial Intelligence (AI) and trust in networked systems.
We investigate how trust, when dynamically managed through AI, can form a resilient security ecosystem.
arXiv Detail & Related papers (2024-11-19T21:04:53Z) - Trustworthy, Responsible, and Safe AI: A Comprehensive Architectural Framework for AI Safety with Challenges and Mitigations [15.946242944119385]
AI Safety is an emerging area of critical importance to the safe adoption and deployment of AI systems.<n>Our goal is to promote advancement in AI safety research, and ultimately enhance people's trust in digital transformation.
arXiv Detail & Related papers (2024-08-23T09:33:48Z) - Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z) - AI Risk Management Should Incorporate Both Safety and Security [185.68738503122114]
We argue that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security.
We introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security.
arXiv Detail & Related papers (2024-05-29T21:00:47Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.