AI Safety for Everyone
- URL: http://arxiv.org/abs/2502.09288v2
- Date: Fri, 14 Feb 2025 16:11:22 GMT
- Title: AI Safety for Everyone
- Authors: Balint Gyevnar, Atoosa Kasirzadeh
- Abstract summary: Recent discussions and research in AI safety have increasingly emphasized the deep connection between AI safety and existential risk from advanced AI systems.
This framing may exclude researchers and practitioners who are committed to AI safety but approach the field from different angles.
We find a vast array of concrete safety work that addresses immediate and practical concerns with current AI systems.
- Score: 3.440579243843689
- License:
- Abstract: Recent discussions and research in AI safety have increasingly emphasized the deep connection between AI safety and existential risk from advanced AI systems, suggesting that work on AI safety necessarily entails serious consideration of potential existential threats. However, this framing has three potential drawbacks: it may exclude researchers and practitioners who are committed to AI safety but approach the field from different angles; it could lead the public to mistakenly view AI safety as focused solely on existential scenarios rather than addressing a wide spectrum of safety challenges; and it risks creating resistance to safety measures among those who disagree with predictions of existential AI risks. Through a systematic literature review of primarily peer-reviewed research, we find a vast array of concrete safety work that addresses immediate and practical concerns with current AI systems. This includes crucial areas like adversarial robustness and interpretability, highlighting how AI safety research naturally extends existing technological and systems safety concerns and practices. Our findings suggest the need for an epistemically inclusive and pluralistic conception of AI safety that can accommodate the full range of safety considerations, motivations, and perspectives that currently shape the field.
Related papers
- Open Problems in Machine Unlearning for AI Safety [61.43515658834902]
Machine unlearning -- the ability to selectively forget or suppress specific types of knowledge -- has shown promise for privacy and data removal tasks.
In this paper, we identify key limitations that prevent unlearning from serving as a comprehensive solution for AI safety.
arXiv Detail & Related papers (2025-01-09T03:59:10Z)
- Landscape of AI safety concerns -- A methodology to support safety assurance for AI-based autonomous systems [0.0]
AI has emerged as a key technology, driving advancements across a range of applications.
The challenge of assuring safety in systems that incorporate AI components is substantial.
We propose a novel methodology designed to support the creation of safety assurance cases for AI-based systems.
arXiv Detail & Related papers (2024-12-18T16:38:16Z)
- A Trilogy of AI Safety Frameworks: Paths from Facts and Knowledge Gaps to Reliable Predictions and New Knowledge [0.0]
AI Safety has become a vital front-line concern of many scientists within and outside the AI community.
The anticipated risks are both immediate and long term, ranging from existential threats to humanity to deepfakes and bias in machine learning systems.
arXiv Detail & Related papers (2024-10-09T14:43:06Z)
- Trustworthy, Responsible, and Safe AI: A Comprehensive Architectural Framework for AI Safety with Challenges and Mitigations [15.946242944119385]
AI Safety is an emerging area of critical importance to the safe adoption and deployment of AI systems.
Our goal is to promote advancement in AI safety research, and ultimately enhance people's trust in digital transformation.
arXiv Detail & Related papers (2024-08-23T09:33:48Z)
- Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z)
- AI Risk Management Should Incorporate Both Safety and Security [185.68738503122114]
We argue that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security.
We introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security.
arXiv Detail & Related papers (2024-05-29T21:00:47Z)
- Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of the framework's core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z)
- Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z)
- Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements [76.80453043969209]
This survey presents a framework for safety research pertaining to large models.
We begin by introducing safety issues of wide concern, then delve into safety evaluation methods for large models.
We explore the strategies for enhancing large model safety from training to deployment.
arXiv Detail & Related papers (2023-02-18T09:32:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.