AI Safety: Necessary, but insufficient and possibly problematic
- URL: http://arxiv.org/abs/2403.17419v1
- Date: Tue, 26 Mar 2024 06:18:42 GMT
- Title: AI Safety: Necessary, but insufficient and possibly problematic
- Authors: Deepak P,
- Abstract summary: This article critically examines the recent hype around AI safety.
We consider what 'AI safety' actually means, and outline the dominant concepts that the digital footprint of AI safety aligns with.
We share our concerns on how AI safety may normalize AI that advances structural harm through providing exploitative and harmful AI with a veneer of safety.
- Score: 1.6797508081737678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article critically examines the recent hype around AI safety. We first start with noting the nature of the AI safety hype as being dominated by governments and corporations, and contrast it with other avenues within AI research on advancing social good. We consider what 'AI safety' actually means, and outline the dominant concepts that the digital footprint of AI safety aligns with. We posit that AI safety has a nuanced and uneasy relationship with transparency and other allied notions associated with societal good, indicating that it is an insufficient notion if the goal is that of societal good in a broad sense. We note that the AI safety debate has already influenced some regulatory efforts in AI, perhaps in not so desirable directions. We also share our concerns on how AI safety may normalize AI that advances structural harm through providing exploitative and harmful AI with a veneer of safety.
Related papers
- Trustworthy, Responsible, and Safe AI: A Comprehensive Architectural Framework for AI Safety with Challenges and Mitigations [14.150792596344674]
AI Safety is an emerging area of critical importance to the safe adoption and deployment of AI systems.
Our goal is to promote advancement in AI safety research, and ultimately enhance people's trust in digital transformation.
arXiv Detail & Related papers (2024-08-23T09:33:48Z) - Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z) - Views on AI aren't binary -- they're plural [0.10241134756773229]
We argue that a simple binary is not an accurate model of AI discourse.
We provide concrete suggestions for how individuals can help avoid the emergence of us-vs-them conflict in the broad community of people working on AI development and governance.
arXiv Detail & Related papers (2023-12-21T17:50:06Z) - Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z) - The Promise and Peril of Artificial Intelligence -- Violet Teaming
Offers a Balanced Path Forward [56.16884466478886]
This paper reviews emerging issues with opaque and uncontrollable AI systems.
It proposes an integrative framework called violet teaming to develop reliable and responsible AI.
It emerged from AI safety research to manage risks proactively by design.
arXiv Detail & Related papers (2023-08-28T02:10:38Z) - Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z) - Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being.
For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z) - The Threat of Offensive AI to Organizations [52.011307264694665]
This survey explores the threat of offensive AI on organizations.
First, we discuss how AI changes the adversary's methods, strategies, goals, and overall attack model.
Then, through a literature review, we identify 33 offensive AI capabilities which adversaries can use to enhance their attacks.
arXiv Detail & Related papers (2021-06-30T01:03:28Z) - AI Failures: A Review of Underlying Issues [0.0]
We focus on AI failures on account of flaws in conceptualization, design and deployment.
We find that AI systems fail on account of omission and commission errors in the design of the AI system.
An AI system is quite likely to fail in situations where, in effect, it is called upon to deliver moral judgments.
arXiv Detail & Related papers (2020-07-18T15:31:29Z) - Could regulating the creators deliver trustworthy AI? [2.588973722689844]
AI is becoming all pervasive and is often deployed in everyday technologies, devices and services without our knowledge.
Fear is compounded by the inability to point to a trustworthy source of AI.
Some consider trustworthy AI to be that which complies with relevant laws.
Others point to the requirement to comply with ethics and standards.
arXiv Detail & Related papers (2020-06-26T01:32:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.