Absolutist AI
- URL: http://arxiv.org/abs/2307.10315v1
- Date: Wed, 19 Jul 2023 03:40:37 GMT
- Title: Absolutist AI
- Authors: Mitchell Barrington
- Abstract summary: Training AI systems with absolute constraints may make considerable progress on many AI safety problems.
It provides a guardrail for avoiding the very worst outcomes of misalignment.
It could prevent AIs from causing catastrophes for the sake of very valuable consequences.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper argues that training AI systems with absolute constraints -- which
forbid certain acts irrespective of the amount of value they might produce --
may make considerable progress on many AI safety problems in principle. First,
it provides a guardrail for avoiding the very worst outcomes of misalignment.
Second, it could prevent AIs from causing catastrophes for the sake of very
valuable consequences, such as replacing humans with a much larger number of
beings living at a higher welfare level. Third, it makes systems more
corrigible, allowing creators to make corrective interventions in them, such as
altering their objective functions or shutting them down. And fourth, it helps
systems explore their environment more safely by prohibiting them from
exploring especially dangerous acts. I offer a decision-theoretic formalization
of absolute constraints, improving on existing models in the literature, and
use this model to prove some results about the training and behavior of
absolutist AIs. I conclude by showing that, although absolutist AIs will not
maximize expected value, they will not be susceptible to behaving irrationally,
and they will not (contra coherence arguments) face environmental pressure to
become expected-value maximizers.
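The core idea in the abstract, that a constraint forbids certain acts irrespective of the value they might produce, can be pictured with a minimal sketch. This is an illustrative assumption and not the paper's formalization: forbidden acts are filtered out before any value comparison, so no payoff, however large, can bring them back. The names `choose_act`, `expected_value`, and `is_forbidden` are hypothetical.

```python
from typing import Callable, Iterable, Optional


def choose_act(
    acts: Iterable[str],
    expected_value: Callable[[str], float],
    is_forbidden: Callable[[str], bool],
) -> Optional[str]:
    """Pick the highest-value act among those that are not forbidden.

    Hypothetical illustration only: the constraint is applied as a filter,
    so expected value is never traded off against it.
    """
    permitted = [act for act in acts if not is_forbidden(act)]
    if not permitted:
        return None  # no permissible act is available
    return max(permitted, key=expected_value)


# Made-up acts and values, purely for illustration.
values = {"disable_oversight": 1e12, "ask_for_help": 10.0, "do_nothing": 0.0}
forbidden = {"disable_oversight"}

chosen = choose_act(values, values.get, forbidden.__contains__)

# Inflating the forbidden act's value does not change the choice.
inflated = dict(values, disable_oversight=1e100)
chosen_inflated = choose_act(inflated, inflated.get, forbidden.__contains__)
```

In this sketch both `chosen` and `chosen_inflated` are `"ask_for_help"`, which is the sense in which the constraint is absolute rather than a large penalty that a big enough reward could outweigh.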
Related papers
- AI Safety: A Climb To Armageddon? [0.0]
The paper examines three response strategies: Optimism, Mitigation, and Holism.
The surprising robustness of the argument forces a re-examination of core assumptions around AI safety.
arXiv Detail & Related papers (2024-05-30T08:41:54Z)
- Societal Adaptation to Advanced AI [1.2607853680700076]
Existing strategies for managing risks from advanced AI systems often focus on affecting what AI systems are developed and how they diffuse.
We urge a complementary approach: increasing societal adaptation to advanced AI.
We introduce a conceptual framework which helps identify adaptive interventions that avoid, defend against and remedy potentially harmful uses of AI systems.
arXiv Detail & Related papers (2024-05-16T17:52:12Z)
- Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems.
There is a lack of consensus about how exactly such risks arise, and how to manage them.
Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z)
- Fairness in AI and Its Long-Term Implications on Society [68.8204255655161]
We take a closer look at AI fairness and analyze how lack of AI fairness can lead to deepening of biases over time.
We discuss how biased models can lead to more negative real-world outcomes for certain groups.
If the issues persist, they could be reinforced by interactions with other risks and have severe implications on society in the form of social unrest.
arXiv Detail & Related papers (2023-04-16T11:22:59Z)
- AI Maintenance: A Robustness Perspective [91.28724422822003]
We introduce highlighted robustness challenges in the AI lifecycle and motivate AI maintenance by making analogies to car maintenance.
We propose an AI model inspection framework to detect and mitigate robustness risks.
Our proposal for AI maintenance facilitates robustness assessment, status tracking, risk scanning, model hardening, and regulation throughout the AI lifecycle.
arXiv Detail & Related papers (2023-01-08T15:02:38Z)
- Examining the Differential Risk from High-level Artificial Intelligence and the Question of Control [0.0]
The extent and scope of future AI capabilities remain a key uncertainty.
There are concerns over the extent of integration and oversight of AI opaque decision processes.
This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis.
arXiv Detail & Related papers (2022-11-06T15:46:02Z)
- Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
- Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being.
For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z)
- Building Bridges: Generative Artworks to Explore AI Ethics [56.058588908294446]
In recent years, there has been an increased emphasis on understanding and mitigating adverse impacts of artificial intelligence (AI) technologies on society.
A significant challenge in the design of ethical AI systems is that there are multiple stakeholders in the AI pipeline, each with their own set of constraints and interests.
This position paper outlines some potential ways in which generative artworks can play this role by serving as accessible and powerful educational tools.
arXiv Detail & Related papers (2021-06-25T22:31:55Z)
- AI Failures: A Review of Underlying Issues [0.0]
We focus on AI failures on account of flaws in conceptualization, design and deployment.
We find that AI systems fail on account of omission and commission errors in the design of the AI system.
An AI system is quite likely to fail in situations where, in effect, it is called upon to deliver moral judgments.
arXiv Detail & Related papers (2020-07-18T15:31:29Z)