Related papers: Absolutist AI

Absolutist AI

URL: http://arxiv.org/abs/2307.10315v1
Date: Wed, 19 Jul 2023 03:40:37 GMT
Title: Absolutist AI
Authors: Mitchell Barrington
Abstract summary: Training AI systems with absolute constraints may make considerable progress on many AI safety problems. It provides a guardrail for avoiding the very worst outcomes of misalignment. It could prevent AIs from causing catastrophes for the sake of very valuable consequences.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper argues that training AI systems with absolute constraints -- which forbid certain acts irrespective of the amount of value they might produce -- may make considerable progress on many AI safety problems in principle. First, it provides a guardrail for avoiding the very worst outcomes of misalignment. Second, it could prevent AIs from causing catastrophes for the sake of very valuable consequences, such as replacing humans with a much larger number of beings living at a higher welfare level. Third, it makes systems more corrigible, allowing creators to make corrective interventions in them, such as altering their objective functions or shutting them down. And fourth, it helps systems explore their environment more safely by prohibiting them from exploring especially dangerous acts. I offer a decision-theoretic formalization of an absolute constraints, improving on existing models in the literature, and use this model to prove some results about the training and behavior of absolutist AIs. I conclude by showing that, although absolutist AIs will not maximize expected value, they will not be susceptible to behave irrationally, and they will not (contra coherence arguments) face environmental pressure to become expected-value maximizers.

Related papers

Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization [53.80919781981027]
Key requirements for trustworthy AI can be translated into design choices for the components of empirical risk minimization. We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthiness of AI.
arXiv Detail & Related papers (2024-10-25T07:53:32Z)
Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI [67.58673784790375]
We argue that the 'bigger is better' AI paradigm is not only fragile scientifically, but comes with undesirable consequences. First, it is not sustainable, as its compute demands increase faster than model performance, leading to unreasonable economic requirements and a disproportionate environmental footprint. Second, it implies focusing on certain problems at the expense of others, leaving aside important applications, e.g. health, education, or the climate.
arXiv Detail & Related papers (2024-09-21T14:43:54Z)
AI Consciousness and Public Perceptions: Four Futures [0.0]
We investigate whether future human society will broadly believe advanced AI systems to be conscious. We identify four major risks: AI suffering, human disempowerment, geopolitical instability, and human depravity. The paper concludes with the main recommendations to avoid research aimed at intentionally creating conscious AI.
arXiv Detail & Related papers (2024-08-08T22:01:57Z)
AI Safety: A Climb To Armageddon? [0.0]
The paper examines three response strategies: Optimism, Mitigation, and Holism. The surprising robustness of the argument forces a re-examination of core assumptions around AI safety.
arXiv Detail & Related papers (2024-05-30T08:41:54Z)
Managing extreme AI risks amid rapid progress [171.05448842016125]
We describe risks that include large-scale social harms, malicious uses, and irreversible loss of human control over autonomous AI systems. There is a lack of consensus about how exactly such risks arise, and how to manage them. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems.
arXiv Detail & Related papers (2023-10-26T17:59:06Z)
Fairness in AI and Its Long-Term Implications on Society [68.8204255655161]
We take a closer look at AI fairness and analyze how lack of AI fairness can lead to deepening of biases over time. We discuss how biased models can lead to more negative real-world outcomes for certain groups. If the issues persist, they could be reinforced by interactions with other risks and have severe implications on society in the form of social unrest.
arXiv Detail & Related papers (2023-04-16T11:22:59Z)
Examining the Differential Risk from High-level Artificial Intelligence and the Question of Control [0.0]
The extent and scope of future AI capabilities remain a key uncertainty. There are concerns over the extent of integration and oversight of AI opaque decision processes. This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis.
arXiv Detail & Related papers (2022-11-06T15:46:02Z)
Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations. It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z)
AI Failures: A Review of Underlying Issues [0.0]
We focus on AI failures on account of flaws in conceptualization, design and deployment. We find that AI systems fail on account of omission and commission errors in the design of the AI system. An AI system is quite likely to fail in situations where, in effect, it is called upon to deliver moral judgments.
arXiv Detail & Related papers (2020-07-18T15:31:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.