Related papers: On the Ethics of Building AI in a Responsible Manner

URL: http://arxiv.org/abs/2004.04644v1
Date: Mon, 30 Mar 2020 04:11:08 GMT
Title: On the Ethics of Building AI in a Responsible Manner
Authors: Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua
Abstract summary: We argue that a formalism of AI alignment that does not distinguish between strategic and agnostic misalignments is not useful. We propose a definition of a strategic-AI-alignment and prove that most machine learning algorithms that are being used in practice today do not suffer from the strategic-AI-alignment problem.
Score: 22.792375902000614
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The AI-alignment problem arises when there is a discrepancy between the goals that a human designer specifies to an AI learner and a potential catastrophic outcome that does not reflect what the human designer really wants. We argue that a formalism of AI alignment that does not distinguish between strategic and agnostic misalignments is not useful, as it deems all technology as un-safe. We propose a definition of a strategic-AI-alignment and prove that most machine learning algorithms that are being used in practice today do not suffer from the strategic-AI-alignment problem. However, without being careful, today's technology might lead to strategic misalignment.

Related papers

Alignment, Agency and Autonomy in Frontier AI: A Systems Engineering Perspective [0.0]
Concepts of alignment, agency, and autonomy have become central to AI safety, governance, and control. This paper traces the historical, philosophical, and technical evolution of these concepts, emphasizing how their definitions influence AI development, deployment, and oversight.
arXiv Detail & Related papers (2025-02-20T21:37:20Z)
Imagining and building wise machines: The centrality of AI metacognition [78.76893632793497]
We argue that shortcomings stem from one overarching failure: AI systems lack wisdom. While AI research has focused on task-level strategies, metacognition is underdeveloped in AI systems. We propose that integrating metacognitive capabilities into AI systems is crucial for enhancing their robustness, explainability, cooperation, and safety.
arXiv Detail & Related papers (2024-11-04T18:10:10Z)
Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks [55.2480439325792]
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act) using insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence. As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z)
Rolling in the deep of cognitive and AI biases [1.556153237434314]
We argue that there is an urgent need to understand AI as a sociotechnical system, inseparable from the conditions in which it is designed, developed, and deployed. We address this critical issue by following a radical new methodology under which human cognitive biases become core entities in our AI fairness overview. We introduce a new mapping, which justifies the human-to-AI biases, and we detect relevant fairness intensities and inter-dependencies.
arXiv Detail & Related papers (2024-07-30T21:34:04Z)
Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks. We show that the learned AI control system demonstrates robustness against adversarial tampering. In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
The AI Alignment Paradox [10.674155943520729]
The better we align AI models with our values, the easier we may make it for adversaries to misalign the models. With AI's increasing real-world impact, it is imperative that a broad community of researchers be aware of the AI alignment paradox.
arXiv Detail & Related papers (2024-05-31T14:06:24Z)
Learning to Make Adherence-Aware Advice [8.419688203654948]
This paper presents a sequential decision-making model that takes into account the human's adherence level. We provide learning algorithms that learn the optimal advice policy and make advice only at critical time stamps.
arXiv Detail & Related papers (2023-10-01T23:15:55Z)
Seamful XAI: Operationalizing Seamful Design in Explainable AI [59.89011292395202]
Mistakes in AI systems are inevitable, arising from both technical limitations and sociotechnical gaps. We propose that seamful design can foster AI explainability by revealing sociotechnical and infrastructural mismatches. We explore this process with 43 AI practitioners and real end-users.
arXiv Detail & Related papers (2022-11-12T21:54:05Z)
A User-Centred Framework for Explainable Artificial Intelligence in Human-Robot Interaction [70.11080854486953]
We propose a user-centred framework for XAI that focuses on its social-interactive aspect. The framework aims to provide a structure for interactive XAI solutions intended for non-expert users.
arXiv Detail & Related papers (2021-09-27T09:56:23Z)
Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z)
Socially Responsible AI Algorithms: Issues, Purposes, and Challenges [31.382000425295885]
Technologists and AI researchers have a responsibility to develop trustworthy AI systems. To build long-lasting trust between AI and human beings, we argue that the key is to think beyond algorithmic fairness.
arXiv Detail & Related papers (2021-01-01T17:34:42Z)
AI Failures: A Review of Underlying Issues [0.0]
We focus on AI failures on account of flaws in conceptualization, design and deployment. We find that AI systems fail on account of omission and commission errors in the design of the AI system. An AI system is quite likely to fail in situations where, in effect, it is called upon to deliver moral judgments.
arXiv Detail & Related papers (2020-07-18T15:31:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.