AI Alignment in the Design of Interactive AI: Specification Alignment,
Process Alignment, and Evaluation Support
- URL: http://arxiv.org/abs/2311.00710v1
- Date: Mon, 23 Oct 2023 14:33:11 GMT
- Authors: Michael Terry, Chinmay Kulkarni, Martin Wattenberg, Lucas Dixon,
Meredith Ringel Morris
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI alignment considers the overall problem of ensuring an AI produces desired
outcomes, without undesirable side effects. While often considered from the
perspectives of safety and human values, AI alignment can also be considered in
the context of designing and evaluating interfaces for interactive AI systems.
This paper maps concepts from AI alignment onto a basic, three-step interaction
cycle, yielding a corresponding set of alignment objectives: 1) specification
alignment: ensuring the user can efficiently and reliably communicate
objectives to the AI, 2) process alignment: providing the ability to verify and
optionally control the AI's execution process, and 3) evaluation support:
ensuring the user can verify and understand the AI's output. We also introduce
the concepts of a surrogate process, defined as a simplified, separately
derived, but controllable representation of the AI's actual process; and the
notion of a Process Gulf, which highlights how differences between human and AI
processes can lead to challenges in AI control. To illustrate the value of this
framework, we describe commercial and research systems along each of the three
alignment dimensions, and show how interfaces that provide interactive
alignment mechanisms can lead to qualitatively different and improved user
experiences.
Related papers
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209] (2024-07-03)
  We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks. We show that the learned AI control system demonstrates robustness against adversarial tampering. In a training/learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
- Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions [101.67121669727354] (2024-06-13)
  Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. The lack of clear definitions and scopes for human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve it.
- Contestable AI needs Computational Argumentation [15.15970495693702] (2024-05-17)
  State-of-the-art approaches predominantly neglect the need for AI systems to be contestable. We argue that contestable AI requires dynamic (human-machine and/or machine-machine) explainability and decision-making processes.
- The Foundations of Computational Management: A Systematic Approach to Task Automation for the Integration of Artificial Intelligence into Existing Workflows [55.2480439325792] (2024-02-07)
  This article introduces Computational Management, a systematic approach to task automation. It offers three easy step-by-step procedures to begin implementing AI within a workflow.
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982] (2022-06-08)
  This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them. We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives. We identify and propose the roles AI regulation should take to make the AI Act a success in terms of AI fairness concerns.
- A Human-Centric Assessment Framework for AI [11.065260433086024] (2022-05-25)
  There is no agreed standard on how explainable AI systems should be assessed. Inspired by the Turing test, we introduce a human-centric assessment framework. This setup can serve as a framework for a wide range of human-centric AI system assessments.
- Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816] (2022-01-26)
  Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations. It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
- An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948] (2021-05-07)
  This paper proposes a comprehensive analysis of existing concepts of intelligence coming from different disciplines. The aim is to identify shared notions or discrepancies to consider when qualifying AI systems.
- Artificial Intelligence, Values and Alignment [2.28438857884398] (2020-01-13)
  The normative and technical aspects of the AI alignment problem are interrelated. It is important to be clear about the goal of alignment. The central challenge for theorists is not to identify 'true' moral principles for AI.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.