A Multi-Level Framework for the AI Alignment Problem
- URL: http://arxiv.org/abs/2301.03740v1
- Date: Tue, 10 Jan 2023 01:09:07 GMT
- Title: A Multi-Level Framework for the AI Alignment Problem
- Authors: Betty Li Hou, Brian Patrick Green
- Abstract summary: We present a framework to consider the question at four levels: Individual, Organizational, National, and Global.
We outline key questions and considerations of each level and demonstrate an application of this framework to the topic of AI content moderation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI alignment considers how we can build AI systems whose behavior is
compatible with human values. The normative side of this problem asks what
moral values or principles, if any, we should encode in AI. To this end, we
present a framework to consider the question at four levels: Individual,
Organizational, National, and Global. We aim to illustrate how AI alignment is
made up of value alignment problems at each of these levels, where values at
each level affect the others and effects can flow in either direction. We
outline key questions and considerations of each level and demonstrate an
application of this framework to the topic of AI content moderation.
Related papers
- Aligning Generalisation Between Humans and Machines
AI technology can support humans in scientific discovery and decision-making, but may also disrupt democracies and target individuals. The responsible use of AI and its participation in human-AI teams increasingly shows the need for AI alignment. A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalise.
arXiv Detail & Related papers (2024-11-23T18:36:07Z)
- Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act).
It uses insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence.
As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z)
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment
We find "alignment" a problem related to the challenges of expressing human goals and values in a manner that artificial systems can follow without leading to unwanted adversarial effects.
This work addresses alignment as a technical-philosophical problem that requires solid philosophical foundations and practical implementations that bring normative theory to AI system development.
arXiv Detail & Related papers (2024-06-16T18:37:31Z)
- Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions
Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment.
The lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment.
We introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), and Machine Learning (ML).
arXiv Detail & Related papers (2024-06-13T16:03:25Z)
- Foundational Moral Values for AI Alignment
We present five core, foundational values, drawn from moral philosophy and built on the requisites for human existence: survival, sustainable intergenerational existence, society, education, and truth.
We show that these values not only provide a clearer direction for technical alignment work, but also serve as a framework to highlight threats and opportunities from AI systems to both obtain and sustain these values.
arXiv Detail & Related papers (2023-11-28T18:11:24Z)
- AI Alignment: A Comprehensive Survey
AI alignment aims to make AI systems behave in line with human intentions and values.
We identify four principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality.
We decompose current alignment research into two key components: forward alignment and backward alignment.
arXiv Detail & Related papers (2023-10-30T15:52:15Z)
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them.
We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives.
We identify and propose the roles AI Regulation should take to make the endeavor of the AI Act a success in terms of AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
- Metaethical Perspectives on 'Benchmarking' AI Ethics
Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research.
An increasingly prominent research area in AI is ethics, which currently has no set of benchmarks nor commonly accepted way for measuring the 'ethicality' of an AI system.
We argue that it makes more sense to talk about 'values' rather than 'ethics' when considering the possible actions of present and future AI systems.
arXiv Detail & Related papers (2022-04-11T14:36:39Z)
- The Challenge of Value Alignment: from Fairer Algorithms to AI Safety
This paper addresses the question of how to align AI systems with human values, and situates this question within a wider body of thought regarding technology and value.
arXiv Detail & Related papers (2021-01-15T11:03:15Z)
- Artificial Intelligence, Values and Alignment
The normative and technical aspects of the AI alignment problem are interrelated.
It is important to be clear about the goal of alignment.
The central challenge for theorists is not to identify 'true' moral principles for AI.
arXiv Detail & Related papers (2020-01-13T10:32:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.