Artificial Intelligence, Values and Alignment
- URL: http://arxiv.org/abs/2001.09768v2
- Date: Mon, 5 Oct 2020 12:03:19 GMT
- Title: Artificial Intelligence, Values and Alignment
- Authors: Iason Gabriel
- Abstract summary: Normative and technical aspects of the AI alignment problem are interrelated.
It is important to be clear about the goal of alignment.
The central challenge for theorists is not to identify 'true' moral principles for AI.
- Score: 2.28438857884398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper looks at philosophical questions that arise in the context of AI
alignment. It defends three propositions. First, normative and technical
aspects of the AI alignment problem are interrelated, creating space for
productive engagement between people working in both domains. Second, it is
important to be clear about the goal of alignment. There are significant
differences between AI that aligns with instructions, intentions, revealed
preferences, ideal preferences, interests and values. A principle-based
approach to AI alignment, which combines these elements in a systematic way,
has considerable advantages in this context. Third, the central challenge for
theorists is not to identify 'true' moral principles for AI; rather, it is to
identify fair principles for alignment, that receive reflective endorsement
despite widespread variation in people's moral beliefs. The final part of the
paper explores three ways in which fair principles for AI alignment could
potentially be identified.
Related papers
- Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks [55.2480439325792]
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act).
It uses insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence.
As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z)
- Beyond Preferences in AI Alignment [15.878773061188516]
We characterize and challenge the preferentist approach to AI alignment.
We show how preferences fail to capture the thick semantic content of human values.
We argue that AI systems should be aligned with normative standards appropriate to their social roles.
arXiv Detail & Related papers (2024-08-30T03:14:20Z)
- Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions [101.67121669727354]
Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment.
The lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment.
We introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), Machine Learning (ML)
arXiv Detail & Related papers (2024-06-13T16:03:25Z)
- AI Alignment: A Comprehensive Survey [70.35693485015659]
AI alignment aims to make AI systems behave in line with human intentions and values.
We identify four principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality.
We decompose current alignment research into two key components: forward alignment and backward alignment.
arXiv Detail & Related papers (2023-10-30T15:52:15Z)
- Factoring the Matrix of Domination: A Critical Review and Reimagination of Intersectionality in AI Fairness [55.037030060643126]
Intersectionality is a critical framework that allows us to examine how social inequalities persist.
We argue that adopting intersectionality as an analytical framework is pivotal to effectively operationalizing fairness.
arXiv Detail & Related papers (2023-03-16T21:02:09Z)
- Beyond Bias and Compliance: Towards Individual Agency and Plurality of Ethics in AI [0.0]
We argue that the way data is labeled plays an essential role in the way AI behaves.
We propose an alternative path that allows for the plurality of values and the freedom of individual expression.
arXiv Detail & Related papers (2023-02-23T16:33:40Z)
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982]
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them.
We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives.
We identify and propose the roles AI Regulation should take to make the endeavor of the AI Act a success in terms of AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
- Metaethical Perspectives on 'Benchmarking' AI Ethics [81.65697003067841]
Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research.
An increasingly prominent research area in AI is ethics, which currently has neither a set of benchmarks nor a commonly accepted way of measuring the 'ethicality' of an AI system.
We argue that it makes more sense to talk about 'values' rather than 'ethics' when considering the possible actions of present and future AI systems.
arXiv Detail & Related papers (2022-04-11T14:36:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.