The Challenge of Value Alignment: from Fairer Algorithms to AI Safety
- URL: http://arxiv.org/abs/2101.06060v2
- Date: Mon, 18 Jan 2021 11:36:10 GMT
- Title: The Challenge of Value Alignment: from Fairer Algorithms to AI Safety
- Authors: Iason Gabriel and Vafa Ghazavi
- Abstract summary: This paper addresses the question of how to align AI systems with human values and situates that question within a wider body of thought regarding technology and value.
- Score: 2.28438857884398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the question of how to align AI systems with human
values and situates it within a wider body of thought regarding technology and
value. Far from existing in a vacuum, there has long been an interest in the
ability of technology to 'lock-in' different value systems. There has also been
considerable thought about how to align technologies with specific social
values, including through participatory design-processes. In this paper we look
more closely at the question of AI value alignment and suggest that the power
and autonomy of AI systems gives rise to opportunities and challenges in the
domain of value that have not been encountered before. Drawing important
continuities between the work of the fairness, accountability, transparency and
ethics community, and work being done by technical AI safety researchers, we
suggest that more attention needs to be paid to the question of 'social value
alignment' - that is, how to align AI systems with the plurality of values
endorsed by groups of people, especially on the global level.
Related papers
- Imagining and building wise machines: The centrality of AI metacognition [78.76893632793497]
We argue that these shortcomings stem from one overarching failure: AI systems lack wisdom.
While AI research has focused on task-level strategies, metacognition is underdeveloped in AI systems.
We propose that integrating metacognitive capabilities into AI systems is crucial for enhancing their robustness, explainability, cooperation, and safety.
arXiv Detail & Related papers (2024-11-04T18:10:10Z)
- ValueCompass: A Framework of Fundamental Values for Human-AI Alignment [15.35489011078817]
We introduce ValueCompass, a framework of fundamental values grounded in psychological theory and a systematic review.
We apply Value to measure the value alignment of humans and language models (LMs) across four real-world vignettes.
Our findings uncover risky misalignment between humans and LMs, such as LMs endorsing values like "Choose Own Goals" that humans largely reject.
arXiv Detail & Related papers (2024-09-15T02:13:03Z)
- Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions [101.67121669727354]
Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment.
The lack of clear definitions and scope for human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment.
We introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), and Machine Learning (ML).
arXiv Detail & Related papers (2024-06-13T16:03:25Z)
- Are You Worthy of My Trust?: A Socioethical Perspective on the Impacts of Trustworthy AI Systems on the Environment and Human Society [0.47138177023764666]
We offer a brief, high-level overview of societal impacts of AI systems.
We highlight the need for multi-disciplinary governance and convergence throughout the AI lifecycle.
arXiv Detail & Related papers (2023-09-18T03:07:47Z)
- Connecting the Dots in Trustworthy Artificial Intelligence: From AI Principles, Ethics, and Key Requirements to Responsible AI Systems and Regulation [22.921683578188645]
We argue that attaining truly trustworthy AI concerns the trustworthiness of all processes and actors that are part of the system's life cycle.
A more holistic vision contemplates four essential axes, including the global principles for the ethical use and development of AI-based systems, a philosophical take on AI ethics, and a risk-based approach to AI regulation.
Our multidisciplinary vision of trustworthy AI culminates in a debate on recently published diverging views about the future of AI.
arXiv Detail & Related papers (2023-05-02T09:49:53Z)
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982]
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them.
We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives.
We identify and propose the roles AI regulation should play in making the AI Act a success with respect to AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
- Metaethical Perspectives on 'Benchmarking' AI Ethics [81.65697003067841]
Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research.
An increasingly prominent research area in AI is ethics, which currently has neither a set of benchmarks nor a commonly accepted way of measuring the 'ethicality' of an AI system.
We argue that it makes more sense to talk about 'values' rather than 'ethics' when considering the possible actions of present and future AI systems.
arXiv Detail & Related papers (2022-04-11T14:36:39Z)
- Trustworthy AI: From Principles to Practices [44.67324097900778]
Many current AI systems have been found to be vulnerable to imperceptible attacks, biased against underrepresented groups, and lacking in user privacy protection.
In this review, we aim to provide AI practitioners with a comprehensive guide to building trustworthy AI systems.
To unify the current fragmented approaches towards trustworthy AI, we propose a systematic approach that considers the entire lifecycle of AI systems.
arXiv Detail & Related papers (2021-10-04T03:20:39Z)
- Building Bridges: Generative Artworks to Explore AI Ethics [56.058588908294446]
In recent years, there has been an increased emphasis on understanding and mitigating adverse impacts of artificial intelligence (AI) technologies on society.
A significant challenge in the design of ethical AI systems is that there are multiple stakeholders in the AI pipeline, each with their own set of constraints and interests.
This position paper outlines some potential ways in which generative artworks can help address this challenge by serving as accessible and powerful educational tools.
arXiv Detail & Related papers (2021-06-25T22:31:55Z)
- An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948]
This paper proposes a comprehensive analysis of existing concepts from different disciplines that tackle the notion of intelligence.
The aim is to identify shared notions or discrepancies to consider for qualifying AI systems.
arXiv Detail & Related papers (2021-05-07T12:01:31Z)
- AI loyalty: A New Paradigm for Aligning Stakeholder Interests [0.0]
We argue that AI loyalty should be considered during the technological design process alongside other important values in AI ethics.
We discuss a range of mechanisms that could support incorporation of AI loyalty into a variety of future AI systems.
arXiv Detail & Related papers (2020-03-24T23:55:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.