Foundational Moral Values for AI Alignment
- URL: http://arxiv.org/abs/2311.17017v1
- Date: Tue, 28 Nov 2023 18:11:24 GMT
- Title: Foundational Moral Values for AI Alignment
- Authors: Betty Li Hou, Brian Patrick Green
- Abstract summary: We present five core, foundational values, drawn from moral philosophy and built on the requisites for human existence: survival, sustainable intergenerational existence, society, education, and truth.
We show that these values not only provide a clearer direction for technical alignment work, but also serve as a framework to highlight threats and opportunities from AI systems to both obtain and sustain these values.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving the AI alignment problem requires having clear, defensible values
towards which AI systems can align. Currently, targets for alignment remain
underspecified and do not seem to be built from a philosophically robust
structure. We begin the discussion of this problem by presenting five core,
foundational values, drawn from moral philosophy and built on the requisites
for human existence: survival, sustainable intergenerational existence,
society, education, and truth. We show that these values not only provide a
clearer direction for technical alignment work, but also serve as a framework
to highlight threats and opportunities from AI systems to both obtain and
sustain these values.
Related papers
- Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks [55.2480439325792]
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act) using insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence.
As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z)
- ValueCompass: A Framework of Fundamental Values for Human-AI Alignment [15.35489011078817]
We introduce ValueCompass, a framework of fundamental values, grounded in psychological theory and a systematic review.
We apply Value to measure the value alignment of humans and language models (LMs) across four real-world vignettes.
Our findings uncover risky misalignment between humans and LMs, such as LMs agreeing with values like "Choose Own Goals" that humans largely reject.
arXiv Detail & Related papers (2024-09-15T02:13:03Z)
- Dynamic Normativity: Necessary and Sufficient Conditions for Value Alignment [0.0]
We frame "alignment" as the problem of expressing human goals and values in a manner that artificial systems can follow without leading to unwanted adversarial effects.
This work addresses alignment as a technical-philosophical problem that requires solid philosophical foundations and practical implementations that bring normative theory to AI system development.
arXiv Detail & Related papers (2024-06-16T18:37:31Z)
- Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions [101.67121669727354]
Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment.
The lack of clear definitions and scope for human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve it.
We introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), and Machine Learning (ML).
arXiv Detail & Related papers (2024-06-13T16:03:25Z)
- AI Alignment: A Comprehensive Survey [70.35693485015659]
AI alignment aims to make AI systems behave in line with human intentions and values.
We identify four principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality.
We decompose current alignment research into two key components: forward alignment and backward alignment.
arXiv Detail & Related papers (2023-10-30T15:52:15Z)
- Connecting the Dots in Trustworthy Artificial Intelligence: From AI Principles, Ethics, and Key Requirements to Responsible AI Systems and Regulation [22.921683578188645]
We argue that attaining truly trustworthy AI concerns the trustworthiness of all processes and actors that are part of the system's life cycle.
A more holistic vision contemplates four essential axes, including the global principles for ethical use and development of AI-based systems, a philosophical take on AI ethics, and a risk-based approach to AI regulation.
Our multidisciplinary vision of trustworthy AI culminates in a debate on the diverging views published lately about the future of AI.
arXiv Detail & Related papers (2023-05-02T09:49:53Z)
- Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982]
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them.
We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives.
We identify and propose the roles AI regulation should play to make the AI Act a success with respect to AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
- Metaethical Perspectives on 'Benchmarking' AI Ethics [81.65697003067841]
Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research.
An increasingly prominent research area in AI is ethics, which currently has neither a set of benchmarks nor a commonly accepted way of measuring the 'ethicality' of an AI system.
We argue that it makes more sense to talk about 'values' rather than 'ethics' when considering the possible actions of present and future AI systems.
arXiv Detail & Related papers (2022-04-11T14:36:39Z)
- An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948]
This paper proposes a comprehensive analysis of existing concepts coming from different disciplines tackling the notion of intelligence.
The aim is to identify shared notions or discrepancies to consider for qualifying AI systems.
arXiv Detail & Related papers (2021-05-07T12:01:31Z)
- The Challenge of Value Alignment: from Fairer Algorithms to AI Safety [2.28438857884398]
This paper addresses the question of how to align AI systems with human values and situates this question within a wider body of thought regarding technology and value.
arXiv Detail & Related papers (2021-01-15T11:03:15Z)
- Artificial Intelligence, Values and Alignment [2.28438857884398]
The normative and technical aspects of the AI alignment problem are interrelated.
It is important to be clear about the goal of alignment.
The central challenge for theorists is not to identify 'true' moral principles for AI.
arXiv Detail & Related papers (2020-01-13T10:32:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.