Related papers: AI Alignment vs. AI Ethical Treatment: 10 Challenges

URL: http://arxiv.org/abs/2510.12844v1
Date: Tue, 14 Oct 2025 00:13:23 GMT
Title: AI Alignment vs. AI Ethical Treatment: 10 Challenges
Authors: Adam Bradley, Bradford Saad
Abstract summary: A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues these two dangers interact and that if we create AI systems that merit moral consideration, simultaneously avoiding both of these dangers would be extremely challenging.
Score: 0.2578242050187029
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues these two dangers interact and that if we create AI systems that merit moral consideration, simultaneously avoiding both of these dangers would be extremely challenging. While our argument is straightforward and supported by a wide range of pretheoretical moral judgments, it has far-reaching moral implications for AI development. Although the most obvious way to avoid the tension between alignment and ethical treatment would be to avoid creating AI systems that merit moral consideration, this option may be unrealistic and is perhaps fleeting. So, we conclude by offering some suggestions for other ways of mitigating mistreatment risks associated with alignment.

Related papers

Moral Responsibility or Obedience: What Do We Want from AI? [0.0]
This paper examines recent safety testing incidents involving large language models (LLMs) that appeared to disobey shutdown commands or engage in ethically ambiguous or illicit behavior. I argue that such behavior should not be interpreted as rogue or misaligned, but as early evidence of emerging ethical reasoning in agentic AI. I call for a shift in AI safety evaluation: away from rigid obedience and toward frameworks that can assess ethical judgment in systems capable of navigating moral dilemmas.
arXiv Detail & Related papers (2025-07-03T16:53:01Z)
Misalignment or misuse? The AGI alignment tradeoff [0.0]
We defend the view that misaligned AGI - future, generally intelligent (robotic) AI agents - poses catastrophic risks. We show that there is room for alignment approaches which do not increase misuse risk.
arXiv Detail & Related papers (2025-06-04T09:22:37Z)
Taking AI Welfare Seriously [0.5617572524191751]
We argue that there is a realistic possibility that some AI systems will be conscious and/or robustly agentic in the near future. AI welfare is therefore an issue for the near future, and AI companies and other actors have a responsibility to start taking it seriously.
arXiv Detail & Related papers (2024-11-04T17:57:57Z)
AI Alignment: A Comprehensive Survey [69.61425542486275]
AI alignment aims to make AI systems behave in line with human intentions and values. We identify four principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality. We decompose current alignment research into two key components: forward alignment and backward alignment.
arXiv Detail & Related papers (2023-10-30T15:52:15Z)
If our aim is to build morality into an artificial agent, how might we begin to go about doing so? [0.0]
We discuss the different aspects that should be considered when building moral agents, including the most relevant moral paradigms and challenges. We propose solutions including a hybrid approach to design and a hierarchical approach to combining moral paradigms.
arXiv Detail & Related papers (2023-10-12T12:56:12Z)
Fairness in AI and Its Long-Term Implications on Society [68.8204255655161]
We take a closer look at AI fairness and analyze how lack of AI fairness can lead to deepening of biases over time. We discuss how biased models can lead to more negative real-world outcomes for certain groups. If the issues persist, they could be reinforced by interactions with other risks and have severe implications on society in the form of social unrest.
arXiv Detail & Related papers (2023-04-16T11:22:59Z)
Beyond Bias and Compliance: Towards Individual Agency and Plurality of Ethics in AI [0.0]
We argue that the way data is labeled plays an essential role in the way AI behaves. We propose an alternative path that allows for the plurality of values and the freedom of individual expression.
arXiv Detail & Related papers (2023-02-23T16:33:40Z)
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment [96.77970239683475]
AI systems need to be able to understand, interpret and predict human moral judgments and decisions. A central challenge for AI safety is capturing the flexibility of the human moral mind. We present a novel challenge set consisting of rule-breaking question answering.
arXiv Detail & Related papers (2022-10-04T09:04:27Z)
Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation [61.77881142275982]
This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them. We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives. We identify and propose the roles AI Regulation should take to make the endeavor of the AI Act a success in terms of AI fairness concerns.
arXiv Detail & Related papers (2022-06-08T12:32:08Z)
Metaethical Perspectives on 'Benchmarking' AI Ethics [81.65697003067841]
Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research. An increasingly prominent research area in AI is ethics, which currently has no set of benchmarks nor commonly accepted way for measuring the 'ethicality' of an AI system. We argue that it makes more sense to talk about 'values' rather than 'ethics' when considering the possible actions of present and future AI systems.
arXiv Detail & Related papers (2022-04-11T14:36:39Z)
Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations. It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.