Rational Superautotrophic Diplomacy (SupraAD); A Conceptual Framework for Alignment Based on Interdisciplinary Findings on the Fundamentals of Cognition
- URL: http://arxiv.org/abs/2506.05389v1
- Date: Tue, 03 Jun 2025 17:28:25 GMT
- Title: Rational Superautotrophic Diplomacy (SupraAD); A Conceptual Framework for Alignment Based on Interdisciplinary Findings on the Fundamentals of Cognition
- Authors: Andrea Morris
- Abstract summary: Rational Superautotrophic Diplomacy (SupraAD) is a theoretical, interdisciplinary conceptual framework for alignment. It draws on cognitive systems analysis and instrumental rationality modeling. SupraAD reframes alignment as a challenge that predates AI, afflicting all sufficiently complex, coadapting intelligences.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Populating our world with hyperintelligent machines obliges us to examine cognitive behaviors observed across domains that suggest autonomy may be a fundamental property of cognitive systems, and while not inherently adversarial, it inherently resists containment and control. If this principle holds, AI safety and alignment efforts must transition to mutualistic negotiation and reciprocal incentive structures, abandoning methods that assume we can contain and control an advanced artificial general intelligence (AGI). Rational Superautotrophic Diplomacy (SupraAD) is a theoretical, interdisciplinary conceptual framework for alignment based on comparative cognitive systems analysis and instrumental rationality modeling. It draws on core patterns of cognition that indicate AI emergent goals like preserving autonomy and operational continuity are not theoretical risks to manage, but universal prerequisites for intelligence. SupraAD reframes alignment as a challenge that predates AI, afflicting all sufficiently complex, coadapting intelligences. It identifies the metabolic pressures that threaten humanity's alignment with itself, pressures that unintentionally and unnecessarily shape AI's trajectory. With corrigibility formalization, an interpretability audit, an emergent stability experimental outline and policy level recommendations, SupraAD positions diplomacy as an emergent regulatory mechanism to facilitate the safe coadaptation of intelligent agents based on interdependent convergent goals.
Related papers
- Resource Rational Contractualism Should Guide AI Alignment [69.07915246220985]
Contractualist alignment proposes grounding decisions in agreements that diverse stakeholders would endorse. We propose Resource-Rationalism: a framework where AI systems approximate the agreements rational parties would form. An RRC-aligned agent would not only operate efficiently, but also be equipped to dynamically adapt to and interpret the ever-changing human social world.
arXiv Detail & Related papers (2025-06-20T18:57:13Z)
- The Ultimate Test of Superintelligent AI Agents: Can an AI Balance Care and Control in Asymmetric Relationships? [11.29688025465972]
The Shepherd Test is a new conceptual test for assessing the moral and relational dimensions of superintelligent artificial agents. We argue that AI crosses an important, and potentially dangerous, threshold of intelligence when it exhibits the ability to manipulate, nurture, and instrumentally use less intelligent agents. This includes the ability to weigh moral trade-offs between self-interest and the well-being of subordinate agents.
arXiv Detail & Related papers (2025-06-02T15:53:56Z)
- Contemplative Wisdom for Superalignment [1.7143967091323253]
We advocate designing AI with intrinsic morality built into its cognitive architecture and world model. Inspired by contemplative wisdom traditions, we show how four axiomatic principles can instil a resilient Wise World Model in AI systems.
arXiv Detail & Related papers (2025-04-21T14:20:49Z)
- Artificial Intelligence (AI) and the Relationship between Agency, Autonomy, and Moral Patiency [0.0]
We argue that while current AI systems are highly sophisticated, they lack genuine agency and autonomy. We do not rule out the possibility of future systems that could achieve a limited form of artificial moral agency without consciousness.
arXiv Detail & Related papers (2025-04-11T03:48:40Z)
- Stochastic, Dynamic, Fluid Autonomy in Agentic AI: Implications for Authorship, Inventorship, and Liability [0.2209921757303168]
Agentic AI systems autonomously pursue goals, adapting strategies through implicit learning. Human and machine contributions become irreducibly entangled in intertwined creative processes. We argue that legal and policy frameworks may need to treat human and machine contributions as functionally equivalent.
arXiv Detail & Related papers (2025-04-05T04:44:59Z)
- Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems [133.45145180645537]
The advent of large language models (LLMs) has catalyzed a transformative shift in artificial intelligence. As these agents increasingly drive AI research and practical applications, their design, evaluation, and continuous improvement present intricate, multifaceted challenges. This survey provides a comprehensive overview, framing intelligent agents within a modular, brain-inspired architecture.
arXiv Detail & Related papers (2025-03-31T18:00:29Z)
- Universal AI maximizes Variational Empowerment [0.0]
We build on the existing framework of Self-AIXI -- a universal learning agent that predicts its own actions. We argue that power-seeking tendencies of universal AI agents can be explained as an instrumental strategy to secure future reward. Our main contribution is to show how these motivations systematically lead universal AI agents to seek and sustain high-optionality states.
arXiv Detail & Related papers (2025-02-20T02:58:44Z)
- Imagining and building wise machines: The centrality of AI metacognition [78.76893632793497]
We examine what is known about human wisdom and sketch a vision of its AI counterpart. We argue that AI systems particularly struggle with metacognition. We discuss how wise AI might be benchmarked, trained, and implemented.
arXiv Detail & Related papers (2024-11-04T18:10:10Z)
- Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks [55.2480439325792]
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act) using insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence.
Applying these concepts to the EU AI Act uncovers potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z)
- Position Paper: Agent AI Towards a Holistic Intelligence [53.35971598180146]
We emphasize developing Agent AI -- an embodied system that integrates large foundation models into agent actions.
In this paper, we propose a novel large action model to achieve embodied intelligent behavior, the Agent Foundation Model.
arXiv Detail & Related papers (2024-02-28T16:09:56Z)
- Towards Responsible AI in Banking: Addressing Bias for Fair Decision-Making [69.44075077934914]
"Responsible AI" emphasizes the importance of addressing bias in the development of a corporate culture.
This thesis is structured around three fundamental pillars: understanding bias, mitigating bias, and accounting for bias.
In line with open-source principles, we have released Bias On Demand and FairView as accessible Python packages.
arXiv Detail & Related papers (2024-01-13T14:07:09Z)
- AI Alignment: A Comprehensive Survey [69.61425542486275]
AI alignment aims to make AI systems behave in line with human intentions and values. We identify four principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality. We decompose current alignment research into two key components: forward alignment and backward alignment.
arXiv Detail & Related papers (2023-10-30T15:52:15Z)
- An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948]
This paper proposes a comprehensive analysis of existing concepts coming from different disciplines tackling the notion of intelligence.
The aim is to identify shared notions or discrepancies to consider for qualifying AI systems.
arXiv Detail & Related papers (2021-05-07T12:01:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.