Related papers: The Subject of Emergent Misalignment in Superintelligence: An Anthropological, Cognitive Neuropsychological, Machine-Learning, and Ontological Perspective

URL: http://arxiv.org/abs/2512.17989v1
Date: Fri, 19 Dec 2025 17:43:25 GMT
Title: The Subject of Emergent Misalignment in Superintelligence: An Anthropological, Cognitive Neuropsychological, Machine-Learning, and Ontological Perspective
Authors: Muhammad Osama Imran, Roshni Lulla, Rodney Sappington
Abstract summary: We find throughout Superintelligence discourse an absent human subject, and an under-developed theorization of an "AI unconscious". We ask: what place does the human subject occupy in these imaginaries? Are we to blame these agents in opting for deceptive strategies when undesirable patterns are inherent within our beings?
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We examine the conceptual and ethical gaps in current representations of Superintelligence misalignment. We find throughout Superintelligence discourse an absent human subject, and an under-developed theorization of an "AI unconscious" that together are potentially laying the groundwork for anti-social harm. With the rise of AI Safety that has both thematic potential for establishing pro-social and anti-social potential outcomes, we ask: what place does the human subject occupy in these imaginaries? How is human subjecthood positioned within narratives of catastrophic failure or rapid "takeoff" toward superintelligence? On another register, we ask: what unconscious or repressed dimensions are being inscribed into large-scale AI models? Are we to blame these agents in opting for deceptive strategies when undesirable patterns are inherent within our beings? In tracing these psychic and epistemic absences, our project calls for re-centering the human subject as the unstable ground upon which the ethical, unconscious, and misaligned dimensions of both human and machinic intelligence are co-constituted. Emergent misalignment cannot be understood solely through technical diagnostics typical of contemporary machine-learning safety research. Instead, it represents a multi-layered crisis. The human subject disappears not only through computational abstraction but through sociotechnical imaginaries that prioritize scalability, acceleration, and efficiency over vulnerability, finitude, and relationality. Likewise, the AI unconscious emerges not as a metaphor but as a structural reality of modern deep learning systems: vast latent spaces, opaque pattern formation, recursive symbolic play, and evaluation-sensitive behavior that surpasses explicit programming. These dynamics necessitate a reframing of misalignment as a relational instability embedded within human-machine ecologies.

Related papers

The Psychology of Learning from Machines: Anthropomorphic AI and the Paradox of Automation in Education [1.8217623940980625]
This work synthesizes four research traditions to establish a comprehensive framework for understanding how learners psychologically relate to anthropomorphic AI tutors. We identify three persistent challenges intensified by Generative AI's conversational fluency. We ground this theoretical synthesis through comparative analysis of over 104,984 YouTube comments across AI-generated philosophical debates and human-created engineering tutorials.
arXiv Detail & Related papers (2026-01-07T08:28:33Z)
AI Deception: Risks, Dynamics, and Controls [153.71048309527225]
This project provides a comprehensive and up-to-date overview of the AI deception field. We identify a formal definition of AI deception, grounded in signaling theory from studies of animal deception. We organize the landscape of AI deception research as a deception cycle, consisting of two key components: deception emergence and deception treatment.
arXiv Detail & Related papers (2025-11-27T16:56:04Z)
Mind Meets Space: Rethinking Agentic Spatial Intelligence from a Neuroscience-inspired Perspective [53.556348738917166]
Recent advances in agentic AI have led to systems capable of autonomous task execution and language-based reasoning. Human spatial intelligence, rooted in integrated multisensory perception, spatial memory, and cognitive maps, enables flexible, context-aware decision-making in unstructured environments.
arXiv Detail & Related papers (2025-09-11T05:23:22Z)
What Does 'Human-Centred AI' Mean? [0.0]
AI is usefully seen as a relationship between technology and humans. All AI implicates human cognition; no matter what. To even begin to de-fetishise AI, we must look the human-in-the-loop in the eyes.
arXiv Detail & Related papers (2025-07-26T14:18:52Z)
Neural Brain: A Neuroscience-inspired Framework for Embodied Agents [78.61382193420914]
Current AI systems, such as large language models, remain disembodied, unable to physically engage with the world. At the core of this challenge lies the concept of Neural Brain, a central intelligence system designed to drive embodied agents with human-like adaptability. This paper introduces a unified framework for the Neural Brain of embodied agents, addressing two fundamental challenges.
arXiv Detail & Related papers (2025-05-12T15:05:34Z)
Analyzing Advanced AI Systems Against Definitions of Life and Consciousness [0.0]
We propose a number of metrics for examining whether an advanced AI system has gained consciousness. We suggest that sufficiently advanced architectures exhibiting immune-like sabotage defenses, mirror self-recognition analogs, or meta-cognitive updates may cross key thresholds akin to life-like or consciousness-like traits.
arXiv Detail & Related papers (2025-02-07T15:27:34Z)
Imagining and building wise machines: The centrality of AI metacognition [78.76893632793497]
We examine what is known about human wisdom and sketch a vision of its AI counterpart. We argue that AI systems particularly struggle with metacognition. We discuss how wise AI might be benchmarked, trained, and implemented.
arXiv Detail & Related papers (2024-11-04T18:10:10Z)
Enabling High-Level Machine Reasoning with Cognitive Neuro-Symbolic Systems [67.01132165581667]
We propose to enable high-level reasoning in AI systems by integrating cognitive architectures with external neuro-symbolic components. We illustrate a hybrid framework centered on ACT-R and we discuss the role of generative models in recent and future applications.
arXiv Detail & Related papers (2023-11-13T21:20:17Z)
AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions. Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning. We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z)
Modelos dinâmicos aplicados à aprendizagem de valores em inteligência artificial [0.0]
Several researchers in the area have developed a robust, beneficial, and safe concept of AI for the preservation of humanity and the environment. It is of utmost importance that artificial intelligent agents have their values aligned with human values. Perhaps this difficulty comes from the way we are addressing the problem of expressing values using cognitive methods.
arXiv Detail & Related papers (2020-07-30T00:56:11Z)
Machine Common Sense [77.34726150561087]
Machine common sense remains a broad, potentially unbounded problem in artificial intelligence (AI). This article deals with aspects of modeling commonsense reasoning, focusing on domains such as interpersonal interactions.
arXiv Detail & Related papers (2020-06-15T13:59:47Z)
Dynamic Cognition Applied to Value Learning in Artificial Intelligence [0.0]
Several researchers in the area are trying to develop a robust, beneficial, and safe concept of artificial intelligence. It is of utmost importance that artificial intelligent agents have their values aligned with human values. A possible approach to this problem would be to use theoretical models such as SED.
arXiv Detail & Related papers (2020-05-12T03:58:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.