Related papers: Beyond the Black Box: A Cognitive Architecture for Explainable and Aligned AI

Beyond the Black Box: A Cognitive Architecture for Explainable and Aligned AI

URL: http://arxiv.org/abs/2512.03072v1
Date: Thu, 27 Nov 2025 12:42:54 GMT
Title: Beyond the Black Box: A Cognitive Architecture for Explainable and Aligned AI
Authors: Hu Keyi,
Abstract summary: "Weight-Calculatism" is a novel cognitive architecture grounded in first principles.<n>Decision-making is formalized through an interpretable Weight-Calculation model.<n>Results indicate that the architecture achieves transparent, human-like reasoning and robust learning in unprecedented scenarios.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Current AI paradigms, as "architects of experience," face fundamental challenges in explainability and value alignment. This paper introduces "Weight-Calculatism," a novel cognitive architecture grounded in first principles, and demonstrates its potential as a viable pathway toward Artificial General Intelligence (AGI). The architecture deconstructs cognition into indivisible Logical Atoms and two fundamental operations: Pointing and Comparison. Decision-making is formalized through an interpretable Weight-Calculation model (Weight = Benefit * Probability), where all values are traceable to an auditable set of Initial Weights. This atomic decomposition enables radical explainability, intrinsic generality for novel situations, and traceable value alignment. We detail its implementation via a graph-algorithm-based computational engine and a global workspace workflow, supported by a preliminary code implementation and scenario validation. Results indicate that the architecture achieves transparent, human-like reasoning and robust learning in unprecedented scenarios, establishing a practical and theoretical foundation for building trustworthy and aligned AGI.

Related papers

Toward IIT-Inspired Consciousness in LLMs: A Reward-Based Learning Framework [7.582178041791117]
This paper investigates the implementation of a leading theory of consciousness, Integrated Information Theory (IIT), within language models via a reward-based learning paradigm.<n>We formulate a novel reward function that quantifies a text's causality, coherence and integration, characteristics associated with conscious processing.<n>On out of domain tasks, careful tuning achieves up to a 31% reduction in output length while preserving accuracy levels comparable to the base model.
arXiv Detail & Related papers (2026-01-30T10:07:58Z)
Foundations of Artificial Intelligence Frameworks: Notion and Limits of AGI [0.0]
We argue that artificial general intelligence cannot emerge from current neural network paradigms regardless of scale.<n>We propose a framework distinguishing existential facilities (computational substrate) from architectural organization.
arXiv Detail & Related papers (2025-11-23T16:18:13Z)
Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models [10.1080193179562]
Current understanding models excel at recognizing "what" but fall short in high-level cognitive tasks like causal reasoning and future prediction.<n>We propose a novel framework that fuses a powerful Vision Foundation Model for deep visual perception with a Large Language Model (LLM) serving as a knowledge-driven reasoning core.
arXiv Detail & Related papers (2025-07-08T09:43:17Z)
KERAIA: An Adaptive and Explainable Framework for Dynamic Knowledge Representation and Reasoning [46.85451489222176]
KERAIA is a novel framework and software platform for symbolic knowledge engineering.<n>It addresses the persistent challenges of representing, reasoning with, and executing knowledge in dynamic, complex, and context-sensitive environments.
arXiv Detail & Related papers (2025-05-07T10:56:05Z)
Computational Reasoning of Large Language Models [51.629694188014064]
We introduce textbfTuring Machine Bench, a benchmark to assess the ability of Large Language Models (LLMs) to execute reasoning processes.<n> TMBench incorporates four key features: self-contained and knowledge-agnostic reasoning, a minimalistic multi-step structure, controllable difficulty, and a theoretical foundation based on Turing machine.
arXiv Detail & Related papers (2025-04-29T13:52:47Z)
Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data, and intelligence. Recent trends demonstrate the potential homogeneity of these two fields. We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
Data Science Principles for Interpretable and Explainable AI [0.7581664835990121]
Interpretable and interactive machine learning aims to make complex models more transparent and controllable. This review synthesizes key principles from the growing literature in this field.
arXiv Detail & Related papers (2024-05-17T05:32:27Z)
Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales [54.78115855552886]
We show how to construct over-complete invariants with a Convolutional Neural Networks (CNN)-like hierarchical architecture. With the over-completeness, discriminative features w.r.t. the task can be adaptively formed in a Neural Architecture Search (NAS)-like manner. For robust and interpretable vision tasks at larger scales, hierarchical invariant representation can be considered as an effective alternative to traditional CNN and invariants.
arXiv Detail & Related papers (2024-02-23T16:50:07Z)
On Binding Objects to Symbols: Learning Physical Concepts to Understand Real from Fake [155.6741526791004]
We revisit the classic signal-to-symbol barrier in light of the remarkable ability of deep neural networks to generate synthetic data. We characterize physical objects as abstract concepts and use the previous analysis to show that physical objects can be encoded by finite architectures. We conclude that binding physical entities to digital identities is possible in finite time with finite resources.
arXiv Detail & Related papers (2022-07-25T17:21:59Z)
AIGenC: An AI generalisation model via creativity [1.933681537640272]
Inspired by cognitive theories of creativity, this paper introduces a computational model (AIGenC) It lays down the necessary components to enable artificial agents to learn, use and generate transferable representations. We discuss the model's capability to yield better out-of-distribution generalisation in artificial agents.
arXiv Detail & Related papers (2022-05-19T17:43:31Z)
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges [50.22269760171131]
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. This text is concerned with exposing pre-defined regularities through unified geometric principles. It provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers.
arXiv Detail & Related papers (2021-04-27T21:09:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.