Related papers: Toward IIT-Inspired Consciousness in LLMs: A Reward-Based Learning Framework

Toward IIT-Inspired Consciousness in LLMs: A Reward-Based Learning Framework

URL: http://arxiv.org/abs/2601.22786v1
Date: Fri, 30 Jan 2026 10:07:58 GMT
Title: Toward IIT-Inspired Consciousness in LLMs: A Reward-Based Learning Framework
Authors: Hamid Reza Akbari, Mohammad Hossein Sameti, Amir M. Mansourian, Mohammad Hossein Rohban, Hossein Sameti,
Abstract summary: This paper investigates the implementation of a leading theory of consciousness, Integrated Information Theory (IIT), within language models via a reward-based learning paradigm.<n>We formulate a novel reward function that quantifies a text's causality, coherence and integration, characteristics associated with conscious processing.<n>On out of domain tasks, careful tuning achieves up to a 31% reduction in output length while preserving accuracy levels comparable to the base model.
Score: 7.582178041791117
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The pursuit of Artificial General Intelligence (AGI) is a central goal in language model development, in which consciousness-like processing could serve as a key facilitator. While current language models are not conscious, they exhibit behaviors analogous to certain aspects of consciousness. This paper investigates the implementation of a leading theory of consciousness, Integrated Information Theory (IIT), within language models via a reward-based learning paradigm. IIT provides a formal, axiom-based mathematical framework for quantifying consciousness. Drawing inspiration from its core principles, we formulate a novel reward function that quantifies a text's causality, coherence and integration, characteristics associated with conscious processing. Empirically, it is found that optimizing for this IIT-inspired reward leads to more concise text generation. On out of domain tasks, careful tuning achieves up to a 31% reduction in output length while preserving accuracy levels comparable to the base model. In addition to primary task performance, the broader effects of this training methodology on the model's confidence calibration and test-time computational scaling is analyzed. The proposed framework offers significant practical advantages: it is conceptually simple, computationally efficient, requires no external data or auxiliary models, and leverages a general, capability-driven signal rather than task-specific heuristics. Code available at https://github.com/MH-Sameti/LLM_PostTraining.git

Related papers

A Brain-like Synergistic Core in LLMs Drives Behaviour and Learning [50.68188138112555]
We show that large language models spontaneously develop synergistic cores.<n>We find that areas in middle layers exhibit synergistic processing while early and late layers rely on redundancy.<n>This convergence suggests that synergistic information processing is a fundamental property of intelligence.
arXiv Detail & Related papers (2026-01-11T10:48:35Z)
SEAL: Self-Evolving Agentic Learning for Conversational Question Answering over Knowledge Graphs [28.59157823781425]
SEAL is a novel two-stage semantic parsing framework grounded in self-evolving agentic learning.<n> SEAL achieves state-of-the-art performance, especially in multi-hop reasoning, comparison, and aggregation tasks.<n>The results validate notable gains in both structural accuracy and computational efficiency.
arXiv Detail & Related papers (2025-12-04T14:52:30Z)
Beyond the Black Box: A Cognitive Architecture for Explainable and Aligned AI [0.0]
"Weight-Calculatism" is a novel cognitive architecture grounded in first principles.<n>Decision-making is formalized through an interpretable Weight-Calculation model.<n>Results indicate that the architecture achieves transparent, human-like reasoning and robust learning in unprecedented scenarios.
arXiv Detail & Related papers (2025-11-27T12:42:54Z)
Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining.<n>This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning)<n>We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
A Theory of Inference Compute Scaling: Reasoning through Directed Stochastic Skill Search [15.387256204743407]
Large language models (LLMs) demand considerable computational, energy, and financial resources during both training and deployment.<n>Inference costs now represent a significant and growing component of the overall resource burden.<n>We introduce directed skill search (DS3), a general framework that represents inference as expressive over a learned skill graph.
arXiv Detail & Related papers (2025-06-10T14:47:48Z)
SA-GAT-SR: Self-Adaptable Graph Attention Networks with Symbolic Regression for high-fidelity material property prediction [1.4403769872061323]
We introduce a novel computational paradigm, Self-Adaptable Graph Attention Networks integrated with Symbolic Regression (SA-GAT-SR)<n>Our framework employs a self-adaptable encoding algorithm that automatically identifies and adjust attention weights so as to screen critical features from an expansive 180-dimensional feature space.<n>The integrated SR module subsequently distills these features into compact analytical expressions that explicitly reveal quantum-mechanically meaningful relationships.
arXiv Detail & Related papers (2025-05-01T16:05:10Z)
Generalising from Self-Produced Data: Model Training Beyond Human Constraints [0.0]
This paper introduces a novel framework in which AI models autonomously generate and validate new knowledge.<n>Central to this approach is an unbounded, ungamable numeric reward that guides learning without requiring human benchmarks.
arXiv Detail & Related papers (2025-04-07T03:48:02Z)
Learning Task Representations from In-Context Learning [67.66042137487287]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning (ICL)<n>We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.<n>The proposed method successfully extracts task-specific information from in-context demonstrations and excels in both text and regression tasks.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
Erasing Conceptual Knowledge from Language Models [24.63143961814566]
We introduce Erasure of Language Memory (ELM), a principled approach to concept-level unlearning.<n>ELM operates by matching distributions defined by the model's own introspective classification capabilities.<n>We demonstrate ELM's efficacy on biosecurity, cybersecurity, and literary domain erasure tasks.
arXiv Detail & Related papers (2024-10-03T17:59:30Z)
A Novel Neural-symbolic System under Statistical Relational Learning [47.30190559449236]
We propose a neural-symbolic framework based on statistical relational learning, referred to as NSF-SRL.<n>Results of symbolic reasoning are utilized to refine and correct the predictions made by deep learning models, while deep learning models enhance the efficiency of the symbolic reasoning process.<n>We believe that this approach sets a new standard for neural-symbolic systems and will drive future research in the field of general artificial intelligence.
arXiv Detail & Related papers (2023-09-16T09:15:37Z)
Neuro-Symbolic Artificial Intelligence (AI) for Intent based Semantic Communication [85.06664206117088]
6G networks must consider semantics and effectiveness (at end-user) of the data transmission. NeSy AI is proposed as a pillar for learning causal structure behind the observed data. GFlowNet is leveraged for the first time in a wireless system to learn the probabilistic structure which generates the data.
arXiv Detail & Related papers (2022-05-22T07:11:57Z)
Great Truths are Always Simple: A Rather Simple Knowledge Encoder for Enhancing the Commonsense Reasoning Capacity of Pre-Trained Models [89.98762327725112]
Commonsense reasoning in natural language is a desired ability of artificial intelligent systems. For solving complex commonsense reasoning tasks, a typical solution is to enhance pre-trained language models(PTMs) with a knowledge-aware graph neural network(GNN) encoder. Despite the effectiveness, these approaches are built on heavy architectures, and can't clearly explain how external knowledge resources improve the reasoning capacity of PTMs.
arXiv Detail & Related papers (2022-05-04T01:27:36Z)
WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model [74.4875156387271]
We develop a novel foundation model pre-trained with huge multimodal (visual and textual) data. We show that state-of-the-art results can be obtained on a wide range of downstream tasks.
arXiv Detail & Related papers (2021-10-27T12:25:21Z)
Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions. We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.