Exploration Through Introspection: A Self-Aware Reward Model
- URL: http://arxiv.org/abs/2601.03389v1
- Date: Tue, 06 Jan 2026 19:53:33 GMT
- Title: Exploration Through Introspection: A Self-Aware Reward Model
- Authors: Michael Petrowski, Milica Gašić
- Abstract summary: Evidence points to a unified system for self- and other-awareness. We explore this self-awareness by having reinforcement learning agents infer their own internal states in gridworld environments.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding how artificial agents model internal mental states is central to advancing Theory of Mind in AI. Evidence points to a unified system for self- and other-awareness. We explore this self-awareness by having reinforcement learning agents infer their own internal states in gridworld environments. Specifically, we introduce an introspective exploration component that is inspired by biological pain as a learning signal by utilizing a hidden Markov model to infer "pain-belief" from online observations. This signal is integrated into a subjective reward function to study how self-awareness affects the agent's learning abilities. Further, we use this computational framework to investigate the difference in performance between normal and chronic pain perception models. Results show that introspective agents in general significantly outperform standard baseline agents and can replicate complex human-like behaviors.
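The abstract's core mechanism can be illustrated with a minimal sketch: a two-state hidden Markov model whose "pain-belief" is updated online from observations and folded into a subjective reward. The transition matrix, emission probabilities, and mixing weight `alpha` below are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

# Two-state HMM over a hidden "pain" variable (0 = no-pain, 1 = pain).
# All numeric values are made up for illustration.
T = np.array([[0.9, 0.1],    # P(next state | no-pain)
              [0.2, 0.8]])   # P(next state | pain)
E = np.array([[0.8, 0.2],    # P(obs | no-pain): obs 0 = benign, 1 = noxious
              [0.3, 0.7]])   # P(obs | pain)

def update_belief(belief, obs):
    """One HMM forward step: predict with T, then condition on the observation."""
    predicted = belief @ T
    posterior = predicted * E[:, obs]
    return posterior / posterior.sum()

def subjective_reward(env_reward, belief, alpha=0.5):
    """Penalize the environment reward by the inferred pain-belief."""
    return env_reward - alpha * belief[1]

belief = np.array([0.5, 0.5])        # uniform prior over pain states
for obs in [1, 1, 0]:                # noxious, noxious, benign observations
    belief = update_belief(belief, obs)
r = subjective_reward(1.0, belief)   # reward shrinks as pain-belief grows
```

In this sketch the agent's learning signal is no longer the raw environment reward but a belief-weighted one, which is how an introspective signal could shape exploration; the paper's actual integration may differ.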
Related papers
- Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration? [83.13508919229939]
Theory of Space is defined as an agent's ability to actively acquire information through self-directed, active exploration. A key innovation is spatial belief probing, which prompts models to reveal their internal spatial representations at each step. Our findings suggest that current foundation models struggle to maintain coherent, revisable spatial beliefs during active exploration.
arXiv Detail & Related papers (2026-02-04T19:06:40Z) - Analyzing Advanced AI Systems Against Definitions of Life and Consciousness [0.0]
We propose a number of metrics for examining whether an advanced AI system has gained consciousness. We suggest that sufficiently advanced architectures exhibiting immune-like sabotage defenses, mirror self-recognition analogs, or meta-cognitive updates may cross key thresholds akin to life-like or consciousness-like traits.
arXiv Detail & Related papers (2025-02-07T15:27:34Z) - Probing for Consciousness in Machines [3.196204482566275]
This study explores the potential for artificial agents to develop core consciousness.
The emergence of core consciousness relies on the integration of a self model, informed by representations of emotions and feelings, and a world model.
Our results demonstrate that the agent can form rudimentary world and self models, suggesting a pathway toward developing machine consciousness.
arXiv Detail & Related papers (2024-11-25T10:27:07Z) - Kolb-Based Experiential Learning for Generalist Agents with Human-Level Kaggle Data Science Performance [81.05882480184587]
We propose a computational framework of Kolb's learning cycle with Vygotsky's ZPD for autonomous agents. With 9 gold, 8 silver, and 12 bronze medal-level performances - including 4 gold and 4 silver on prize-awarding competitions - Agent K is the 1st AI system to successfully integrate Kolb- and Vygotsky-inspired human cognitive learning.
arXiv Detail & Related papers (2024-11-05T23:55:23Z) - Incremental procedural and sensorimotor learning in cognitive humanoid robots [52.77024349608834]
This work presents a cognitive agent that can learn procedures incrementally.
We show the cognitive functions required in each substage and how adding new functions helps address tasks previously unsolved by the agent.
Results show that this approach is capable of solving complex tasks incrementally.
arXiv Detail & Related papers (2023-04-30T22:51:31Z) - Self-Emotion-Mediated Exploration in Artificial Intelligence Mirrors: Findings from Cognitive Psychology [0.08739101659113153]
This study proposes a learning framework for artificial agents to obtain an intrinsic exploratory drive. Data analysis scores dictate pride or surprise, in accordance with psychological studies on humans. Results show that the majority of agents demonstrate causal relationships between emotional states and exploration.
arXiv Detail & Related papers (2023-02-13T18:20:44Z) - From internal models toward metacognitive AI [0.0]
In the prefrontal cortex, a distributed executive network called the "cognitive reality monitoring network" orchestrates conscious involvement of generative-inverse model pairs.
A high responsibility signal is given to the pairs that best capture the external world.
Consciousness is determined by the entropy of responsibility signals across all pairs.
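The entropy-of-responsibility idea summarized above can be sketched in a few lines: each generative-inverse model pair receives a responsibility signal (here, a softmax over prediction accuracies), and the entropy of that distribution indicates how decisively one pair captures the external world. The accuracy values and the softmax parameterization are illustrative assumptions, not details from the paper.

```python
import numpy as np

def responsibilities(neg_errors, temperature=1.0):
    """Softmax over negative prediction errors: lower error, higher responsibility."""
    z = np.exp(np.asarray(neg_errors, dtype=float) / temperature)
    return z / z.sum()

def entropy(p):
    """Shannon entropy of a responsibility distribution (natural log)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

sharp = responsibilities([5.0, 0.1, 0.2])  # one model pair clearly dominates
flat = responsibilities([1.0, 1.0, 1.0])   # no pair dominates

# Low entropy (sharp) would correspond to decisive monitoring;
# high entropy (flat) to an unresolved, ambiguous state.
```

Under this reading, `entropy(sharp)` is smaller than `entropy(flat)`, matching the intuition that a confident assignment of responsibility is low-entropy.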
arXiv Detail & Related papers (2021-09-27T05:00:56Z) - Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z) - AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions.
Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning.
We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z) - Machine Common Sense [77.34726150561087]
Machine common sense remains a broad, potentially unbounded problem in artificial intelligence (AI).
This article addresses aspects of modeling commonsense reasoning, focusing on domains such as interpersonal interactions.
arXiv Detail & Related papers (2020-06-15T13:59:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.