MapFormer: Self-Supervised Learning of Cognitive Maps with Input-Dependent Positional Embeddings
- URL: http://arxiv.org/abs/2511.19279v1
- Date: Mon, 24 Nov 2025 16:29:02 GMT
- Title: MapFormer: Self-Supervised Learning of Cognitive Maps with Input-Dependent Positional Embeddings
- Authors: Victor Rambaud, Salvador Mascarenhas, Yair Lakretz
- Abstract summary: A cognitive map is an internal model which encodes the abstract relationships among entities in the world. We introduce MapFormers, new architectures based on Transformer models, which can learn cognitive maps from observational data. MapFormers have broad applications in both neuroscience and AI, by explaining the neural mechanisms giving rise to cognitive maps.
- Score: 5.647131476818603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A cognitive map is an internal model which encodes the abstract relationships among entities in the world, giving humans and animals the flexibility to adapt to new situations, with a capacity for out-of-distribution (OOD) generalization that current AI systems still lack. To bridge this gap, we introduce MapFormers, new architectures based on Transformer models, which can learn cognitive maps from observational data and perform path integration in parallel, in a self-supervised manner. Cognitive maps are learned in the model by disentangling structural relationships in the inputs from their specific content, a property that arises naturally when the positional encoding in Transformers is updated with input-dependent matrices. We developed two variants of MapFormers that unify absolute and relative positional encoding to model episodic memory (EM) and working memory (WM), respectively. We tested MapFormers on several tasks, including a classic 2D navigation task, showing that our models can learn a cognitive map of the underlying space and generalize OOD (e.g., to longer sequences) with near-perfect performance, unlike current architectures. Together, these results demonstrate the superiority of models designed to learn a cognitive map, and the importance of a structural bias for structure-content disentanglement, which can be achieved in Transformers with input-dependent positional encoding. MapFormers have broad applications in both neuroscience and AI: they explain the neural mechanisms that give rise to cognitive maps, while allowing these relational models to be learned at scale.
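The paper's core mechanism, input-dependent positional encoding, can be pictured with a short sketch. Below is a minimal, hypothetical PyTorch illustration (the module name, shapes, and recurrent update rule are assumptions for illustration, not the authors' released code): instead of assigning each token a fixed position, a positional state is rolled forward by an update matrix computed from the input, so attention can read out learned structure (path integration) separately from content.

```python
import torch
import torch.nn as nn

class InputDependentPE(nn.Module):
    """Sketch: positional states evolve via input-dependent update matrices."""

    def __init__(self, d_model: int, d_pos: int):
        super().__init__()
        # Map each input token to a d_pos x d_pos positional update matrix.
        self.to_update = nn.Linear(d_model, d_pos * d_pos)
        self.d_pos = d_pos

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        batch, seq_len, _ = x.shape
        p = torch.zeros(batch, self.d_pos, device=x.device)
        p[:, 0] = 1.0  # fixed initial positional state
        states = []
        for t in range(seq_len):
            # The update depends on the input (e.g., an action or relation),
            # not on absolute time: one step of path integration.
            M = self.to_update(x[:, t]).view(batch, self.d_pos, self.d_pos)
            p = torch.bmm(M, p.unsqueeze(-1)).squeeze(-1)
            states.append(p)
        return torch.stack(states, dim=1)  # (batch, seq_len, d_pos)
```

In an absolute (EM-style) variant these states could be added to the token representations; in a relative (WM-style) variant, attention logits could be computed from pairwise products of the states. Either way, position carries learned structure rather than fixed serial order, which is what would enable generalization to longer sequences.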
Related papers
- Cognitive Maps in Language Models: A Mechanistic Analysis of Spatial Planning [2.1115884707107715]
We train GPT-2 models on three spatial learning paradigms in grid environments. Using behavioural, representational and mechanistic analyses, we uncover two fundamentally different learned algorithms.
arXiv Detail & Related papers (2025-11-17T13:46:19Z) - A Biologically Interpretable Cognitive Architecture for Online Structuring of Episodic Memories into Cognitive Maps [0.0]
We propose a novel cognitive architecture for structuring episodic memories into cognitive maps. Our model integrates the Successor Features framework with episodic memories, enabling incremental, online learning. This work bridges computational neuroscience and AI, offering a biologically grounded approach to cognitive map formation in artificial adaptive agents.
arXiv Detail & Related papers (2025-09-29T04:07:38Z) - Localizing Knowledge in Diffusion Transformers [44.27817967554535]
We propose a model- and knowledge-agnostic method to localize where specific types of knowledge are encoded within the Diffusion Transformer blocks. We show that the identified blocks are both interpretable and causally linked to the expression of knowledge in generated outputs. Our findings offer new insights into the internal structure of DiTs and introduce a practical pathway for more interpretable, efficient, and controllable model editing.
arXiv Detail & Related papers (2025-05-24T19:02:20Z) - Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures [49.19753720526998]
We derive theoretical scaling laws for neural network performance on synthetic datasets. We validate that convolutional networks, whose structure aligns with that of the generative process through locality and weight sharing, enjoy a faster scaling of performance. This finding clarifies the architectural biases underlying neural scaling laws and highlights how representation learning is shaped by the interaction between model architecture and the statistical properties of data.
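As background for this entry, a "scaling law" is a power-law fit of test loss against a resource such as dataset size, L(N) ≈ a·N^(-α); architectures whose biases match the data show a larger exponent α. A minimal sketch of estimating α from measured losses (the numbers below are synthetic and purely illustrative):

```python
import numpy as np

# Synthetic (dataset size, test loss) pairs, purely illustrative.
N = np.array([1e3, 1e4, 1e5, 1e6])
loss = np.array([0.50, 0.20, 0.08, 0.032])

# Fit log(loss) = log(a) - alpha * log(N) by least squares.
slope, log_a = np.polyfit(np.log(N), np.log(loss), 1)
alpha = -slope
print(f"estimated scaling exponent alpha = {alpha:.2f}")
```

A faster-scaling architecture (e.g., one whose locality matches the generative process) would show a steeper slope on the same data budget.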
arXiv Detail & Related papers (2025-05-11T17:44:14Z) - MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities [72.05167902805405]
We present MergeNet, which learns to bridge the gap between the parameter spaces of heterogeneous models. The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters. MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage.
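The "parameter adapter" described here can be pictured as a module that queries a low-rank factorization of the source model's weights and mixes those factors into the target weights. The following is a hypothetical sketch under that reading; the names, shapes, and update rule are assumptions for illustration, not MergeNet's actual design:

```python
import torch
import torch.nn as nn

class LowRankParamAdapter(nn.Module):
    """Hypothetical sketch of querying low-rank source parameters."""

    def __init__(self, rank: int, d_in: int):
        super().__init__()
        self.rank = rank
        self.query = nn.Linear(d_in, rank)  # target rows -> mixing weights

    def forward(self, W_source: torch.Tensor, W_target: torch.Tensor) -> torch.Tensor:
        # Low-rank view of the source parameters: W_source ~= U S Vh.
        U, S, Vh = torch.linalg.svd(W_source, full_matrices=False)
        V_r = Vh[: self.rank] * S[: self.rank, None]       # (rank, d_in)
        # Each target row produces mixing weights over the source factors.
        mix = torch.softmax(self.query(W_target), dim=-1)  # (out_t, rank)
        return W_target + mix @ V_r                        # migrated update

# Example: migrate from a 128x64 source layer into a 32x64 target layer.
adapter = LowRankParamAdapter(rank=8, d_in=64)
W_new = adapter(torch.randn(128, 64), torch.randn(32, 64))  # -> (32, 64)
```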
arXiv Detail & Related papers (2024-04-20T08:34:39Z) - Multi-Modal Cognitive Maps based on Neural Networks trained on Successor Representations [3.4916237834391874]
Cognitive maps are a proposed concept on how the brain efficiently organizes memories and retrieves context out of them.
We set up a multi-modal neural network using successor representations, which is able to model place-cell dynamics and cognitive map representations.
The network learns the similarities between novel inputs and the training database, and thereby successfully learns a representation of the cognitive map.
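The successor representation (SR) behind this entry has a simple closed form worth recalling: for a policy with state-transition matrix T and discount γ, M = (I - γT)^(-1), where M[s, s'] is the expected discounted number of future visits to s' starting from s. A minimal sketch on a small ring of states (the ring environment is chosen purely for illustration):

```python
import numpy as np

def successor_representation(T: np.ndarray, gamma: float = 0.95) -> np.ndarray:
    """Closed-form SR: M = (I - gamma * T)^-1 for transition matrix T."""
    return np.linalg.inv(np.eye(T.shape[0]) - gamma * T)

# Random walk on a ring of 5 states (step left or right with prob 0.5).
n = 5
T = np.zeros((n, n))
for s in range(n):
    T[s, (s - 1) % n] = 0.5
    T[s, (s + 1) % n] = 0.5

M = successor_representation(T)
print(np.round(M[0], 2))  # decays with distance: a place-cell-like field
```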
arXiv Detail & Related papers (2023-12-22T12:44:15Z) - Conceptual Cognitive Maps Formation with Neural Successor Networks and Word Embeddings [7.909848251752742]
We introduce a model that employs successor representations and neural networks, along with word embeddings, to construct a cognitive map of three separate concepts.
The network adeptly learns two different scaled maps and situates new information in proximity to related pre-existing representations.
We suggest that our model could potentially improve current AI models by providing multi-modal context information to any input.
arXiv Detail & Related papers (2023-07-04T09:11:01Z) - Neurosymbolic hybrid approach to driver collision warning [64.02492460600905]
There are two main algorithmic approaches to autonomous driving systems.
Deep learning alone has achieved state-of-the-art results in many areas.
But deep learning models can be very difficult to debug when they fail.
arXiv Detail & Related papers (2022-03-28T20:29:50Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Towards a Predictive Processing Implementation of the Common Model of Cognition [79.63867412771461]
We describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory.
The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales.
arXiv Detail & Related papers (2021-05-15T22:55:23Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structure, building models capable of simultaneously exploiting both modular and temporal structure.
We find our models to be robust to the number of available views and better able to generalize to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.