Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search
- URL: http://arxiv.org/abs/2506.17424v1
- Date: Fri, 20 Jun 2025 18:43:15 GMT
- Title: Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search
- Authors: Nikolaus Salvatore, Qiong Zhang
- Abstract summary: We show that foundational architectures in neural machine translation exhibit mechanisms that directly correspond to those specified in the Context Maintenance and Retrieval model of human memory. We implement a neural machine translation model as a cognitive model of human memory search that is both interpretable and capable of capturing complex dynamics of learning.
- Score: 13.961239165301315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Past work has long recognized the important role of context in guiding how humans search their memory. While context-based memory models can explain many memory phenomena, it remains unclear why humans develop such architectures over possible alternatives in the first place. In this work, we demonstrate that foundational architectures in neural machine translation -- specifically, recurrent neural network (RNN)-based sequence-to-sequence models with attention -- exhibit mechanisms that directly correspond to those specified in the Context Maintenance and Retrieval (CMR) model of human memory. Since neural machine translation models have evolved to optimize task performance, their convergence with human memory models provides a deeper understanding of the functional role of context in human memory, as well as presenting new ways to model human memory. Leveraging this convergence, we implement a neural machine translation model as a cognitive model of human memory search that is both interpretable and capable of capturing complex dynamics of learning. We show that our model accounts for both averaged and optimal human behavioral patterns as effectively as context-based memory models. Further, we demonstrate additional strengths of the proposed model by evaluating how memory search performance emerges from the interaction of different model components.
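To make the claimed correspondence concrete, the sketch below (not the authors' implementation; the function names, dimensions, and simplified context-update constants are illustrative assumptions) shows how dot-product attention over encoder states mirrors context-cued retrieval in CMR: the decoder state plays the role of CMR's context vector, the attention distribution plays the role of context-cued retrieval strengths, and the retrieved content feeds back to drift the context.

```python
import numpy as np

# Illustrative sketch only: maps attention onto CMR-style retrieval.
# Variable names and constants are assumptions, not the paper's code.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, n_items = 8, 5

# Encoder hidden states act like stored item-context bindings.
encoder_states = rng.normal(size=(n_items, d))

# The decoder state acts like CMR's current context vector c_t.
context = rng.normal(size=d)
context /= np.linalg.norm(context)

# Attention scores = similarity of context to each stored state;
# the softmax plays the role of CMR's normalized recall competition.
weights = softmax(encoder_states @ context)

# The attended readout is the "retrieved item"; in CMR it would then
# drift the context: c_{t+1} = rho * c_t + beta * c_in.
retrieved = weights @ encoder_states
c_in = retrieved / np.linalg.norm(retrieved)
beta = 0.5
rho = np.sqrt(1 - beta**2)  # exact unit-norm update only if c_t and c_in are orthogonal
context_next = rho * context + beta * c_in

print(np.round(weights, 3), np.round(np.linalg.norm(context_next), 3))
```

Under this mapping, the paper's argument is that optimizing translation performance pushes attention toward the same context-maintenance-and-retrieval computations that CMR posits for human memory search.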
Related papers
- A Neural Network Model of Complementary Learning Systems: Pattern Separation and Completion for Continual Learning [2.9123921488295768]
Learning new information without forgetting prior knowledge is central to human intelligence. In contrast, neural network models suffer from catastrophic forgetting when acquiring new information. We develop a neurally plausible continual learning model that achieves close to state-of-the-art accuracy (90%). Our work provides a functional template for modeling memory consolidation, generalization, and continual learning in both biological and artificial systems.
arXiv Detail & Related papers (2025-07-15T15:05:26Z)
- If Attention Serves as a Cognitive Model of Human Memory Retrieval, What is the Plausible Memory Representation? [3.757103053174534]
We investigate whether the attention mechanism of Transformer Grammar (TG) can serve as a cognitive model of human memory retrieval. Our experiments demonstrate that TG's attention achieves superior predictive power for self-paced reading times compared to a vanilla Transformer's.
arXiv Detail & Related papers (2025-02-17T05:58:25Z)
- Evolvable Psychology Informed Neural Network for Memory Behavior Modeling [2.5258264040936305]
This paper proposes a theory-informed neural network for memory behavior modeling, named PsyINN.
It constructs a framework that combines a neural network with differentiating sparse regression, achieving joint optimization.
On four large-scale real-world memory behavior datasets, the proposed method surpasses the state-of-the-art methods in prediction accuracy.
arXiv Detail & Related papers (2024-08-23T01:35:32Z)
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics (a toy numerical illustration follows this entry).
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
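As a rough numerical illustration of the difference-in-differences idea (invented numbers, not the paper's estimator): memorisation of an instance is the extra loss reduction among models that trained on it, over and above the reduction seen in comparable models that did not.

```python
import numpy as np

# Toy difference-in-differences estimate of memorisation (illustrative only).
# "Treated" models saw instance x during the training interval; "control" models did not.
loss_before_treated = np.array([3.1, 3.0, 3.2])
loss_after_treated  = np.array([1.2, 1.1, 1.3])   # big drop: x was trained on
loss_before_control = np.array([3.0, 3.1, 3.1])
loss_after_control  = np.array([2.6, 2.7, 2.5])   # smaller drop: general learning only

treated_change = (loss_after_treated - loss_before_treated).mean()   # ~ -1.90
control_change = (loss_after_control - loss_before_control).mean()   # ~ -0.47

# Extra loss reduction attributable to training on x itself.
memorisation = control_change - treated_change
print(round(memorisation, 2))  # ~ 1.43
```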
- Linking In-context Learning in Transformers to Human Episodic Memory [1.124958340749622]
We focus on induction heads, which contribute to in-context learning in Transformer-based large language models.
We demonstrate that induction heads are behaviorally, functionally, and mechanistically similar to the contextual maintenance and retrieval model of human episodic memory (a minimal sketch of the induction-head pattern follows this entry).
arXiv Detail & Related papers (2024-05-23T18:51:47Z)
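A minimal sketch of the induction-head pattern referenced above (illustrative, not the paper's code): the head attends from the current token to whatever followed earlier occurrences of that same token, which is what gives it the flavor of context-cued recall.

```python
# Hard-coded induction-head attention pattern (illustrative sketch).
tokens = ["A", "B", "C", "A"]  # the final "A" should attend to "B", the token after the earlier "A"

def induction_attention(tokens, t):
    """Return indices j that position t attends to: positions where tokens[j-1] == tokens[t]."""
    return [j for j in range(1, t) if tokens[j - 1] == tokens[t]]

print(induction_attention(tokens, 3))  # [1] -> the position of "B"
```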
- Demolition and Reinforcement of Memories in Spin-Glass-like Neural Networks [0.0]
The aim of this thesis is to understand the effectiveness of Unlearning in both associative memory models and generative models.
The selection of structured data enables an associative memory model to retrieve concepts as attractors of a neural dynamics with considerable basins of attraction.
A novel regularization technique for Boltzmann Machines is presented, proving to outperform previously developed methods in learning hidden probability distributions from data-sets.
arXiv Detail & Related papers (2024-03-04T23:12:42Z)
- On the Relationship Between Variational Inference and Auto-Associative Memory [68.8204255655161]
We study how different neural network approaches to variational inference can be applied in this framework.
We evaluate the obtained algorithms on the CIFAR10 and CLEVR image datasets and compare them with other associative memory models.
arXiv Detail & Related papers (2022-10-14T14:18:47Z)
- CogNGen: Constructing the Kernel of a Hyperdimensional Predictive Processing Cognitive Architecture [79.07468367923619]
We present a new cognitive architecture that combines two neurobiologically plausible, computational models.
We aim to develop a cognitive architecture that has the power of modern machine learning techniques.
arXiv Detail & Related papers (2022-03-31T04:44:28Z)
- Universal Hopfield Networks: A General Framework for Single-Shot Associative Memory Models [41.58529335439799]
We propose a general framework for understanding the operation of memory networks as a sequence of three operations.
We derive all these memory models as instances of our general framework with differing similarity and separation functions (a sketch of the decomposition follows this entry).
arXiv Detail & Related papers (2022-02-09T16:48:06Z)
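A sketch of that three-operation decomposition, under the assumption that similarity is a dot product, separation is a sharp softmax, and projection maps back through the stored memories (this particular choice recovers a modern continuous Hopfield network; it is one instance of the framework, not the authors' code):

```python
import numpy as np

def softmax(x, beta=8.0):
    e = np.exp(beta * (x - x.max()))
    return e / e.sum()

# Stored memories, one per row.
memories = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

def retrieve(query):
    sim = memories @ query   # 1) similarity of the query to each memory
    sep = softmax(sim)       # 2) separation sharpens the best match
    return sep @ memories    # 3) projection back into memory space

corrupted = np.array([0.9, 0.1, 0.0])    # noisy version of memory 0
print(np.round(retrieve(corrupted), 3))  # ~ [1, 0, 0]
```

Swapping in other similarity and separation functions (e.g., a max in place of the softmax) yields the other memory models the paper unifies.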
- Hierarchical Variational Memory for Few-shot Learning Across Domains [120.87679627651153]
We introduce a hierarchical prototype model, where each level of the prototype fetches corresponding information from the hierarchical memory.
The model is endowed with the ability to flexibly rely on features at different semantic levels if the domain shift circumstances so demand.
We conduct thorough ablation studies to demonstrate the effectiveness of each component in our model.
arXiv Detail & Related papers (2021-12-15T15:01:29Z)
- Towards a Predictive Processing Implementation of the Common Model of Cognition [79.63867412771461]
We describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory.
The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales.
arXiv Detail & Related papers (2021-05-15T22:55:23Z)
- The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality (a generic sketch follows this list).
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
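As a generic sketch of that predict-and-correct scheme (a textbook-style predictive-coding loop; shapes, learning rates, and variable names are invented, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)             # observed activity of "neighboring" neurons
W = rng.normal(size=(4, 3)) * 0.1  # generative weights
z = np.zeros(3)                    # inferred latent cause

for _ in range(50):
    pred = W @ z                   # predict the neighbors' activity
    err = x - pred                 # prediction error drives all updates
    z += 0.1 * (W.T @ err)         # infer: adjust the latent state to reduce error
    W += 0.01 * np.outer(err, z)   # learn: adjust weights by how well predictions matched

print(np.round(x - W @ z, 3))      # residual error shrinks toward zero
```

Both the latent state and the weights are driven purely by the local prediction error, which is the signature of predictive processing.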