Emergence of the Primacy Effect in Structured State-Space Models
- URL: http://arxiv.org/abs/2502.13729v5
- Date: Mon, 08 Sep 2025 14:59:54 GMT
- Title: Emergence of the Primacy Effect in Structured State-Space Models
- Authors: Takashi Morita
- Abstract summary: Structured state-space models (SSMs) have been developed to offer more persistent memory retention than traditional recurrent neural networks.
The memory mechanism of canonical SSMs is theoretically designed to decay monotonically over time.
The present study reveals a counterintuitive finding: when trained and evaluated on a synthetic, statistically balanced memorization task, SSMs predominantly preserve the *initially* presented data in memory.
- Score: 0.35534933448684125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structured state-space models (SSMs) have been developed to offer more persistent memory retention than traditional recurrent neural networks, while maintaining real-time inference capabilities and addressing the time-complexity limitations of Transformers. Despite this intended persistence, the memory mechanism of canonical SSMs is theoretically designed to decay monotonically over time, meaning that more recent inputs are expected to be retained more accurately than earlier ones. Contrary to this theoretical expectation, however, the present study reveals a counterintuitive finding: when trained and evaluated on a synthetic, statistically balanced memorization task, SSMs predominantly preserve the *initially* presented data in memory. This pattern of memory bias, known as the *primacy effect* in psychology, presents a non-trivial challenge to the current theoretical understanding of SSMs and opens new avenues for future research.
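As an illustration of the theoretical expectation that the abstract contrasts with its finding, the sketch below simulates the per-channel recurrence x_k = a * x_{k-1} + b * u_k of a toy diagonal SSM; with |a| < 1, the contribution of each input to the final state shrinks with its distance from the end of the sequence. The decay rates, input scaling, and sequence length are illustrative assumptions, not the paper's experimental configuration.

```python
import numpy as np

# Toy diagonal linear SSM: x_k = a * x_{k-1} + b * u_k, per channel.
# With |a| < 1, the contribution of input u_j to the final state x_T
# is a**(T - j) * b * u_j, so earlier inputs are attenuated more --
# the monotonic-decay expectation the paper contrasts with its finding.

T = 20                                  # sequence length (arbitrary)
a = np.array([0.80, 0.90, 0.99])        # per-channel decay rates, |a| < 1
b = 1.0 - a                             # simple input scaling (assumption)

# Contribution magnitude of each position j (1 = first input) to x_T:
positions = np.arange(1, T + 1)
contrib = a[None, :] ** (T - positions)[:, None] * b[None, :]

for j in (1, T // 2, T):
    print(f"position {j:2d}: contribution per channel = "
          f"{np.round(contrib[j - 1], 4)}")
# The printed contributions grow monotonically with recency, i.e., a
# recency bias -- the opposite of the primacy effect the paper observes.
```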
Related papers
- Memory Determines Learning Direction: A Theory of Gradient-Based Optimization in State Space Models [2.6599014990168834]
State space models (SSMs) have gained attention by showing potential to outperform Transformers.
In this study, we provide an explanation for this capability and propose an improved training strategy.
arXiv Detail & Related papers (2025-10-01T06:30:42Z)
- Learning to Dissipate Energy in Oscillatory State-Space Models [51.98491034847041]
State-space models (SSMs) are a class of networks for sequence learning.
We show that D-LinOSS consistently outperforms previous LinOSS methods on long-range learning tasks.
arXiv Detail & Related papers (2025-05-17T23:15:17Z)
- Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing [56.66469232740998]
We show that Structured State Space Models (SSMs) are inherently limited by strong recency bias.
This bias impairs the models' ability to recall distant information and introduces robustness issues.
We propose to polarize two channels of the state transition matrices in SSMs, setting them to zero and one, respectively, simultaneously addressing recency bias and over-smoothing.
arXiv Detail & Related papers (2024-12-31T22:06:39Z)
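The polarization technique summarized in the entry above can be illustrated with a toy diagonal recurrence: one state channel is fixed at 0 (it retains nothing beyond the current input) and one at 1 (it accumulates without decay). The channel count, input sequence, and remaining decay rates below are arbitrary assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 8                                   # state channels (arbitrary)
T = 50                                  # sequence length (arbitrary)

# Diagonal transition values: channel 0 polarized to 0, channel 1 to 1,
# remaining channels get ordinary decay rates in (0, 1).
a = np.concatenate(([0.0, 1.0], rng.uniform(0.5, 0.99, N - 2)))

u = rng.normal(size=T)                  # scalar input sequence
x = np.zeros(N)
for t in range(T):
    x = a * x + u[t]                    # x_t = A x_{t-1} + B u_t with B = 1

# Channel 0 holds only the latest input; channel 1 holds the running sum,
# so information from the earliest inputs is never attenuated.
print("zero-channel:", x[0], "(equals last input:", u[-1], ")")
print("one-channel :", x[1], "(equals sum of inputs:", u.sum(), ")")
```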
- Deep reinforcement learning with time-scale invariant memory [1.338174941551702]
We integrate a computational neuroscience model of scale invariant memory into deep reinforcement learning (RL) agents.
We show that such agents can learn robustly across a wide range of temporal scales.
This result illustrates that incorporating computational principles from neuroscience and cognitive science into deep neural networks can enhance adaptability to complex temporal dynamics.
arXiv Detail & Related papers (2024-12-19T07:20:03Z)
- Artificial Kuramoto Oscillatory Neurons [65.16453738828672]
It has long been known in both neuroscience and AI that "binding" between neurons leads to a form of competitive learning.
We introduce Artificial Kuramoto Oscillatory Neurons (AKOrN), which can be combined with arbitrary connectivity designs such as fully connected, convolutional, or attentive mechanisms.
We show that this idea provides performance improvements across a wide spectrum of tasks such as unsupervised object discovery, adversarial robustness, uncertainty quantification, and reasoning.
arXiv Detail & Related papers (2024-10-17T17:47:54Z)
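For context on the entry above: AKOrN builds on Kuramoto phase-oscillator dynamics. Below is a minimal simulation of the classical Kuramoto model, in which sufficiently coupled oscillators phase-lock; the network size, coupling strength, and step size are arbitrary assumptions, and this is the textbook system rather than the paper's neuron design.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 16                                   # number of oscillators (arbitrary)
K = 2.0                                  # coupling strength (arbitrary)
dt = 0.01                                # integration step (arbitrary)
omega = rng.normal(0.0, 1.0, n)          # natural frequencies
theta = rng.uniform(0.0, 2 * np.pi, n)   # initial phases

def order_parameter(theta):
    """|mean of e^{i*theta}|: 1 = fully synchronized, ~0 = incoherent."""
    return np.abs(np.exp(1j * theta).mean())

for step in range(5000):
    # Classical Kuramoto update: each phase is pulled toward the others,
    # d(theta_i)/dt = omega_i + (K/n) * sum_j sin(theta_j - theta_i).
    coupling = np.sin(theta[None, :] - theta[:, None]).mean(axis=1)
    theta = theta + dt * (omega + K * coupling)

print("synchronization after coupling:", round(order_parameter(theta), 3))
# With large enough K the oscillators phase-lock -- the "binding"
# intuition that AKOrN carries into neural network units.
```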
- Mathematical Formalism for Memory Compression in Selective State Space Models [0.0]
State space models (SSMs) have emerged as a powerful framework for modelling long-range dependencies in sequence data.
We develop a rigorous mathematical framework for understanding memory compression in selective state space models.
We show that selective SSMs offer significant improvements in memory efficiency and processing speed compared to traditional RNN-based models.
arXiv Detail & Related papers (2024-10-04T05:45:48Z)
- Neural Dynamics Model of Visual Decision-Making: Learning from Human Experts [28.340344705437758]
We implement a comprehensive visual decision-making model that spans from visual input to behavioral output.
Our model aligns closely with human behavior and reflects neural activities in primates.
A neuroimaging-informed fine-tuning approach was introduced and applied to the model, leading to performance improvements.
arXiv Detail & Related papers (2024-09-04T02:38:52Z)
- Brain-inspired Computational Modeling of Action Recognition with Recurrent Spiking Neural Networks Equipped with Reinforcement Delay Learning [4.9798155883849935]
Action recognition has received significant attention due to its intricate nature and the brain's exceptional performance in this area.
Current solutions for action recognition either exhibit limitations in effectively addressing the problem or lack the necessary biological plausibility.
This article presents an effective brain-inspired computational model for action recognition.
arXiv Detail & Related papers (2024-06-17T17:34:16Z)
- The Impact of Geometric Complexity on Neural Collapse in Transfer Learning [6.554326244334867]
Flatness of the loss surface and neural collapse have recently emerged as useful pre-training metrics.
We show through experiments and theory that mechanisms which affect the geometric complexity of the pre-trained network also influence the neural collapse.
arXiv Detail & Related papers (2024-05-24T16:52:09Z)
- Neuro-mimetic Task-free Unsupervised Online Learning with Continual Self-Organizing Maps [56.827895559823126]
The self-organizing map (SOM) is a neural model often used in clustering and dimensionality reduction.
We propose a generalization of the SOM, the continual SOM, which is capable of online unsupervised learning under a low memory budget.
Our results, on benchmarks including MNIST, Kuzushiji-MNIST, and Fashion-MNIST, show almost a twofold increase in accuracy.
arXiv Detail & Related papers (2024-02-19T19:11:22Z)
- A Neuro-mimetic Realization of the Common Model of Cognition via Hebbian Learning and Free Energy Minimization [55.11642177631929]
Large neural generative models are capable of synthesizing semantically rich passages of text or producing complex images.
We discuss the COGnitive Neural GENerative system, an architecture that casts the Common Model of Cognition in terms of Hebbian learning and free energy minimization.
arXiv Detail & Related papers (2023-10-14T23:28:48Z)
- Long Short-term Memory with Two-Compartment Spiking Neuron [64.02161577259426]
We propose a novel biologically inspired Long Short-Term Memory Leaky Integrate-and-Fire spiking neuron model, dubbed LSTM-LIF.
Our experimental results, on a diverse range of temporal classification tasks, demonstrate superior temporal classification capability, rapid training convergence, strong network generalizability, and high energy efficiency of the proposed LSTM-LIF model.
This work, therefore, opens up a myriad of opportunities for resolving challenging temporal processing tasks on emerging neuromorphic computing machines.
arXiv Detail & Related papers (2023-07-14T08:51:03Z)
- Sequential Memory with Temporal Predictive Coding [6.228559238589584]
We propose a PC-based model for *sequential memory*, called *temporal predictive coding* (tPC).
We show that our tPC models can memorize and retrieve sequential inputs accurately with a biologically plausible neural implementation.
arXiv Detail & Related papers (2023-05-19T20:03:31Z)
- Contrastive-Signal-Dependent Plasticity: Self-Supervised Learning in Spiking Neural Circuits [61.94533459151743]
This work addresses the challenge of designing neurobiologically-motivated schemes for adjusting the synapses of spiking networks.
Our experimental simulations demonstrate a consistent advantage over other biologically-plausible approaches when training recurrent spiking networks.
arXiv Detail & Related papers (2023-03-30T02:40:28Z)
- Plasticity Neural Network Based on Astrocytic Influence at Critical Periods, Synaptic Competition and Compensation by Current and Mnemonic Brain Plasticity and Synapse Formation [7.8787868286474]
Based on the RNN framework, we carried out model construction, formula derivation, and algorithm testing for the PNN.
The question we pose is whether advances in neuroscience and brain cognition are achieved through model construction, formula derivation, or algorithm testing.
arXiv Detail & Related papers (2022-03-19T14:38:54Z)
- Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- Mapping and Validating a Point Neuron Model on Intel's Neuromorphic Hardware Loihi [77.34726150561087]
We investigate the potential of Intel's fifth generation neuromorphic chip, Loihi.
Loihi is based on the novel idea of Spiking Neural Networks (SNNs) emulating the neurons in the brain.
We find that Loihi replicates classical simulations very efficiently and scales notably well in terms of both time and energy performance as the networks get larger.
arXiv Detail & Related papers (2021-09-22T16:52:51Z)
- Towards a Predictive Processing Implementation of the Common Model of Cognition [79.63867412771461]
We describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory.
The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales.
arXiv Detail & Related papers (2021-05-15T22:55:23Z)
- Object-based attention for spatio-temporal reasoning: Outperforming neuro-symbolic models with flexible distributed architectures [15.946511512356878]
We show that a fully-learned neural network with the right inductive biases can perform substantially better than all previous neural-symbolic models.
Our model makes critical use of both self-attention and learned "soft" object-centric representations.
arXiv Detail & Related papers (2020-12-15T18:57:40Z)
- The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
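The local learning principle in the entry above (neurons predict neighboring activity and adjust their parameters from the prediction error) can be sketched as a delta-rule-style update. The layer sizes, learning rate, and linear predictor below are illustrative assumptions, not the framework's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Minimal predictive-coding-style update between two layers: the higher
# layer predicts the lower layer's activity through weights W, and W is
# adjusted locally in proportion to the prediction error (no backprop).
d_low, d_high = 6, 4                             # layer sizes (arbitrary)
lr = 0.05                                        # learning rate (arbitrary)
W_true = rng.normal(0.0, 1.0, (d_low, d_high))   # generative ground truth
W = np.zeros((d_low, d_high))                    # learned top-down prediction

for step in range(2000):
    z = rng.normal(size=d_high)      # higher-layer activity
    x = W_true @ z                   # observed lower-layer activity
    error = x - W @ z                # local prediction error
    W += lr * np.outer(error, z)     # Hebbian-style error-driven update

print("remaining prediction error:", round(np.abs(W_true - W).max(), 4))
```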
- Fooling the primate brain with minimal, targeted image manipulation [67.78919304747498]
We propose an array of methods for creating minimal, targeted image perturbations that lead to changes in both neuronal activity and perception as reflected in behavior.
Our work shares the same goal as adversarial attacks, namely the manipulation of images with minimal, targeted noise that leads ANN models to misclassify the images.
arXiv Detail & Related papers (2020-11-11T08:30:54Z)
- Supporting Optimal Phase Space Reconstructions Using Neural Network Architecture for Time Series Modeling [68.8204255655161]
We propose an artificial neural network with a mechanism to implicitly learn the phase space's properties.
Our approach is as competitive as, or better than, most state-of-the-art strategies.
arXiv Detail & Related papers (2020-06-19T21:04:47Z)