Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
- URL: http://arxiv.org/abs/2402.10163v3
- Date: Mon, 8 Apr 2024 02:54:35 GMT
- Title: Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
- Authors: Arjun Karuvally, Terrence J. Sejnowski, Hava T. Siegelmann
- Abstract summary: We leverage the concept of traveling wave dynamics within a neural lattice to formulate a theoretical model of neural working memory.
We rigorously examine the model's capabilities in representing and learning state histories.
Our findings suggest the broader relevance of traveling waves in AI and their potential in advancing neural network architectures.
- Score: 3.686808512438363
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Traveling waves are a fundamental phenomenon in the brain, playing a crucial role in short-term information storage. In this study, we leverage the concept of traveling wave dynamics within a neural lattice to formulate a theoretical model of neural working memory, study its properties, and examine its real-world implications in AI. The proposed model diverges from traditional approaches, which assume information storage in static, register-like locations updated by interference. Instead, the model stores data as waves that are updated by the wave's boundary conditions. We rigorously examine the model's capabilities in representing and learning state histories, which are vital for learning history-dependent dynamical systems. The findings reveal that the model reliably stores external information and enhances the learning process by addressing the diminishing-gradient problem. To understand the model's real-world applicability, we explore two cases: a linear boundary condition (LBC) and a non-linear, self-attention-driven boundary condition (SBC). The model with the linear boundary condition results in a shift matrix plus a low-rank matrix, the structure currently used in the H3 state-space RNN. Further, our experiments with the LBC reveal that this matrix is effectively learned by Recurrent Neural Networks (RNNs) through backpropagation when modeling history-dependent dynamical systems. Conversely, the SBC parallels the autoregressive loop of an attention-only transformer, with the context vector representing the wave substrate. Collectively, our findings suggest the broader relevance of traveling waves in AI and their potential in advancing neural network architectures.
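To make the LBC case concrete, below is a minimal sketch (not the authors' code) of a traveling-wave working memory on a 1-D lattice: a shift matrix S propagates the stored wave one site per step, and a rank-1 term implements a linear boundary condition that writes new information at the edge, giving exactly a shift-plus-low-rank transition of the kind the abstract associates with H3. All names here (make_shift_matrix, wave_step, u, v) are illustrative assumptions.

```python
import numpy as np

def make_shift_matrix(n: int) -> np.ndarray:
    """Lower shift matrix S: (S @ h)[i] = h[i-1], so the wave advances one site."""
    return np.eye(n, k=-1)

def wave_step(h: np.ndarray, x_t: float, S: np.ndarray,
              u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """One recurrence step: h_{t+1} = (S + u v^T) h_t + u * x_t.

    S shifts the wave along the lattice; the rank-1 term u v^T feeds the
    value reaching the far end back to the boundary site selected by u
    (the linear boundary condition); x_t is external input written there.
    """
    return (S + np.outer(u, v)) @ h + u * x_t

n = 8                          # size of the wave substrate (lattice)
S = make_shift_matrix(n)
u = np.zeros(n); u[0] = 1.0    # boundary site where new values enter
v = np.zeros(n); v[-1] = 1.0   # taps the value reaching the far end

h = np.zeros(n)
for x_t in [1.0, 2.0, 3.0]:
    h = wave_step(h, x_t, S, u, v)
print(h)  # inputs travel down the lattice: [3., 2., 1., 0., ...]
```

Under this reading, the SBC variant would replace the fixed rank-1 boundary write with a self-attention readout over the wave substrate, mirroring the autoregressive loop of an attention-only transformer as described in the abstract.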
Related papers
- Recurrent convolutional neural networks for non-adiabatic dynamics of quantum-classical systems [1.2972104025246092]
We present an RNN model based on convolutional neural networks for modeling the nonlinear non-adiabatic dynamics of hybrid quantum-classical systems.
Validation studies show that the trained PARC model can reproduce the space-time evolution of a one-dimensional semi-classical Holstein model.
arXiv Detail & Related papers (2024-12-09T16:23:25Z) - A scalable generative model for dynamical system reconstruction from neuroimaging data [5.777167013394619]
Data-driven inference of the generative dynamics underlying a set of observed time series is of growing interest in machine learning.
Recent breakthroughs in training techniques for state space models (SSMs) specifically geared toward dynamical systems reconstruction (DSR) make it possible to recover the underlying system.
We propose a novel algorithm that solves this problem and scales exceptionally well with model dimensionality and filter length.
arXiv Detail & Related papers (2024-11-05T09:45:57Z) - Demolition and Reinforcement of Memories in Spin-Glass-like Neural Networks [0.0]
The aim of this thesis is to understand the effectiveness of Unlearning in both associative memory models and generative models.
The selection of structured data enables an associative memory model to retrieve concepts as attractors of the neural dynamics with considerable basins of attraction.
A novel regularization technique for Boltzmann Machines is presented and shown to outperform previously developed methods in learning hidden probability distributions from datasets.
arXiv Detail & Related papers (2024-03-04T23:12:42Z) - Physics-Informed Deep Learning of Rate-and-State Fault Friction [0.0]
We develop a multi-network PINN for both the forward problem and for direct inversion of nonlinear fault friction parameters.
We present the computational PINN framework for strike-slip faults in 1D and 2D subject to rate-and-state friction.
We find that the network for the parameter inversion at the fault performs much better than the network for material displacements to which it is coupled.
arXiv Detail & Related papers (2023-12-14T23:53:25Z) - Neural Koopman prior for data assimilation [7.875955593012905]
We use a neural network architecture to embed dynamical systems in latent spaces.
We introduce methods that make it possible to train such a model for long-term continuous reconstruction.
The potential for self-supervised learning is also demonstrated, as we show the promising use of trained dynamical models as priors for variational data assimilation techniques.
arXiv Detail & Related papers (2023-09-11T09:04:36Z) - Neural Abstractions [72.42530499990028]
We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics.
We demonstrate that our approach performs comparably to the mature tool Flow* on existing benchmark nonlinear models.
arXiv Detail & Related papers (2023-01-27T12:38:09Z) - An advanced spatio-temporal convolutional recurrent neural network for storm surge predictions [73.4962254843935]
We study the capability of artificial neural network models to emulate storm surge based on the storm track/size/intensity history.
This study presents a neural network model that can predict storm surge, informed by a database of synthetic storm simulations.
arXiv Detail & Related papers (2022-04-18T23:42:18Z) - Cross-Frequency Coupling Increases Memory Capacity in Oscillatory Neural Networks [69.42260428921436]
Cross-frequency coupling (CFC) is associated with information integration across populations of neurons.
We construct a model of CFC which predicts a computational role for observed $\theta$-$\gamma$ oscillatory circuits in the hippocampus and cortex.
We show that the presence of CFC increases the memory capacity of a population of neurons connected by plastic synapses.
arXiv Detail & Related papers (2022-04-05T17:13:36Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z) - Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.