Sequence Learning Using Equilibrium Propagation
- URL: http://arxiv.org/abs/2209.09626v4
- Date: Tue, 22 Aug 2023 01:13:51 GMT
- Title: Sequence Learning Using Equilibrium Propagation
- Authors: Malyaban Bal and Abhronil Sengupta
- Abstract summary: Equilibrium Propagation (EP) is a powerful and more bio-plausible alternative to conventional learning frameworks such as backpropagation.
We leverage recent developments in modern Hopfield networks to further understand energy-based models and develop solutions for complex sequence classification tasks using EP.
- Score: 2.3361887733755897
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equilibrium Propagation (EP) is a powerful and more bio-plausible alternative
to conventional learning frameworks such as backpropagation. The effectiveness
of EP stems from the fact that it relies only on local computations and
requires solely one kind of computational unit during both of its training
phases, thereby enabling greater applicability in domains such as bio-inspired
neuromorphic computing. The dynamics of the model in EP are governed by an
energy function, and the internal states of the model consequently converge to a
steady state following the state transition rules that this function defines. However,
by definition, EP requires the input to the model (a convergent RNN) to be
static in both phases of training. Thus it is not possible to design a
model for sequence classification using EP with an LSTM- or GRU-like
architecture. In this paper, we leverage recent developments in modern Hopfield
networks to further understand energy-based models and develop solutions for
complex sequence classification tasks using EP while satisfying its convergence
criteria and maintaining its theoretical similarities with recurrent
backpropagation. We explore the possibility of integrating modern Hopfield
networks as an attention mechanism with convergent RNN models used in EP,
thereby extending its applicability for the first time to two different
sequence classification tasks in natural language processing, namely sentiment
analysis (IMDB dataset) and natural language inference (SNLI dataset).
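The two training phases mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a single scalar state `s`, a hypothetical toy energy E(s) = 0.5*s^2 - w*x*s, and a squared-error cost C(s) = 0.5*(s - y)^2. All names (`relax`, `ep_gradient`, `beta`, `eps`) are illustrative. The state first settles with a static input and no nudging (free phase), then settles again with the cost weakly added to the energy (nudged phase). The difference of dE/dw between the two steady states, divided by the nudging strength, approximates the gradient of the cost with respect to `w`.

```python
def relax(w, x, y, beta, steps=200, eps=0.5):
    """Settle the scalar state s by gradient descent on F = E + beta * C.

    Toy assumptions: E(s) = 0.5*s**2 - w*x*s and C(s) = 0.5*(s - y)**2.
    The input x stays fixed (static) for the whole relaxation.
    """
    s = 0.0
    for _ in range(steps):
        dF_ds = (s - w * x) + beta * (s - y)  # dE/ds + beta * dC/ds
        s -= eps * dF_ds
    return s


def ep_gradient(w, x, y, beta=1e-3):
    """Estimate dC/dw from the free and nudged steady states."""
    s_free = relax(w, x, y, beta=0.0)     # free phase: no nudging
    s_nudged = relax(w, x, y, beta=beta)  # nudged phase: weakly pulled toward y
    # For this toy energy, dE/dw = -x * s; contrast it across the two phases.
    return ((-x * s_nudged) - (-x * s_free)) / beta


w, x, y = 0.5, 2.0, 3.0
est = ep_gradient(w, x, y)
# Reference value: the free steady state is s* = w*x, so dC/dw = (w*x - y) * x.
exact = (w * x - y) * x
print(est, exact)
```

In this toy setting the estimate differs from the reference gradient by a factor of 1/(1 + beta), so the two agree as the nudging strength goes to zero. Both phases use the same update rule, and the weight update needs only quantities available at the state itself.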
Related papers
- Latent Space Energy-based Neural ODEs [73.01344439786524]
This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data.
We train the model using maximum likelihood estimation with Markov chain Monte Carlo.
Experiments on oscillating systems, videos and real-world state sequences (MuJoCo) illustrate that ODEs with the learnable energy-based prior outperform existing counterparts.
arXiv Detail & Related papers (2024-09-05T18:14:22Z)
- A domain decomposition-based autoregressive deep learning model for unsteady and nonlinear partial differential equations [2.7755345520127936]
We propose a domain-decomposition-based deep learning (DL) framework, named CoMLSim, for accurately modeling unsteady and nonlinear partial differential equations (PDEs).
The framework consists of two key components: (a) a convolutional neural network (CNN)-based autoencoder architecture and (b) an autoregressive model composed of fully connected layers.
arXiv Detail & Related papers (2024-08-26T17:50:47Z)
- Synthetic location trajectory generation using categorical diffusion models [50.809683239937584]
Diffusion models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs) which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z)
- Recurrent neural networks and transfer learning for elasto-plasticity in woven composites [0.0]
This article presents Recurrent Neural Network (RNN) models as a surrogate for computationally intensive meso-scale simulation of woven composites.
A mean-field model generates a comprehensive data set representing elasto-plastic behavior.
In simulations, arbitrary six-dimensional strain histories are used to predict stresses under random walking as the source task and cyclic loading conditions as the target task.
arXiv Detail & Related papers (2023-11-22T14:47:54Z)
- Point-Based Value Iteration for POMDPs with Neural Perception Mechanisms [31.51588071503617]
We introduce neuro-symbolic partially observable Markov decision processes (NS-POMDPs).
We propose a novel piecewise linear and convex representation (P-PWLC) in terms of polyhedra covering the state space and value vectors.
We show the practical applicability of our approach on two case studies that employ (trained) ReLU neural networks as perception functions.
arXiv Detail & Related papers (2023-06-30T13:26:08Z)
- ETLP: Event-based Three-factor Local Plasticity for online learning with neuromorphic hardware [105.54048699217668]
We show competitive accuracy with a clear advantage in computational complexity for Event-Based Three-factor Local Plasticity (ETLP).
We also show that when using local plasticity, threshold adaptation in spiking neurons and a recurrent topology are necessary to learn temporal patterns with a rich temporal structure.
arXiv Detail & Related papers (2023-01-19T19:45:42Z)
- Generalized Neural Closure Models with Interpretability [28.269731698116257]
We develop a novel and versatile methodology of unified neural partial delay differential equations.
We augment existing/low-fidelity dynamical models directly in their partial differential equation (PDE) forms with both Markovian and non-Markovian neural network (NN) closure parameterizations.
We demonstrate the new generalized neural closure models (gnCMs) framework using four sets of experiments based on advecting nonlinear waves, shocks, and ocean acidification models.
arXiv Detail & Related papers (2023-01-15T21:57:43Z)
- Distributed Bayesian Learning of Dynamic States [65.7870637855531]
The proposed algorithm is a distributed Bayesian filtering task for finite-state hidden Markov models.
It can be used for sequential state estimation, as well as for modeling opinion formation over social networks under dynamic environments.
arXiv Detail & Related papers (2022-12-05T19:40:17Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Continual Weight Updates and Convolutional Architectures for Equilibrium Propagation [69.87491240509485]
Equilibrium Propagation (EP) is a biologically inspired alternative algorithm to backpropagation (BP) for training neural networks.
We propose a discrete-time formulation of EP that simplifies the equations, speeds up training, and extends EP to CNNs.
Our CNN model achieves the best performance ever reported on MNIST with EP.
arXiv Detail & Related papers (2020-04-29T12:14:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.