Related papers: Sequence Learning Using Equilibrium Propagation

Sequence Learning Using Equilibrium Propagation

URL: http://arxiv.org/abs/2209.09626v4
Date: Tue, 22 Aug 2023 01:13:51 GMT
Title: Sequence Learning Using Equilibrium Propagation
Authors: Malyaban Bal and Abhronil Sengupta
Abstract summary: Equilibrium Propagation (EP) is a powerful and more bio-plausible alternative to conventional learning frameworks such as backpropagation. We leverage recent developments in modern hopfield networks to further understand energy based models and develop solutions for complex sequence classification tasks using EP.
Score: 2.3361887733755897
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Equilibrium Propagation (EP) is a powerful and more bio-plausible alternative to conventional learning frameworks such as backpropagation. The effectiveness of EP stems from the fact that it relies only on local computations and requires solely one kind of computational unit during both of its training phases, thereby enabling greater applicability in domains such as bio-inspired neuromorphic computing. The dynamics of the model in EP is governed by an energy function and the internal states of the model consequently converge to a steady state following the state transition rules defined by the same. However, by definition, EP requires the input to the model (a convergent RNN) to be static in both the phases of training. Thus it is not possible to design a model for sequence classification using EP with an LSTM or GRU like architecture. In this paper, we leverage recent developments in modern hopfield networks to further understand energy based models and develop solutions for complex sequence classification tasks using EP while satisfying its convergence criteria and maintaining its theoretical similarities with recurrent backpropagation. We explore the possibility of integrating modern hopfield networks as an attention mechanism with convergent RNN models used in EP, thereby extending its applicability for the first time on two different sequence classification tasks in natural language processing viz. sentiment analysis (IMDB dataset) and natural language inference (SNLI dataset).

Related papers

Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba.<n>This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
arXiv Detail & Related papers (2025-06-12T17:32:02Z)
Brain-like variational inference [0.0]
Inference in both brains and machines can be formalized by maximizing the evidence lower bound (ELBO) in machine learning, or minimizing variational free energy (F) in neuroscience (ELBO = -F)<n>Here, we show that online natural gradient descent on F, under Poisson assumptions, leads to a recurrent spiking neural network that performs variational inference via membrane potential dynamics.<n>The resulting model -- the iterative Poisson variational autoencoder (iP-VAE) -- replaces the encoder network with local updates derived from natural gradient descent on F.
arXiv Detail & Related papers (2024-10-25T06:00:18Z)
Latent Space Energy-based Neural ODEs [73.01344439786524]
This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data. We train the model using maximum likelihood estimation with Markov chain Monte Carlo. Experiments on oscillating systems, videos and real-world state sequences (MuJoCo) illustrate that ODEs with the learnable energy-based prior outperform existing counterparts.
arXiv Detail & Related papers (2024-09-05T18:14:22Z)
A domain decomposition-based autoregressive deep learning model for unsteady and nonlinear partial differential equations [2.7755345520127936]
We propose a domain-decomposition-based deep learning (DL) framework, named CoMLSim, for accurately modeling unsteady and nonlinear partial differential equations (PDEs) The framework consists of two key components: (a) a convolutional neural network (CNN)-based autoencoder architecture and (b) an autoregressive model composed of fully connected layers.
arXiv Detail & Related papers (2024-08-26T17:50:47Z)
Synthetic location trajectory generation using categorical diffusion models [50.809683239937584]
Diffusion models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data. We propose using DPMs for the generation of synthetic individual location trajectories (ILTs) which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z)
Recurrent neural networks and transfer learning for elasto-plasticity in woven composites [0.0]
This article presents Recurrent Neural Network (RNN) models as a surrogate for computationally intensive meso-scale simulation of woven composites. A mean-field model generates a comprehensive data set representing elasto-plastic behavior. In simulations, arbitrary six-dimensional strain histories are used to predict stresses under random walking as the source task and cyclic loading conditions as the target task.
arXiv Detail & Related papers (2023-11-22T14:47:54Z)
Point-Based Value Iteration for POMDPs with Neural Perception Mechanisms [31.51588071503617]
We introduce neuro-symbolic partially observable Markov decision processes (NS-POMDPs) We propose a novel piecewise linear and convex representation (P-PWLC) in terms of polyhedra covering the state space and value vectors. We show the practical applicability of our approach on two case studies that employ (trained) ReLU neural networks as perception functions.
arXiv Detail & Related papers (2023-06-30T13:26:08Z)
ETLP: Event-based Three-factor Local Plasticity for online learning with neuromorphic hardware [105.54048699217668]
We show a competitive performance in accuracy with a clear advantage in the computational complexity for Event-Based Three-factor Local Plasticity (ETLP) We also show that when using local plasticity, threshold adaptation in spiking neurons and a recurrent topology are necessary to learntemporal patterns with a rich temporal structure.
arXiv Detail & Related papers (2023-01-19T19:45:42Z)
Generalized Neural Closure Models with Interpretability [28.269731698116257]
We develop a novel and versatile methodology of unified neural partial delay differential equations. We augment existing/low-fidelity dynamical models directly in their partial differential equation (PDE) forms with both Markovian and non-Markovian neural network (NN) closure parameterizations. We demonstrate the new generalized neural closure models (gnCMs) framework using four sets of experiments based on advecting nonlinear waves, shocks, and ocean acidification models.
arXiv Detail & Related papers (2023-01-15T21:57:43Z)
Distributed Bayesian Learning of Dynamic States [65.7870637855531]
The proposed algorithm is a distributed Bayesian filtering task for finite-state hidden Markov models. It can be used for sequential state estimation, as well as for modeling opinion formation over social networks under dynamic environments.
arXiv Detail & Related papers (2022-12-05T19:40:17Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs) We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
Continual Weight Updates and Convolutional Architectures for Equilibrium Propagation [69.87491240509485]
Equilibrium Propagation (EP) is a biologically inspired alternative algorithm to backpropagation (BP) for training neural networks. We propose a discrete-time formulation of EP which enables to simplify equations, speed up training and extend EP to CNNs. Our CNN model achieves the best performance ever reported on MNIST with EP.
arXiv Detail & Related papers (2020-04-29T12:14:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.