Training End-to-End Analog Neural Networks with Equilibrium Propagation
- URL: http://arxiv.org/abs/2006.01981v2
- Date: Tue, 9 Jun 2020 22:26:05 GMT
- Title: Training End-to-End Analog Neural Networks with Equilibrium Propagation
- Authors: Jack Kendall, Ross Pantone, Kalpana Manickavasagam, Yoshua Bengio,
Benjamin Scellier
- Abstract summary: We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
- Score: 64.0476282000118
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a principled method to train end-to-end analog neural networks
by stochastic gradient descent. In these analog neural networks, the weights to
be adjusted are implemented by the conductances of programmable resistive
devices such as memristors [Chua, 1971], and the nonlinear transfer functions
(or `activation functions') are implemented by nonlinear components such as
diodes. We show mathematically that a class of analog neural networks (called
nonlinear resistive networks) are energy-based models: they possess an energy
function as a consequence of Kirchhoff's laws governing electrical circuits.
This property enables us to train them using the Equilibrium Propagation
framework [Scellier and Bengio, 2017]. Our update rule for each conductance,
which is local and relies solely on the voltage drop across the corresponding
resistor, is shown to compute the gradient of the loss function. Our numerical
simulations, which use the SPICE-based Spectre simulation framework to simulate
the dynamics of electrical circuits, demonstrate training on the MNIST
classification task, performing comparably to or better than equivalent-size
software-based neural networks. Our work can guide the development of a new
generation of ultra-fast, compact and low-power neural networks supporting
on-chip learning.
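To make the training procedure concrete, below is a minimal sketch of Equilibrium Propagation on a resistive network, assuming for simplicity a purely linear network (the paper treats nonlinear resistive networks with diodes; linearity is assumed here only so the equilibrium reduces to a single Laplacian solve) and a squared-error cost on the output node voltages. All names (`laplacian`, `solve_equilibrium`, `eqprop_update`, the conductance matrix `G`) are illustrative and not the authors' implementation.
```python
import numpy as np

# Hedged sketch of Equilibrium Propagation on a linear resistive network.
# The trainable weights are the conductances g_ij >= 0 of the resistors;
# the energy function is the co-content P(V) = 1/2 * sum_ij g_ij * (V_i - V_j)^2.

def laplacian(G):
    """Weighted graph Laplacian of a symmetric conductance matrix G."""
    return np.diag(G.sum(axis=1)) - G

def solve_equilibrium(G, v_in, in_idx, free_idx, beta=0.0, targets=None, out_idx=None):
    """Node voltages minimizing P(V), with input nodes clamped to v_in.
    For beta > 0, the cost beta/2 * ||V_out - targets||^2 is added (nudged phase)."""
    L = laplacian(G)
    A = L[np.ix_(free_idx, free_idx)].astype(float)
    b = -L[np.ix_(free_idx, in_idx)] @ np.asarray(v_in, dtype=float)
    if beta > 0.0:
        for k, o in enumerate(out_idx):          # nudging acts like extra current sources
            j = list(free_idx).index(o)          # out_idx assumed to be a subset of free_idx
            A[j, j] += beta
            b[j] += beta * targets[k]
    v = np.zeros(G.shape[0])
    v[in_idx] = v_in
    v[free_idx] = np.linalg.solve(A, b)
    return v

def eqprop_update(G, v_free, v_nudged, beta, lr):
    """Local rule: each conductance changes based only on the voltage drop across
    its own resistor: Delta g_ij = -lr * ((dV_nudged_ij)^2 - (dV_free_ij)^2) / (2*beta)."""
    dV0 = v_free[:, None] - v_free[None, :]
    dVb = v_nudged[:, None] - v_nudged[None, :]
    grad = (dVb ** 2 - dV0 ** 2) / (2.0 * beta)  # EqProp estimate of dLoss/dg_ij
    mask = G > 0                                 # only existing resistors are trained
    return np.where(mask, np.clip(G - lr * grad, 1e-6, None), 0.0)
```
A training step solves the free phase (beta = 0), then a weakly nudged phase (small beta > 0 with the target voltages), and applies `eqprop_update`; as beta goes to 0 this local rule follows the gradient of the loss, which is the paper's central result.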
Related papers
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical use of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Speed Limits for Deep Learning [67.69149326107103]
Recent advances in thermodynamics allow bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, given some plausible scaling assumptions on the NTK spectra and the spectral decomposition of the labels, learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Gradient-based Neuromorphic Learning on Dynamical RRAM Arrays [3.5969667977870796]
We present MEMprop, which adopts gradient-based learning to train fully memristive spiking neural networks (MSNNs).
Our approach harnesses intrinsic device dynamics to trigger naturally arising voltage spikes.
We obtain highly competitive accuracy among previously reported lightweight, dense MSNNs on several benchmarks.
arXiv Detail & Related papers (2022-06-26T23:13:34Z)
- Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits, we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
arXiv Detail & Related papers (2022-03-29T21:18:47Z)
- Neural net modeling of equilibria in NSTX-U [0.0]
We develop two neural networks relevant to equilibrium and shape control modeling.
Networks include Eqnet, a free-boundary equilibrium solver trained on the EFIT01 reconstruction algorithm, and Pertnet, which is trained on the Gspert code.
We report strong performance for both networks, indicating that these models could reliably be used within closed-loop simulations.
arXiv Detail & Related papers (2022-02-28T16:09:58Z)
- Neural Network Training with Asymmetric Crosspoint Elements [1.0773924713784704]
Asymmetric conductance modulation of practical resistive devices critically degrades the classification accuracy of networks trained with conventional algorithms.
Here, we describe and experimentally demonstrate an alternative fully-parallel training algorithm: Hamiltonian Descent.
We provide critical intuition on why device asymmetry is fundamentally incompatible with conventional training algorithms and how the new approach exploits it as a useful feature instead.
arXiv Detail & Related papers (2022-01-31T17:41:36Z)
- A Gradient Estimator for Time-Varying Electrical Networks with Non-Linear Dissipation [0.0]
We use electrical circuit theory to construct a Lagrangian capable of describing deep, directed neural networks.
We derive an estimator for the gradient of the physical parameters of the network, such as synapse conductances.
We conclude by suggesting methods for extending these results to networks of biologically plausible neurons.
arXiv Detail & Related papers (2021-03-09T02:07:39Z)
- Implementing efficient balanced networks with mixed-signal spike-based learning circuits [2.1640200483378953]
Efficient Balanced Networks (EBNs) are networks of spiking neurons in which excitatory and inhibitory synaptic currents are balanced on a short timescale.
We develop a novel local learning rule suitable for on-chip implementation that drives a randomly connected network of spiking neurons into a tightly balanced regime.
Thanks to their coding properties and sparse activity, neuromorphic electronic EBNs will be ideally suited for extreme-edge computing applications.
arXiv Detail & Related papers (2020-10-27T15:05:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.