Neural Network Training with Asymmetric Crosspoint Elements
- URL: http://arxiv.org/abs/2201.13377v1
- Date: Mon, 31 Jan 2022 17:41:36 GMT
- Title: Neural Network Training with Asymmetric Crosspoint Elements
- Authors: Murat Onen, Tayfun Gokmen, Teodor K. Todorov, Tomasz Nowicki, Jesus A. del Alamo, John Rozen, Wilfried Haensch, Seyoung Kim
- Abstract summary: Asymmetric conductance modulation of practical resistive devices critically degrades the classification performance of networks trained with conventional algorithms.
Here, we describe and experimentally demonstrate an alternative fully-parallel training algorithm: Stochastic Hamiltonian Descent.
We provide critical intuition on why device asymmetry is fundamentally incompatible with conventional training algorithms and how the new approach exploits it as a useful feature instead.
- Score: 1.0773924713784704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Analog crossbar arrays comprising programmable nonvolatile resistors are
under intense investigation for acceleration of deep neural network training.
However, the ubiquitous asymmetric conductance modulation of practical
resistive devices critically degrades the classification performance of
networks trained with conventional algorithms. Here, we describe and
experimentally demonstrate an alternative fully-parallel training algorithm:
Stochastic Hamiltonian Descent. Instead of conventionally tuning weights in the
direction of the error function gradient, this method programs the network
parameters to successfully minimize the total energy (Hamiltonian) of the
system that incorporates the effects of device asymmetry. We provide critical
intuition on why device asymmetry is fundamentally incompatible with
conventional training algorithms and how the new approach exploits it as a
useful feature instead. Our technique enables immediate realization of analog
deep learning accelerators based on readily available device technologies.
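The intuition in the abstract can be made concrete with a toy device model. The sketch below (Python, illustrative only, not the paper's algorithm) uses a soft-bounds conductance model in which the achievable update shrinks near the device limits; the resulting drift toward the device's symmetric point is the asymmetry effect that conventional gradient descent cannot tolerate and that, per the abstract, Stochastic Hamiltonian Descent folds into the total energy being minimized. The bounds, the symmetric point at zero, and all names and constants are assumptions made for illustration.

```python
import numpy as np

# Minimal sketch (not the paper's algorithm): a soft-bounds model of an
# asymmetric crosspoint device, illustrating why asymmetric conductance
# modulation clashes with plain gradient descent.

W_MAX, W_MIN = 1.0, -1.0  # assumed normalized conductance limits


def asymmetric_update(w, dw_ideal):
    """Pass an ideal update dw_ideal through an asymmetric device.

    Potentiation (dw_ideal > 0) saturates as w approaches W_MAX;
    depression (dw_ideal < 0) saturates as w approaches W_MIN.
    """
    if dw_ideal >= 0.0:
        return w + dw_ideal * (W_MAX - w) / (W_MAX - W_MIN)
    return w + dw_ideal * (w - W_MIN) / (W_MAX - W_MIN)


# With these +/-1 bounds the realized update decomposes as
#   w_new = w + dw_ideal / 2 - |dw_ideal| * w / 2,
# i.e. every step adds a drift toward the device's symmetric point (w = 0)
# whose size tracks the *magnitude* of the update, not its direction.
# Under conventional SGD this drift acts as an uncontrolled, data-dependent
# decay that corrupts the descent direction; the abstract's point is that
# Stochastic Hamiltonian Descent instead treats such device-dependent terms
# as part of the total energy (Hamiltonian) being minimized.

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = 0.8
    for _ in range(5):
        g = rng.normal(scale=0.1)           # stand-in for a gradient step
        intended = w + g                     # what ideal SGD would produce
        realized = asymmetric_update(w, g)   # what the device actually does
        print(f"intended {intended:+.4f}  realized {realized:+.4f}")
        w = realized
```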
Related papers
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach builds on a recently introduced framework for learning neural-network-based surrogate models of Lattice Boltzmann collision operators.
Our work is a step towards the practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Enriched Physics-informed Neural Networks for Dynamic Poisson-Nernst-Planck Systems [0.8192907805418583]
This paper proposes a meshless deep learning algorithm, enriched physics-informed neural networks (EPINNs), to solve dynamic Poisson-Nernst-Planck (PNP) equations.
EPINNs take traditional physics-informed neural networks as the foundational framework and add an adaptive loss weight to balance the loss functions.
Numerical results indicate that the new method has better applicability than traditional numerical methods in solving such coupled nonlinear systems.
arXiv Detail & Related papers (2024-02-01T02:57:07Z)
- Gradient-descent hardware-aware training and deployment for mixed-signal Neuromorphic processors [2.812395851874055]
Mixed-signal neuromorphic processors provide extremely low-power operation for edge inference workloads.
We demonstrate a novel methodology for offline training and deployment of spiking neural networks (SNNs) to the mixed-signal neuromorphic processor DYNAP-SE2.
arXiv Detail & Related papers (2023-03-14T08:56:54Z)
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
- Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- An error-propagation spiking neural network compatible with neuromorphic processors [2.432141667343098]
We present a spike-based learning method that approximates back-propagation using local weight update mechanisms.
We introduce a network architecture that enables synaptic weight update mechanisms to back-propagate error signals.
This work represents a first step towards the design of ultra-low power mixed-signal neuromorphic processing systems.
arXiv Detail & Related papers (2021-04-12T07:21:08Z)
- Supervised training of spiking neural networks for robust deployment on mixed-signal neuromorphic processors [2.6949002029513167]
Mixed-signal analog/digital electronic circuits can emulate spiking neurons and synapses with extremely high energy efficiency.
Mismatch is expressed as differences in effective parameters between identically-configured neurons and synapses.
We present a supervised learning approach that addresses this challenge by maximizing robustness to mismatch and other common sources of noise.
arXiv Detail & Related papers (2021-02-12T09:20:49Z)
- Relaxing the Constraints on Predictive Coding Models [62.997667081978825]
Predictive coding is an influential theory of cortical function which posits that the principal computation the brain performs is the minimization of prediction errors.
Standard implementations of the algorithm still involve potentially neurally implausible features such as identical forward and backward weights, backward nonlinear derivatives, and 1-1 error unit connectivity.
In this paper, we show that these features are not integral to the algorithm and can be removed either directly or through learning additional sets of parameters with Hebbian update rules without noticeable harm to learning performance.
arXiv Detail & Related papers (2020-10-02T15:21:37Z)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)