Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing
its Gradient Estimator Bias
- URL: http://arxiv.org/abs/2101.05536v1
- Date: Thu, 14 Jan 2021 10:23:40 GMT
- Title: Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing
its Gradient Estimator Bias
- Authors: Axel Laborieux, Maxence Ernoult, Benjamin Scellier, Yoshua Bengio,
Julie Grollier and Damien Querlioz
- Abstract summary: In practice, EP does not scale to visual tasks harder than MNIST.
We show that a bias in the gradient estimate of EP, inherent in the use of finite nudging, is responsible for this phenomenon.
These results highlight EP as a scalable approach to compute error gradients in deep neural networks, thereby motivating its hardware implementation.
- Score: 62.43908463620527
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equilibrium Propagation (EP) is a biologically-inspired counterpart of
Backpropagation Through Time (BPTT) which, owing to its strong theoretical
guarantees and the locality in space of its learning rule, fosters the design
of energy-efficient hardware dedicated to learning. In practice, however, EP
does not scale to visual tasks harder than MNIST. In this work, we show that a
bias in the gradient estimate of EP, inherent in the use of finite nudging, is
responsible for this phenomenon and that cancelling it allows training deep
ConvNets by EP, including architectures with distinct forward and backward
connections. These results highlight EP as a scalable approach to compute error
gradients in deep neural networks, thereby motivating its hardware
implementation.
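To make the role of the finite-nudging bias concrete, below is a minimal NumPy sketch (not the authors' code) comparing the classic one-sided EP gradient estimate with the symmetric, two-sided estimate that the paper advocates. The toy quadratic energy, the matrices A and W, and the input/target vectors x and y are illustrative assumptions chosen so that the fixed points and the exact loss gradient have closed forms; only the two estimator formulas follow the EP recipe. Shrinking beta should show the one-sided error decreasing roughly linearly and the symmetric error roughly quadratically.

```python
# A minimal NumPy sketch (not the authors' code) of the finite-nudging bias.
# Toy setting: quadratic energy E(s) = 0.5 s^T A s - s^T (W x) with cost
# C(s) = 0.5 ||s - y||^2, so the free and nudged fixed points of
# F = E + beta * C, and the exact loss gradient, are all closed-form.
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))
A = A @ A.T + d * np.eye(d)                     # symmetric positive definite
W = rng.normal(size=(d, d))                     # trainable parameter
x, y = rng.normal(size=d), rng.normal(size=d)   # input and target

def fixed_point(beta):
    """Minimiser of F(s) = E(s) + beta * C(s) for this quadratic toy model."""
    return np.linalg.solve(A + beta * np.eye(d), W @ x + beta * y)

def dF_dW(s):
    """dF/dW at state s; only E depends on W here, and dE/dW = -s x^T."""
    return -np.outer(s, x)

s0 = fixed_point(0.0)
# Exact gradient of L(W) = C(s0(W)) for this linear model: A^{-1}(s0 - y) x^T.
grad_true = np.outer(np.linalg.solve(A, s0 - y), x)

for beta in [0.5, 0.1, 0.02]:
    s_pos, s_neg = fixed_point(beta), fixed_point(-beta)
    one_sided = (dF_dW(s_pos) - dF_dW(s0)) / beta           # classic EP estimate, O(beta) bias
    symmetric = (dF_dW(s_pos) - dF_dW(s_neg)) / (2 * beta)  # symmetric nudging, O(beta^2) bias
    err1 = np.linalg.norm(one_sided - grad_true)
    err2 = np.linalg.norm(symmetric - grad_true)
    print(f"beta={beta:5.2f}  one-sided error={err1:.2e}  symmetric error={err2:.2e}")
```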
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator network (DeepONet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time; a minimal sketch of this sampling idea appears after this list.
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
arXiv Detail & Related papers (2024-09-20T07:18:31Z) - Layer-wise Feedback Propagation [53.00944147633484]
We present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors.
LFP assigns rewards to individual connections based on their respective contributions to solving a given task.
We demonstrate its effectiveness in achieving comparable performance to gradient descent on various models and datasets.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - A Theoretical Framework for Inference and Learning in Predictive Coding
Networks [41.58529335439799]
Predictive coding (PC) is an influential theory in computational neuroscience.
We provide a comprehensive theoretical analysis of the properties of PCNs trained with prospective configuration.
arXiv Detail & Related papers (2022-07-21T04:17:55Z) - Towards Scaling Difference Target Propagation by Learning Backprop
Targets [64.90165892557776]
Difference Target Propagation (DTP) is a biologically plausible learning algorithm closely related to Gauss-Newton (GN) optimization.
We propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored.
We report the best performance ever achieved by DTP on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2022-01-31T18:20:43Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
arXiv Detail & Related papers (2020-06-25T12:07:06Z) - Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing
its Gradient Estimator Bias [65.13042449121411]
In practice, training a network with the gradient estimates provided by EP does not scale to visual tasks harder than MNIST.
We show that a bias in the gradient estimate of EP, inherent in the use of finite nudging, is responsible for this phenomenon.
We apply these techniques to train an architecture with asymmetric forward and backward connections, yielding a 13.2% test error.
arXiv Detail & Related papers (2020-06-06T09:36:07Z) - Continual Weight Updates and Convolutional Architectures for Equilibrium
Propagation [69.87491240509485]
Equilibrium Propagation (EP) is a biologically inspired alternative algorithm to backpropagation (BP) for training neural networks.
We propose a discrete-time formulation of EP that simplifies the equations, speeds up training, and extends EP to CNNs; a simplified sketch of such a training step appears after this list.
Our CNN model achieves the best performance ever reported on MNIST with EP.
arXiv Detail & Related papers (2020-04-29T12:14:06Z)
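For the discrete-time formulation of EP mentioned in the last entry above, the following is a simplified NumPy sketch of one training step under assumptions of my own (one hidden layer, hard-sigmoid activation, symmetric feedback weights W2.T, a fixed number of relaxation steps); the paper's exact equations, CNN architecture, and continual weight updates are not reproduced here.

```python
# A simplified NumPy sketch of a discrete-time EP training step for a
# one-hidden-layer network (hard-sigmoid activations, scalar nudging beta).
# It follows the general EP recipe: free phase, nudged phase, local
# contrastive weight update; details differ from the paper's formulation.
import numpy as np

rng = np.random.default_rng(1)
sigma = lambda s: np.clip(s, 0.0, 1.0)       # hard-sigmoid activation

n_in, n_hid, n_out = 10, 32, 4
W1 = rng.normal(scale=0.1, size=(n_hid, n_in))
W2 = rng.normal(scale=0.1, size=(n_out, n_hid))
x = rng.random(n_in)
y = np.eye(n_out)[2]                         # one-hot target
beta, lr, T = 0.2, 0.05, 30

def relax(beta, h, o):
    """Iterate the discrete-time dynamics toward (approximate) equilibrium."""
    for _ in range(T):
        h = sigma(W1 @ x + W2.T @ o)         # hidden layer: feedforward + feedback
        o = sigma(W2 @ h + beta * (y - o))   # output layer, nudged when beta != 0
    return h, o

# Free phase (beta = 0), then nudged phase started from the free fixed point.
h0, o0 = relax(0.0, np.zeros(n_hid), np.zeros(n_out))
hb, ob = relax(beta, h0, o0)

# Local, contrastive EP weight updates: difference of pre/post activity
# products between the two phases, scaled by 1/beta.
W1 += lr / beta * (np.outer(hb, x) - np.outer(h0, x))
W2 += lr / beta * (np.outer(ob, hb) - np.outer(o0, h0))
print("output after one update:", np.round(relax(0.0, h0, o0)[1], 2))
```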
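The randomized-sampling entry above refers here: a minimal NumPy sketch of the general idea, with stand-in linear/tanh branch and trunk maps (branch_w, trunk_w) and a synthetic 1-D grid, all of which are illustrative assumptions rather than the authors' architecture or code; only the per-iteration subsampling of trunk inputs (output locations) reflects the described technique.

```python
# A minimal sketch of randomized sampling for operator-network training:
# at each iteration, evaluate the prediction and loss only at a random
# subset of output locations (trunk inputs) instead of the full grid.
# Parameter updates are omitted for brevity.
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_grid, latent, k = 50, 200, 16, 32    # k << n_grid points per step

branch_w = rng.normal(scale=0.1, size=(latent, n_sensors))  # stand-in branch net
trunk_w = rng.normal(scale=0.1, size=(latent, 1))           # stand-in trunk net
u_sensors = rng.random((8, n_sensors))            # batch of input functions
y_grid = np.linspace(0.0, 1.0, n_grid)[:, None]   # full set of output locations
targets = rng.random((8, n_grid))                 # reference solution on the grid

for step in range(3):
    idx = rng.choice(n_grid, size=k, replace=False)   # random trunk inputs
    b = np.tanh(u_sensors @ branch_w.T)               # (batch, latent)
    t = np.tanh(y_grid[idx] @ trunk_w.T)              # (k, latent)
    pred = b @ t.T                                    # DeepONet-style dot product
    loss = np.mean((pred - targets[:, idx]) ** 2)     # loss on sampled points only
    print(f"step {step}: loss on {k}/{n_grid} sampled locations = {loss:.4f}")
```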