ZORB: A Derivative-Free Backpropagation Algorithm for Neural Networks
- URL: http://arxiv.org/abs/2011.08895v1
- Date: Tue, 17 Nov 2020 19:29:47 GMT
- Title: ZORB: A Derivative-Free Backpropagation Algorithm for Neural Networks
- Authors: Varun Ranganathan, Alex Lewandowski
- Abstract summary: We present a simple yet faster training algorithm called Zeroth-Order Relaxed Backpropagation (ZORB).
Instead of calculating gradients, ZORB uses the pseudoinverse of targets to backpropagate information.
Experiments on standard classification and regression benchmarks demonstrate ZORB's advantage over traditional backpropagation with Gradient Descent.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient descent and backpropagation have enabled neural networks to achieve
remarkable results in many real-world applications. Despite ongoing success,
training a neural network with gradient descent can be a slow and strenuous
affair. We present a simple yet faster training algorithm called Zeroth-Order
Relaxed Backpropagation (ZORB). Instead of calculating gradients, ZORB uses the
pseudoinverse of targets to backpropagate information. ZORB is designed to
reduce the time required to train deep neural networks without penalizing
performance. To illustrate the speedup, we trained a feed-forward neural
network with 11 layers on MNIST and observed that ZORB converged 300 times
faster than Adam while achieving a comparable error rate, without any
hyperparameter tuning. We also broaden the scope of ZORB to convolutional
neural networks, and apply it to subsamples of the CIFAR-10 dataset.
Experiments on standard classification and regression benchmarks demonstrate
ZORB's advantage over traditional backpropagation with Gradient Descent.
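The abstract does not spell out ZORB's update rules, so the following is only a minimal sketch of the general idea it describes: fitting layers with pseudoinverses instead of gradients. It is not the authors' algorithm. The function names (`fit_two_layer`, `predict`, `add_bias`, `inv_tanh`), the tanh activation, the single hidden layer, and the clipping constant are all assumptions made for illustration.

```python
import numpy as np

def add_bias(a):
    # Append a column of ones so the bias is learned as an extra weight row.
    return np.hstack([a, np.ones((a.shape[0], 1))])

def inv_tanh(t, eps=1e-6):
    # Clip into tanh's open range (-1, 1) so arctanh stays finite.
    return np.arctanh(np.clip(t, -1.0 + eps, 1.0 - eps))

def fit_two_layer(x, y, hidden=16, seed=0):
    """Gradient-free fit of y ~ tanh([x, 1] @ w1) extended by a bias, times w2."""
    rng = np.random.default_rng(seed)
    # Start from random first-layer weights (an assumption of this sketch).
    w1 = rng.normal(scale=0.5, size=(x.shape[1] + 1, hidden))

    # 1. Solve the output layer by least squares using a pseudoinverse.
    h = np.tanh(add_bias(x) @ w1)
    w2 = np.linalg.pinv(add_bias(h)) @ y

    # 2. Push the targets backward through the output layer:
    #    find hidden activations that would reproduce y under w2.
    h_target = (y - w2[-1]) @ np.linalg.pinv(w2[:-1])

    # 3. Invert the activation and solve the first layer by least squares.
    w1 = np.linalg.pinv(add_bias(x)) @ inv_tanh(h_target)

    # 4. Refit the output layer on the updated hidden features.
    h = np.tanh(add_bias(x) @ w1)
    w2 = np.linalg.pinv(add_bias(h)) @ y
    return w1, w2

def predict(x, w1, w2):
    # Forward pass with the same bias handling used during fitting.
    return add_bias(np.tanh(add_bias(x) @ w1)) @ w2
```

Every step is a closed-form linear solve, so there is no learning rate and no iteration over epochs. That is in the spirit of the speed claim in the abstract, though nothing here reproduces the reported results.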
Related papers
- Globally Optimal Training of Neural Networks with Threshold Activation Functions (2023-03-06T18:59:13Z)
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
- Intelligence Processing Units Accelerate Neuromorphic Learning (2022-11-19T15:44:08Z)
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
- Desire Backpropagation: A Lightweight Training Algorithm for Multi-Layer Spiking Neural Networks based on Spike-Timing-Dependent Plasticity (2022-11-10T08:32:13Z)
Spiking neural networks (SNNs) are a viable alternative to conventional artificial neural networks.
We present desire backpropagation, a method to derive the desired spike activity of all neurons, including the hidden ones.
We trained three-layer networks to classify MNIST and Fashion-MNIST images and reached an accuracy of 98.41% and 87.56%, respectively.
- Online Training Through Time for Spiking Neural Networks (2022-10-09T07:47:56Z)
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation (2022-05-01T12:44:49Z)
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
- Navigating Local Minima in Quantized Spiking Neural Networks (2022-02-15T06:42:25Z)
Spiking and Quantized Neural Networks (NNs) are becoming exceedingly important for hyper-efficient implementations of Deep Learning (DL) algorithms.
These networks face challenges when trained using error backpropagation, due to the absence of gradient signals when applying hard thresholds.
This paper presents a systematic evaluation of a cosine-annealed LR schedule coupled with weight-independent adaptive moment estimation.
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks (2021-10-12T01:11:07Z)
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State (2021-09-29T07:46:54Z)
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
- Revisiting Batch Normalization for Training Low-latency Deep Spiking Neural Networks from Scratch (2020-10-05T00:49:30Z)
Spiking Neural Networks (SNNs) have emerged as an alternative to deep learning.
High-accuracy and low-latency SNNs from scratch suffer from non-differentiable nature of a spiking neuron.
We propose a temporal Batch Normalization Through Time (BNTT) technique for training temporal SNNs.
- A Hybrid Method for Training Convolutional Neural Networks (2020-04-15T17:52:48Z)
We propose a hybrid method that uses both backpropagation and evolutionary strategies to train Convolutional Neural Networks.
We show that the proposed hybrid method is capable of improving upon regular training in the task of image classification.
This list is automatically generated from the titles and abstracts of the papers on this site.