Convergence and Alignment of Gradient Descent with Random Backpropagation Weights
- URL: http://arxiv.org/abs/2106.06044v1
- Date: Thu, 10 Jun 2021 20:58:05 GMT
- Title: Convergence and Alignment of Gradient Descent with Random Backpropagation Weights
- Authors: Ganlin Song, Ruitu Xu, John Lafferty
- Abstract summary: Stochastic gradient descent with backpropagation is the workhorse of artificial neural networks.
Lillicrap et al. propose a more biologically plausible "feedback alignment" algorithm that uses random and fixed backpropagation weights.
- Score: 6.338178373376447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic gradient descent with backpropagation is the workhorse of
artificial neural networks. It has long been recognized that backpropagation
fails to be a biologically plausible algorithm. Fundamentally, it is a
non-local procedure -- updating one neuron's synaptic weights requires
knowledge of synaptic weights or receptive fields of downstream neurons. This
limits the use of artificial neural networks as a tool for understanding the
biological principles of information processing in the brain. Lillicrap et al.
(2016) propose a more biologically plausible "feedback alignment" algorithm
that uses random and fixed backpropagation weights, and show promising
simulations. In this paper we study the mathematical properties of the feedback
alignment procedure by analyzing convergence and alignment for two-layer
networks under squared error loss. In the overparameterized setting, we prove
that the error converges to zero exponentially fast, and also that
regularization is necessary in order for the parameters to become aligned with
the random backpropagation weights. Simulations are given that are consistent
with this analysis and suggest further generalizations. These results
contribute to our understanding of how biologically plausible algorithms might
carry out weight learning in a manner different from Hebbian learning, with
performance that is comparable with the full non-local backpropagation
algorithm.
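The abstract describes a two-layer network trained under squared error loss, where the backward pass replaces the transpose of the output weights with a fixed random feedback matrix. The NumPy sketch below illustrates that update rule under stated assumptions: the ReLU nonlinearity, layer sizes, learning rate, and synthetic Gaussian data are illustrative choices, not details taken from the paper.

```python
# Minimal sketch of feedback alignment for a two-layer network under
# squared error loss. Layer sizes, ReLU, learning rate, and the synthetic
# data are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, d, m, k = 200, 10, 512, 1                # samples, input dim, hidden width, output dim
X = rng.normal(size=(n, d))
y = rng.normal(size=(n, k))

W1 = rng.normal(size=(d, m)) / np.sqrt(d)   # trainable first-layer weights
W2 = rng.normal(size=(m, k)) / np.sqrt(m)   # trainable second-layer weights
B  = rng.normal(size=(m, k)) / np.sqrt(m)   # fixed random feedback weights

lr = 1e-2
for step in range(1000):
    Z = X @ W1                              # pre-activations
    H = np.maximum(Z, 0.0)                  # ReLU hidden layer
    Yhat = H @ W2
    E = Yhat - y                            # error under squared loss

    # Backpropagation would propagate E through W2.T; feedback alignment
    # uses the fixed random matrix B instead.
    delta_hidden = (E @ B.T) * (Z > 0)

    W2 -= lr * H.T @ E / n
    W1 -= lr * X.T @ delta_hidden / n

print("final MSE:", float(np.mean((np.maximum(X @ W1, 0.0) @ W2 - y) ** 2)))
```

Only the forward weights W1 and W2 are trained; the alignment analyzed in the paper refers to the trained forward parameters becoming correlated with the fixed feedback matrix B over the course of training.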
Related papers
- SGD method for entropy error function with smoothing l0 regularization for neural networks [3.108634881604788]
The entropy error function has been widely used in neural networks.
We propose a novel entropy error function with smoothing l0 regularization for feed-forward neural networks.
This enables neural networks to learn effectively and produce more accurate predictions.
arXiv Detail & Related papers (2024-05-28T19:54:26Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Refining neural network predictions using background knowledge [68.35246878394702]
We show that logical background knowledge can be used in a learning system to compensate for a lack of labeled training data.
We introduce differentiable refinement functions that find a corrected prediction close to the original prediction.
This algorithm finds optimal refinements on complex SAT formulas in significantly fewer iterations and frequently finds solutions where gradient descent cannot.
arXiv Detail & Related papers (2022-06-10T10:17:59Z)
- Accelerating Understanding of Scientific Experiments with End to End Symbolic Regression [12.008215939224382]
We develop a deep neural network to address the problem of learning free-form symbolic expressions from raw data.
We train our neural network on a synthetic dataset consisting of data tables of varying length and varying levels of noise.
We validate our technique by running on a public dataset from behavioral science.
arXiv Detail & Related papers (2021-12-07T22:28:53Z)
- A Normative and Biologically Plausible Algorithm for Independent Component Analysis [15.082715993594121]
In signal processing, linear blind source separation problems are often solved by Independent Component Analysis (ICA).
To serve as a model of a biological circuit, an ICA neural network (NN) must satisfy certain requirements.
We propose a novel objective function for ICA from which we derive a biologically plausible NN, including both the neural architecture and the synaptic learning rules.
arXiv Detail & Related papers (2021-11-17T01:43:42Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures of artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve the separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
arXiv Detail & Related papers (2020-09-11T11:56:34Z)
- Learning compositional functions via multiplicative weight updates [97.9457834009578]
We show that multiplicative weight updates satisfy a descent lemma tailored to compositional functions.
We show that Madam can train state of the art neural network architectures without learning rate tuning.
arXiv Detail & Related papers (2020-06-25T17:05:19Z)
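The last entry above describes Madam, which trains networks with multiplicative rather than additive weight updates. As a rough illustration of the general idea only, here is a minimal exponentiated-gradient style update on a toy problem; the exact Madam update rule differs and is not reproduced here.

```python
# Generic exponentiated-gradient style multiplicative update on a toy
# quadratic objective. This is NOT the exact Madam rule, only an
# illustration of scaling weights by a multiplicative factor instead of
# taking an additive gradient step.
import numpy as np

rng = np.random.default_rng(0)
w = rng.uniform(0.1, 1.0, size=5)           # positive weights (assumed)
target = np.array([0.9, 0.2, 0.5, 0.7, 0.1])

lr = 0.1
for _ in range(500):
    g = w - target                          # gradient of 0.5 * ||w - target||^2
    w *= np.exp(-lr * g)                    # multiplicative (exponentiated) step

print(np.round(w, 3))                       # drifts toward the target
```

Because each weight is rescaled rather than shifted, the effective step size adapts to the magnitude of the weight, which is the property the multiplicative-update papers exploit.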