Error-driven Input Modulation: Solving the Credit Assignment Problem
without a Backward Pass
- URL: http://arxiv.org/abs/2201.11665v3
- Date: Sun, 4 Jun 2023 17:35:03 GMT
- Title: Error-driven Input Modulation: Solving the Credit Assignment Problem
without a Backward Pass
- Authors: Giorgia Dellaferrera, Gabriel Kreiman
- Abstract summary: Supervised learning in artificial neural networks typically relies on backpropagation.
We show that this approach lacks biological plausibility in many regards.
We propose to replace the backward pass with a second forward pass in which the input signal is modulated based on the error of the network.
- Score: 8.39059551023011
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Supervised learning in artificial neural networks typically relies on
backpropagation, where weight updates are computed from error-function gradients
that are propagated sequentially from the output layer back to the input layer.
Although this approach has proven effective in a wide domain of applications,
it lacks biological plausibility in many regards, including the weight symmetry
problem, the dependence of learning on non-local signals, the freezing of
neural activity during error propagation, and the update locking problem.
Alternative training schemes have been introduced, including sign symmetry,
feedback alignment, and direct feedback alignment, but they invariably rely on
a backward pass that hinders the possibility of solving all the issues
simultaneously. Here, we propose to replace the backward pass with a second
forward pass in which the input signal is modulated based on the error of the
network. We show that this novel learning rule comprehensively addresses all
the above-mentioned issues and can be applied to both fully connected and
convolutional models. We test this learning rule on MNIST, CIFAR-10, and
CIFAR-100. These results help incorporate biological principles into machine
learning.
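Below is a minimal, illustrative sketch of the two-forward-pass idea described in the abstract, written for a small fully connected ReLU network. The class name, the fixed error-projection matrix F, the per-layer update rule, and all hyperparameters are assumptions made for illustration; they follow a plain reading of the abstract rather than the authors' reference implementation.

```python
# Sketch: replace the backward pass with a second forward pass on an
# error-modulated input. Illustrative only; update details are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class TwoForwardPassNet:
    def __init__(self, sizes, lr=0.01):
        # sizes = [input_dim, hidden_dim, ..., output_dim]
        self.W = [rng.normal(0.0, np.sqrt(2.0 / m), size=(n, m))
                  for m, n in zip(sizes[:-1], sizes[1:])]
        # Fixed random matrix projecting the output error back onto the input.
        self.F = rng.uniform(-0.1, 0.1, size=(sizes[0], sizes[-1]))
        self.lr = lr

    def forward(self, x):
        hs = [x]
        for W in self.W:
            hs.append(relu(W @ hs[-1]))  # ReLU at every layer, kept simple
        return hs

    def train_step(self, x, y_target):
        # First forward pass: clean input.
        hs = self.forward(x)
        e = hs[-1] - y_target                      # output error
        # Second forward pass: input modulated by the projected error.
        hs_err = self.forward(x + self.F @ e)
        # Local updates: each hidden layer compares its activations across the
        # two passes; the output layer uses the error directly.
        for l, W in enumerate(self.W):
            is_last = (l == len(self.W) - 1)
            delta = e if is_last else (hs[l + 1] - hs_err[l + 1])
            W -= self.lr * np.outer(delta, hs_err[l])
        return float(0.5 * e @ e)

# Toy usage (MNIST-like shapes chosen for illustration, not from the paper).
net = TwoForwardPassNet([784, 256, 10], lr=0.01)
x = rng.random(784)
y = np.zeros(10); y[3] = 1.0
loss = net.train_step(x, y)
```

Both passes run in the forward direction and no transposed weight matrices are used, which is the sense in which the abstract's weight-symmetry and backward-pass concerns are avoided in this sketch.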
Related papers
- Prime and Modulate Learning: Generation of forward models with signed
back-propagation and environmental cues [0.0]
Deep neural networks employing error back-propagation for learning can suffer from exploding and vanishing gradient problems.
In this work we follow a different approach where back-propagation makes exclusive use of the sign of the error signal to prime the learning.
We present a mathematical derivation of the learning rule in z-space and demonstrate the real-time performance with a robotic platform.
arXiv Detail & Related papers (2023-09-07T16:34:30Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Magnitude Matters: Fixing SIGNSGD Through Magnitude-Aware Sparsification
in the Presence of Data Heterogeneity [60.791736094073]
Communication overhead has become one of the major bottlenecks in the distributed training of deep neural networks.
We propose a magnitude-driven sparsification scheme, which addresses the non-convergence issue of SIGNSGD.
The proposed scheme is validated through experiments on Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets.
arXiv Detail & Related papers (2023-02-19T17:42:35Z)
- The least-control principle for learning at equilibrium [65.2998274413952]
We present a new principle for learning that applies to equilibrium recurrent neural networks, deep equilibrium models, and meta-learning.
Our results shed light on how the brain might learn and offer new ways of approaching a broad class of machine learning problems.
arXiv Detail & Related papers (2022-07-04T11:27:08Z)
- Sign and Relevance Learning [0.0]
Standard models of biologically realistic reinforcement learning employ a global error signal, which implies the use of shallow networks.
In this study, we introduce a novel network that solves this problem by propagating only the sign of the plasticity change.
Neuromodulation can be understood as a rectified error or relevance signal, while the top-down sign of the error signal determines whether long-term depression will occur.
arXiv Detail & Related papers (2021-10-14T11:57:57Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Gradient-trained Weights in Wide Neural Networks Align Layerwise to
Error-scaled Input Correlations [11.176824373696324]
We derive the layerwise weight dynamics of infinite-width neural networks with nonlinear activations trained by gradient descent.
We formulate backpropagation-free learning rules, named Align-zero and Align-ada, that theoretically achieve the same alignment as backpropagation.
arXiv Detail & Related papers (2021-06-15T21:56:38Z)
- Align, then memorise: the dynamics of learning with feedback alignment [12.587037358391418]
Direct Feedback Alignment (DFA) is an efficient alternative to the ubiquitous backpropagation algorithm for training deep neural networks.
DFA successfully trains state-of-the-art models such as Transformers, but it notoriously fails to train convolutional networks.
Here, we propose a theory for the success of DFA.
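A minimal illustrative sketch of the DFA update appears after this list.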
arXiv Detail & Related papers (2020-11-24T22:21:27Z)
- Biological credit assignment through dynamic inversion of feedforward
networks [11.345796608258434]
We show that feedforward network transformations can be effectively inverted through dynamics.
We map these dynamics onto generic feedforward networks, and show that the resulting algorithm performs well on supervised and unsupervised datasets.
arXiv Detail & Related papers (2020-07-10T00:03:01Z)
- Feature Purification: How Adversarial Training Performs Robust Deep
Learning [66.05472746340142]
We identify a principle we call Feature Purification: one cause of adversarial examples is the accumulation of certain small, dense mixtures in the hidden weights during the training of a neural network.
We present experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that, for certain natural classification tasks, training a two-layer neural network with ReLU activation by randomly initialized gradient descent indeed follows this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
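For contrast with the error-driven input modulation rule above, here is a minimal sketch of Direct Feedback Alignment (DFA), the scheme discussed in the "Align, then memorise" entry and in the main abstract: the output error is delivered to each hidden layer through a fixed random feedback matrix instead of the transposed forward weights. The two-layer setup, matrix shapes, and hyperparameters are illustrative assumptions, not the cited paper's configuration.

```python
# Sketch of Direct Feedback Alignment (DFA) for a two-layer ReLU network.
# Illustrative assumptions: network size, learning rate, feedback matrix scale.
import numpy as np

rng = np.random.default_rng(1)
d_in, d_h, d_out, lr = 784, 256, 10, 0.01

W1 = rng.normal(0.0, np.sqrt(2.0 / d_in), (d_h, d_in))   # input -> hidden
W2 = rng.normal(0.0, np.sqrt(2.0 / d_h), (d_out, d_h))   # hidden -> output
B1 = rng.uniform(-0.1, 0.1, (d_h, d_out))                # fixed random feedback

def dfa_step(x, y):
    global W1, W2                   # update the module-level weights
    a1 = W1 @ x
    h1 = np.maximum(a1, 0.0)        # ReLU hidden layer
    y_hat = W2 @ h1                 # linear readout
    e = y_hat - y                   # output error
    # Feedback: the error reaches the hidden layer through B1, not W2.T,
    # so no weight transport is required.
    delta1 = (B1 @ e) * (a1 > 0)
    W2 -= lr * np.outer(e, h1)
    W1 -= lr * np.outer(delta1, x)
    return float(0.5 * e @ e)
```

Unlike the input-modulation rule above, DFA still uses a dedicated feedback path to broadcast the error to each layer; this is the kind of backward pass that the main paper replaces with a second forward pass.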
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.