A Theoretical Framework for Target Propagation
- URL: http://arxiv.org/abs/2006.14331v4
- Date: Wed, 16 Dec 2020 16:21:36 GMT
- Title: A Theoretical Framework for Target Propagation
- Authors: Alexander Meulemans, Francesco S. Carzaniga, Johan A.K. Suykens,
João Sacramento, Benjamin F. Grewe
- Abstract summary: We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
Our analysis also reveals a fundamental limitation of difference target propagation (DTP) in non-invertible networks, which we address with a novel reconstruction loss that improves feedback weight training.
- Score: 75.52598682467817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of deep learning, a brain-inspired form of AI, has sparked
interest in understanding how the brain could similarly learn across multiple
layers of neurons. However, the majority of biologically-plausible learning
algorithms have not yet reached the performance of backpropagation (BP), nor
are they built on strong theoretical foundations. Here, we analyze target
propagation (TP), a popular but not yet fully understood alternative to BP,
from the standpoint of mathematical optimization. Our theory shows that TP is
closely related to Gauss-Newton optimization and thus substantially differs
from BP. Furthermore, our analysis reveals a fundamental limitation of
difference target propagation (DTP), a well-known variant of TP, in the
realistic scenario of non-invertible neural networks. We provide a first
solution to this problem through a novel reconstruction loss that improves
feedback weight training, while simultaneously introducing architectural
flexibility by allowing for direct feedback connections from the output to each
hidden layer. Our theory is corroborated by experimental results that show
significant improvements in performance and in the alignment of forward weight
updates with loss gradients, compared to DTP.
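To make the scheme concrete, the sketch below walks through the classic DTP target computation that the abstract analyzes: each hidden layer receives a target propagated down through learned feedback mappings with a difference correction, and the forward weights are then nudged toward those targets with layer-local updates. This is a minimal illustration assuming a small fully-connected tanh network, an MSE output loss, and untrained random feedback weights; the layer sizes, step sizes, and helper names (f, g) are illustrative choices, not the authors' implementation.

# Minimal sketch of difference target propagation (DTP) on a tiny tanh
# network with an MSE output loss; all hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 2]                      # input, two hidden layers, output

# Forward weights W[l] and learned feedback weights Q[l] (feedback maps
# layer l+1 activations back to layer l; here they are simply random).
W = [rng.standard_normal((m, n)) * 0.5 for n, m in zip(sizes[:-1], sizes[1:])]
Q = [rng.standard_normal((n, m)) * 0.5 for n, m in zip(sizes[:-1], sizes[1:])]

def f(l, h):        # forward computation of layer l+1 from layer l
    return np.tanh(W[l] @ h)

def g(l, h_next):   # learned (approximate) inverse of layer l's forward map
    return np.tanh(Q[l] @ h_next)

x = rng.standard_normal(sizes[0])
y = np.array([1.0, -1.0])

# Forward pass, storing activations h[0..L].
h = [x]
for l in range(len(W)):
    h.append(f(l, h[-1]))

# Output target: a small gradient step on the output loss (MSE here).
beta = 0.1
h_hat = [None] * len(h)
h_hat[-1] = h[-1] - beta * (h[-1] - y)

# Propagate targets downward with the DTP difference correction:
#   h_hat[l] = g(h_hat[l+1]) + h[l] - g(h[l+1])
for l in reversed(range(1, len(h) - 1)):
    h_hat[l] = g(l, h_hat[l + 1]) + h[l] - g(l, h[l + 1])

# Forward weights are trained layer-locally to pull each activation toward
# its target, e.g. one SGD step on ||f(l, h[l]) - h_hat[l+1]||^2.
lr = 0.01
for l in range(len(W)):
    pre = W[l] @ h[l]
    err = (np.tanh(pre) - h_hat[l + 1]) * (1.0 - np.tanh(pre) ** 2)
    W[l] -= lr * np.outer(err, h[l])

In full DTP the feedback weights Q are themselves trained with a layer-wise reconstruction loss; the contribution summarized above is a modified reconstruction loss, together with direct feedback connections from the output to each hidden layer, aimed at the realistic non-invertible case where plain DTP falls short.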
Related papers
- Architectural Strategies for the optimization of Physics-Informed Neural
Networks [30.92757082348805]
Physics-informed neural networks (PINNs) offer a promising avenue for tackling both forward and inverse problems in partial differential equations (PDEs).
Despite their remarkable empirical success, PINNs are notoriously difficult to train across a spectrum of PDEs.
arXiv Detail & Related papers (2024-02-05T04:15:31Z)
- Layer-wise Feedback Propagation [53.00944147633484]
We present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors.
LFP assigns rewards to individual connections based on their respective contributions to solving a given task.
We demonstrate its effectiveness in achieving comparable performance to gradient descent on various models and datasets.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
- Learning with augmented target information: An alternative theory of
Feedback Alignment [0.0]
We propose a novel theory of how Feedback Alignment (FA) works through the lens of information theory.
FA learns effective representations by embedding target information into neural networks to be trained.
We show this through the analysis of FA dynamics in idealized settings and then via a series of experiments.
arXiv Detail & Related papers (2023-04-03T22:44:03Z)
- A Theoretical Framework for Inference and Learning in Predictive Coding
Networks [41.58529335439799]
Predictive coding (PC) is an influential theory in computational neuroscience.
We provide a comprehensive theoretical analysis of the properties of predictive coding networks (PCNs) trained with prospective configuration.
arXiv Detail & Related papers (2022-07-21T04:17:55Z)
- Towards Scaling Difference Target Propagation by Learning Backprop
Targets [64.90165892557776]
Difference Target Propagation is a biologically-plausible learning algorithm closely related to Gauss-Newton (GN) optimization.
We propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored.
We report the best performance ever achieved by DTP on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2022-01-31T18:20:43Z)
- A Theoretical View of Linear Backpropagation and Its Convergence [55.69505060636719]
Backpropagation (BP) is widely used for calculating gradients in deep neural networks (DNNs).
Recently, a linear variant of BP named LinBP was introduced for generating more transferable adversarial examples for performing black-box attacks.
We provide theoretical analyses on LinBP in neural-network-involved learning tasks, including adversarial attack and model training.
arXiv Detail & Related papers (2021-12-21T07:18:00Z)
- Target Propagation via Regularized Inversion [4.289574109162585]
We present a simple version of target propagation based on regularized inversion of network layers, easily implementable in a differentiable programming framework.
We show how our TP can be used to train recurrent neural networks on long sequences in various sequence modeling problems.
arXiv Detail & Related papers (2021-12-02T17:49:25Z)
- Credit Assignment in Neural Networks through Deep Feedback Control [59.14935871979047]
Deep Feedback Control (DFC) is a new learning method that uses a feedback controller to drive a deep neural network to match a desired output target; the control signal can then be used for credit assignment.
The resulting learning rule is fully local in space and time and approximates Gauss-Newton optimization for a wide range of connectivity patterns.
To further underline its biological plausibility, we relate DFC to a multi-compartment model of cortical pyramidal neurons with a local voltage-dependent synaptic plasticity rule, consistent with recent theories of dendritic processing.
arXiv Detail & Related papers (2021-06-15T05:30:17Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)