A Theoretical Framework for Inference Learning
- URL: http://arxiv.org/abs/2206.00164v1
- Date: Wed, 1 Jun 2022 00:38:55 GMT
- Title: A Theoretical Framework for Inference Learning
- Authors: Nick Alonso, Beren Millidge, Jeff Krichmar, Emre Neftci
- Abstract summary: Backpropagation (BP) is the most successful and widely used algorithm in deep learning.
The inference learning algorithm (IL) has close connections to neurobiological models of cortical function.
IL matches the performance of BP on supervised learning and auto-associative tasks.
- Score: 1.433758865948252
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Backpropagation (BP) is the most successful and widely used algorithm in deep
learning. However, the computations required by BP are challenging to reconcile
with known neurobiology. This difficulty has stimulated interest in more
biologically plausible alternatives to BP. One such algorithm is the inference
learning algorithm (IL). IL has close connections to neurobiological models of
cortical function and has achieved equal performance to BP on supervised
learning and auto-associative tasks. In contrast to BP, however, the
mathematical foundations of IL are not well-understood. Here, we develop a
novel theoretical framework for IL. Our main result is that IL closely
approximates an optimization method known as implicit stochastic gradient
descent (implicit SGD), which is distinct from the explicit SGD implemented by
BP. Our results further show how the standard implementation of IL can be
altered to better approximate implicit SGD. Our novel implementation
considerably improves the stability of IL across learning rates, which is
consistent with our theory, as a key property of implicit SGD is its stability.
We provide extensive simulation results that further support our theoretical
interpretations and also demonstrate that IL achieves faster convergence when
trained with small mini-batches while matching the performance of BP for large
mini-batches.
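To make the explicit/implicit distinction concrete, the following is a minimal numerical sketch (not taken from the paper; the quadratic loss, curvature `a`, optimum `w_star`, and learning rate are illustrative assumptions). It contrasts an explicit SGD step, which evaluates the gradient at the current parameters, with an implicit (proximal) SGD step, which evaluates it at the updated parameters, and it exhibits the stability property the abstract attributes to implicit SGD: at a step size where the explicit update diverges, the implicit update still converges.

```python
# Minimal sketch (not from the paper): explicit vs. implicit SGD on a
# hypothetical 1-D quadratic loss L(w) = 0.5 * a * (w - w_star)**2.
a, w_star = 4.0, 1.0   # illustrative curvature and optimum
lr = 1.0               # deliberately large step size (lr * a > 2)

def explicit_step(w):
    # Explicit SGD: gradient evaluated at the current point,
    # w_{t+1} = w_t - lr * L'(w_t).  Unstable here because |1 - lr * a| > 1.
    return w - lr * a * (w - w_star)

def implicit_step(w):
    # Implicit SGD (proximal update): gradient evaluated at the *next* point,
    # w_{t+1} = w_t - lr * L'(w_{t+1}).  For a quadratic loss this has the
    # closed form below and is stable for any positive learning rate.
    return (w + lr * a * w_star) / (1.0 + lr * a)

w_exp = w_imp = 5.0
for _ in range(10):
    w_exp, w_imp = explicit_step(w_exp), implicit_step(w_imp)

print(f"explicit SGD after 10 steps: {w_exp:.3e}")  # diverges (~2.4e+05)
print(f"implicit SGD after 10 steps: {w_imp:.6f}")  # approaches w_star = 1
```

For general losses the implicit update has no closed form; per the abstract, IL closely approximates this kind of implicit update rather than the explicit one implemented by BP.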
Related papers
- Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation [70.43845294145714]
Reducing the reliance of neural network training on global back-propagation (BP) has emerged as a notable research topic.
We propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules.
Our method can be integrated into both local-BP and BP-free settings.
arXiv Detail & Related papers (2024-06-07T19:10:31Z) - Understanding and Improving Optimization in Predictive Coding Networks [1.6114012813668934]
The inference learning algorithm (IL) is a promising, biologically plausible alternative to backpropagation (BP).
IL is computationally demanding, and without memory-intensive optimizers like Adam, IL may converge to poor local minima.
IL can reduce loss more quickly than BP, but the reasons for these speedups and their robustness remain unclear.
arXiv Detail & Related papers (2023-05-23T00:32:26Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on BP optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block and does not require the generation of additional negative samples.
In our framework, each block can be trained independently, so it can be easily deployed on parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - Asymptotically Unbiased Instance-wise Regularized Partial AUC
Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization.
arXiv Detail & Related papers (2022-10-08T08:26:22Z) - Towards Scaling Difference Target Propagation by Learning Backprop
Targets [64.90165892557776]
Difference Target Propagation (DTP) is a biologically plausible learning algorithm closely related to Gauss-Newton (GN) optimization.
We propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored.
We report the best performance ever achieved by DTP on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2022-01-31T18:20:43Z) - A Theoretical View of Linear Backpropagation and Its Convergence [55.69505060636719]
Backpropagation (BP) is widely used for calculating gradients in deep neural networks (DNNs).
Recently, a linear variant of BP named LinBP was introduced for generating more transferable adversarial examples for performing black-box attacks.
We provide theoretical analyses on LinBP in neural-network-involved learning tasks, including adversarial attack and model training.
arXiv Detail & Related papers (2021-12-21T07:18:00Z) - Predictive Coding Can Do Exact Backpropagation on Any Neural Network [40.51949948934705]
We generalize IL and Z-IL by directly defining them on computational graphs.
This is the first biologically plausible algorithm shown to be equivalent to BP in how it updates parameters on any neural network.
arXiv Detail & Related papers (2021-03-08T11:52:51Z) - Predictive Coding Can Do Exact Backpropagation on Convolutional and
Recurrent Neural Networks [40.51949948934705]
Predictive coding networks (PCNs) are an influential model for information processing in the brain.
BP is commonly regarded as the most successful learning method in modern machine learning.
We show that a biologically plausible algorithm is able to exactly replicate the accuracy of BP on complex architectures.
arXiv Detail & Related papers (2021-03-05T14:57:01Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
arXiv Detail & Related papers (2020-06-25T12:07:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.