Activation Relaxation: A Local Dynamical Approximation to
Backpropagation in the Brain
- URL: http://arxiv.org/abs/2009.05359v5
- Date: Sat, 10 Oct 2020 14:16:15 GMT
- Title: Activation Relaxation: A Local Dynamical Approximation to
Backpropagation in the Brain
- Authors: Beren Millidge, Alexander Tschantz, Anil K Seth, Christopher L Buckley
- Abstract summary: Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
- Score: 62.997667081978825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The backpropagation of error algorithm (backprop) has been instrumental in
the recent success of deep learning. However, a key question remains as to
whether backprop can be formulated in a manner suitable for implementation in
neural circuitry. The primary challenge is to ensure that any candidate
formulation uses only local information, rather than relying on global signals
as in standard backprop. Recently several algorithms for approximating backprop
using only local signals have been proposed. However, these algorithms
typically impose other requirements which challenge biological plausibility:
for example, requiring complex and precise connectivity schemes, or multiple
sequential backwards phases with information being stored across phases. Here,
we propose a novel algorithm, Activation Relaxation (AR), which is motivated by
constructing the backpropagation gradient as the equilibrium point of a
dynamical system. Our algorithm converges rapidly and robustly to the correct
backpropagation gradients, requires only a single type of computational unit,
utilises only a single parallel backwards relaxation phase, and can operate on
arbitrary computation graphs. We illustrate these properties by training deep
neural networks on visual classification tasks, and describe simplifications to
the algorithm which remove further obstacles to neurobiological implementation
(for example, the weight-transport problem, and the use of nonlinear
derivatives), while preserving performance.
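The equilibrium idea in the abstract can be illustrated with a small numerical sketch. The abstract does not state the update rule, so the dynamics below are an assumption: each layer's state leakily integrates toward the signal from the layer above, passed through the transposed weights and the activation derivative, so that its fixed point satisfies the chain-rule recursion. The function names, the tanh network, the step size `dt`, and the layer sizes are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def forward(weights, x0):
    """Forward pass of a tanh MLP; returns activations and pre-activations."""
    acts, pres = [x0], []
    for W in weights:
        pres.append(W @ acts[-1])
        acts.append(np.tanh(pres[-1]))
    return acts, pres

def tanh_prime(z):
    return 1.0 - np.tanh(z) ** 2

def backprop_grads(weights, pres, dL_dout):
    """Reference: exact dL/d(activation) per layer via the chain rule."""
    grads = [dL_dout]
    for W, z in zip(reversed(weights), reversed(pres)):
        grads.append(W.T @ (grads[-1] * tanh_prime(z)))
    return grads[::-1]  # grads[l] pairs with acts[l]

def relax_grads(weights, acts, pres, dL_dout, steps=200, dt=0.2):
    """Assumed relaxation dynamics: all layers update in parallel,
    each using only the current state of the layer directly above."""
    g = [a.copy() for a in acts]  # start from the forward activations
    g[-1] = dL_dout               # output layer is clamped to the loss gradient
    for _ in range(steps):
        new = list(g)
        for l, (W, z) in enumerate(zip(weights, pres)):
            target = W.T @ (g[l + 1] * tanh_prime(z))
            new[l] = g[l] + dt * (target - g[l])  # Euler step of dg/dt = target - g
        g = new
    return g

# Hypothetical 4-5-3-2 network with a squared-error loss.
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.5, size=s) for s in [(5, 4), (3, 5), (2, 3)]]
x0, label = rng.normal(size=4), np.array([1.0, -1.0])

acts, pres = forward(weights, x0)
dL_dout = acts[-1] - label  # gradient of 0.5 * ||out - label||^2
bp = backprop_grads(weights, pres, dL_dout)
ar = relax_grads(weights, acts, pres, dL_dout)
max_err = max(np.max(np.abs(a - b)) for a, b in zip(ar, bp))
print("largest deviation from backprop:", max_err)
```

In this sketch the relaxed states agree with the chain-rule gradients because the update for each layer depends only on the layer above it, so errors decay layer by layer from the output downwards. Weight gradients would then follow from these states and the stored forward activations.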
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks [15.074950361970194]
We provide a unified analysis for a family of algorithms that encompasses IRLS, the recently proposed linlin-RFM algorithm, and the alternating diagonal neural networks.
We show that, with appropriately chosen reweighting policy, a handful of sparse structures can achieve favorable performance.
We also show that leveraging this in the reweighting scheme provably improves test error compared to coordinate-wise reweighting.
arXiv Detail & Related papers (2024-06-04T20:37:17Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like FF, does not rely on BP optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block and does not require the generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Revisiting Recursive Least Squares for Training Deep Neural Networks [10.44340837533087]
Recursive least squares (RLS) algorithms were once widely used for training small-scale neural networks, due to their fast convergence.
Previous RLS algorithms are unsuitable for training deep neural networks (DNNs), since they have high computational complexity and too many preconditions.
We propose three novel RLS optimization algorithms for training feedforward neural networks, convolutional neural networks and recurrent neural networks.
arXiv Detail & Related papers (2021-09-07T17:43:51Z)
- Convolutional Sparse Coding Fast Approximation with Application to Seismic Reflectivity Estimation [9.005280130480308]
We propose a faster variant of the classic iterative thresholding algorithm that produces a good approximation of the convolutional sparse code within 2-5 iterations.
The performance of the proposed solution is demonstrated via the seismic inversion problem in both synthetic and real data scenarios.
arXiv Detail & Related papers (2021-06-29T12:19:07Z)
- Investigating the Scalability and Biological Plausibility of the Activation Relaxation Algorithm [62.997667081978825]
The Activation Relaxation (AR) algorithm provides a simple and robust approach for approximating the backpropagation of error algorithm.
We show that the algorithm can be further simplified and made more biologically plausible by introducing a learnable set of backwards weights.
We also investigate whether another biologically implausible assumption of the original AR algorithm -- the frozen feedforward pass -- can be relaxed without damaging performance.
arXiv Detail & Related papers (2020-10-13T08:02:38Z)
- Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning.
Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms.
We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient methods.
arXiv Detail & Related papers (2020-07-01T18:43:32Z)
- Predictive Coding Approximates Backprop along Arbitrary Computation Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
arXiv Detail & Related papers (2020-06-07T15:35:47Z)