Predictive Coding Approximates Backprop along Arbitrary Computation Graphs
- URL: http://arxiv.org/abs/2006.04182v5
- Date: Mon, 5 Oct 2020 18:11:05 GMT
- Title: Predictive Coding Approximates Backprop along Arbitrary Computation Graphs
- Authors: Beren Millidge, Alexander Tschantz, Christopher L. Buckley
- Abstract summary: We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
- Score: 68.8204255655161
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backpropagation of error (backprop) is a powerful algorithm for training
machine learning architectures through end-to-end differentiation. However,
backprop is often criticised for lacking biological plausibility. Recently, it
has been shown that backprop in multilayer perceptrons (MLPs) can be
approximated using predictive coding, a biologically plausible process theory
of cortical computation which relies only on local and Hebbian updates. The
power of backprop, however, lies not in its instantiation in MLPs, but rather
in the concept of automatic differentiation which allows for the optimisation
of any differentiable program expressed as a computation graph. Here, we
demonstrate that predictive coding converges asymptotically (and in practice
rapidly) to exact backprop gradients on arbitrary computation graphs using only
local learning rules. We apply this result to develop a straightforward
strategy to translate core machine learning architectures into their predictive
coding equivalents. We construct predictive coding CNNs, RNNs, and the more
complex LSTMs, which include a non-layer-like branching internal graph
structure and multiplicative interactions. Our models perform equivalently to
backprop on challenging machine learning benchmarks, while utilising only local
and (mostly) Hebbian plasticity. Our method raises the potential that standard
machine learning algorithms could in principle be directly implemented in
neural circuitry, and may also contribute to the development of completely
distributed neuromorphic architectures.
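As a concrete illustration of the core claim, here is a minimal numpy sketch (not the authors' released code) of supervised predictive coding on a small MLP with a squared-error loss. It uses a fixed-prediction style of inference, freezing the feedforward predictions while the value nodes relax under purely local updates, and then compares the equilibrium prediction errors against backprop's deltas; the layer sizes, activation, and step sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):               # activation function (illustrative choice)
    return np.tanh(x)

def df(x):              # its derivative
    return 1.0 - np.tanh(x) ** 2

# A toy 3-weight-layer MLP with a linear output and MSE loss L = 0.5 * ||y - t||^2.
sizes = [4, 8, 6, 3]
W = [rng.normal(size=(sizes[l + 1], sizes[l])) / np.sqrt(sizes[l]) for l in range(3)]
x0 = rng.normal(size=sizes[0])          # input
t = rng.normal(size=sizes[-1])          # target

# Forward pass: a[l] are post-activations (a[0] is the input), y is the linear output.
a = [x0, f(W[0] @ x0)]
a.append(f(W[1] @ a[1]))
y = W[2] @ a[2]

# Reference: backprop deltas (gradients w.r.t. the pre-activations).
delta = [None, None, y - t]
delta[1] = df(W[1] @ a[1]) * (W[2].T @ delta[2])
delta[0] = df(W[0] @ a[0]) * (W[1].T @ delta[1])

# Predictive coding inference: one value node per pre-activation, with the output node
# clamped to the target and the predictions mu frozen at their feedforward values.
mu = [W[0] @ a[0], W[1] @ a[1], y]
v = [mu[0].copy(), mu[1].copy(), t]
eta = 0.1
for _ in range(300):
    eps = [v[l] - mu[l] for l in range(3)]
    # Each hidden node only uses its own error and the error one layer above (local updates).
    v[0] += eta * (-eps[0] + df(mu[0]) * (W[1].T @ eps[1]))
    v[1] += eta * (-eps[1] + df(mu[1]) * (W[2].T @ eps[2]))
eps = [v[l] - mu[l] for l in range(3)]

# At equilibrium the negated prediction errors match backprop's deltas; the Hebbian weight
# update (error times presynaptic activity) then recovers the backprop weight gradient.
for l in range(3):
    rel = np.linalg.norm(-eps[l] - delta[l]) / np.linalg.norm(delta[l])
    print(f"layer {l}: relative difference from backprop = {rel:.2e}")
```

The printed relative differences shrink toward numerical precision as the relaxation settles, which is the sense in which the prediction errors converge to backprop's gradients.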
Related papers
- pyhgf: A neural network library for predictive coding [0.2150989251218736]
pyhgf is a Python package for creating, manipulating and sampling dynamic networks for predictive coding.
We improve over other frameworks by enclosing the network components as transparent, modular and malleable variables in the message-passing steps.
The transparency of core variables can also translate into inference processes that leverage self-organisation principles.
arXiv Detail & Related papers (2024-10-11T19:21:38Z)
- Predictive Coding beyond Gaussian Distributions [38.51699576854394]
Predictive coding (PC) is a neuroscience-inspired method that performs inference on hierarchical Gaussian generative models.
Such Gaussian models fail to keep up with modern neural networks, as they are unable to replicate the dynamics of complex layers and activation functions.
We show that our method allows us to train transformer networks and achieve a performance comparable with BP on conditional language models.
arXiv Detail & Related papers (2022-11-07T12:02:05Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
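The entry above concerns a provably efficient exploration variant of Q-learning with linear function approximation. For reference, here is a minimal sketch of the standard epsilon-greedy, semi-gradient linear Q-learning baseline that such work builds on; the chain MDP, random feature map, and hyperparameters are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a 5-state chain MDP with a random linear feature map phi(s, a),
# trained with epsilon-greedy, semi-gradient Q-learning on Q(s, a) = w . phi(s, a).
n_states, n_actions, n_features = 5, 2, 4
phi = rng.normal(size=(n_states, n_actions, n_features))
w = np.zeros(n_features)

def q(s, a):
    return phi[s, a] @ w

def step(s, a):
    # action 0 moves left, action 1 moves right; reward for reaching the right end
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    return s_next, float(s_next == n_states - 1)

alpha, gamma, eps_greedy = 0.05, 0.9, 0.1
s = 0
for _ in range(5000):
    if rng.random() < eps_greedy:
        a = int(rng.integers(n_actions))
    else:
        a = int(np.argmax([q(s, b) for b in range(n_actions)]))
    s_next, r = step(s, a)
    td_target = r + gamma * max(q(s_next, b) for b in range(n_actions))
    w += alpha * (td_target - q(s, a)) * phi[s, a]   # semi-gradient TD update
    s = s_next

print("learned weights:", w)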
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Relaxing the Constraints on Predictive Coding Models [62.997667081978825]
Predictive coding is an influential theory of cortical function which posits that the principal computation the brain performs is the minimization of prediction errors.
Standard implementations of the algorithm still involve potentially neurally implausible features such as identical forward and backward weights, backward nonlinear derivatives, and 1-1 error unit connectivity.
In this paper, we show that these features are not integral to the algorithm and can be removed either directly or through learning additional sets of parameters with Hebbian update rules without noticeable harm to learning performance.
arXiv Detail & Related papers (2020-10-02T15:21:37Z)
- Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
arXiv Detail & Related papers (2020-09-11T11:56:34Z)
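The Activation Relaxation entry above constructs the backprop gradient as the equilibrium of a dynamical system. The sketch below illustrates that idea for a two-layer network under simplifying assumptions; it is not the paper's exact dynamics, only a leaky relaxation whose fixed point reproduces the backprop gradient with respect to the hidden activations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch under simplifying assumptions: for a feedforward net, the backprop quantity
# g_l = dL/da_l obeys g_l = W_{l+1}^T (f'(z_{l+1}) * g_{l+1}). A leaky relaxation with the
# top node clamped to dL/da_L has exactly that recursion as its fixed point.
f, df = np.tanh, lambda z: 1.0 - np.tanh(z) ** 2
W1 = rng.normal(size=(7, 5)) / np.sqrt(5)
W2 = rng.normal(size=(3, 7)) / np.sqrt(7)
a0, t = rng.normal(size=5), rng.normal(size=3)

z1 = W1 @ a0; a1 = f(z1)
z2 = W2 @ a1; a2 = f(z2)

g2 = a2 - t                      # dL/da2 for L = 0.5 * ||a2 - t||^2
g1 = W2.T @ (df(z2) * g2)        # reference backprop gradient dL/da1

x1, x2 = np.zeros_like(a1), g2   # x2 clamped; x1 relaxes toward the fixed point
dt = 0.2
for _ in range(200):
    x1 += dt * (-x1 + W2.T @ (df(z2) * x2))

print("relative error vs backprop:", np.linalg.norm(x1 - g1) / np.linalg.norm(g1))
```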
- Randomized Automatic Differentiation [22.95414996614006]
We develop a general framework and approach for randomized automatic differentiation (RAD).
RAD allows unbiased gradient estimates to be computed with reduced memory in exchange for increased variance.
We show that RAD converges in fewer iterations than using a small batch size for feedforward networks, and in a similar number for recurrent networks.
arXiv Detail & Related papers (2020-07-20T19:03:44Z)
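The snippet below illustrates the memory-for-variance idea behind the RAD entry above in the simplest possible setting (a single linear layer with hypothetical sizes, not the paper's general framework): the forward pass is exact, but only a randomly sparsified copy of the activation is stored for the backward pass, giving an unbiased, lower-memory gradient estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal illustration: exact forward pass, randomly sparsified stored activation for the
# backward pass. The gradient estimate is unbiased but higher-variance, and the stored
# tensor is (on average) keep_prob times the original size.
n_in, n_out, keep_prob = 256, 64, 0.25
W = rng.normal(size=(n_out, n_in)) / np.sqrt(n_in)
x = rng.normal(size=n_in)
target = rng.normal(size=n_out)

y = W @ x                              # exact forward pass
delta = y - target                     # dL/dy for L = 0.5 * ||y - target||^2

grad_exact = np.outer(delta, x)        # exact dL/dW

mask = rng.random(n_in) < keep_prob
x_stored = np.where(mask, x / keep_prob, 0.0)   # unbiased sparsification: E[x_stored] = x
grad_rad = np.outer(delta, x_stored)            # unbiased estimate of dL/dW

rel = np.linalg.norm(grad_rad - grad_exact) / np.linalg.norm(grad_exact)
print("relative deviation of the randomized estimate:", rel)
```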