Towards Scaling Deep Neural Networks with Predictive Coding: Theory and Practice
- URL: http://arxiv.org/abs/2510.23323v2
- Date: Wed, 29 Oct 2025 16:59:39 GMT
- Title: Towards Scaling Deep Neural Networks with Predictive Coding: Theory and Practice
- Authors: Francesco Innocenti
- Abstract summary: Backpropagation (BP) is the standard algorithm for training the deep neural networks that power modern artificial intelligence. This thesis studies an alternative, potentially more efficient brain-inspired algorithm called predictive coding (PC).
- Score: 1.2691047660244335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Backpropagation (BP) is the standard algorithm for training the deep neural networks that power modern artificial intelligence including large language models. However, BP is energy inefficient and unlikely to be implemented by the brain. This thesis studies an alternative, potentially more efficient brain-inspired algorithm called predictive coding (PC). Unlike BP, PC networks (PCNs) perform inference by iterative equilibration of neuron activities before learning or weight updates. Recent work has suggested that this iterative inference procedure provides a range of benefits over BP, such as faster training. However, these advantages have not been consistently observed, the inference and learning dynamics of PCNs are still poorly understood, and deep PCNs remain practically untrainable. Here, we make significant progress towards scaling PCNs by taking a theoretical approach grounded in optimisation theory. First, we show that the learning dynamics of PC can be understood as an approximate trust-region method using second-order information, despite explicitly using only first-order local updates. Second, going beyond this approximation, we show that PC can in principle make use of arbitrarily higher-order information, such that for feedforward networks the effective landscape on which PC learns is far more benign and robust to vanishing gradients than the (mean squared error) loss landscape. Third, motivated by a study of the inference dynamics of PCNs, we propose a new parameterisation called "$\mu$PC", which for the first time allows stable training of 100+ layer networks with little tuning and competitive performance on simple tasks. Overall, this thesis significantly advances our fundamental understanding of the inference and learning dynamics of PCNs, while highlighting the need for future research to focus on hardware co-design if PC is to compete with BP at scale.
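To make the abstract's "inference before learning" procedure concrete: in a supervised PCN with activities $x_0, \dots, x_L$ and weights $W_1, \dots, W_L$, inference runs gradient descent on the energy $\mathcal{F} = \tfrac{1}{2}\sum_{l=1}^{L} \|x_l - W_l \phi(x_{l-1})\|^2$ with the input $x_0$ and target $x_L$ clamped; only once the activities have (approximately) equilibrated are the weights updated using the purely local errors $\varepsilon_l = x_l - W_l \phi(x_{l-1})$. The sketch below illustrates this loop in plain NumPy; the tanh nonlinearity, layer sizes, step sizes, and the function name pc_train_step are all illustrative assumptions, not the thesis's implementation (in particular, it omits the $\mu$PC parameterisation that enables the 100+ layer results).

```python
import numpy as np

def pc_train_step(weights, x_in, y_target, n_infer=20, lr_x=0.1, lr_w=1e-3):
    """One PC step: equilibrate activities, then update weights locally.

    Minimal illustrative sketch only; hyperparameters are arbitrary.
    """
    L = len(weights)
    # Initialise activities with a feedforward sweep, then clamp the output.
    xs = [x_in]
    for W in weights:
        xs.append(W @ np.tanh(xs[-1]))
    xs[-1] = y_target

    # Inference: gradient descent on the energy
    #   F = 0.5 * sum_l ||x_l - W_l tanh(x_{l-1})||^2
    # with respect to the hidden activities (input and output stay clamped).
    for _ in range(n_infer):
        eps = [xs[l + 1] - weights[l] @ np.tanh(xs[l]) for l in range(L)]
        for l in range(1, L):  # hidden layers only
            dF = eps[l - 1] - (weights[l].T @ eps[l]) * (1.0 - np.tanh(xs[l]) ** 2)
            xs[l] = xs[l] - lr_x * dF

    # Learning: local, Hebbian-like weight updates at the equilibrium.
    eps = [xs[l + 1] - weights[l] @ np.tanh(xs[l]) for l in range(L)]
    for l in range(L):
        weights[l] = weights[l] + lr_w * np.outer(eps[l], np.tanh(xs[l]))
    return weights

# Illustrative usage on random data (sizes and seed are arbitrary).
rng = np.random.default_rng(0)
sizes = [4, 16, 16, 2]
weights = [rng.normal(0.0, sizes[i] ** -0.5, (sizes[i + 1], sizes[i]))
           for i in range(len(sizes) - 1)]
weights = pc_train_step(weights, rng.normal(size=4), np.array([1.0, 0.0]))
```

Note that each weight update depends only on that layer's equilibrated error and presynaptic activity, which is what makes the rule local; the thesis's trust-region result is that this equilibration nonetheless endows the update with second-order information about the loss.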
Related papers
- On the Infinite Width and Depth Limits of Predictive Coding Networks [8.779034498638826]
Predictive coding (PC) is a biologically plausible alternative to standard backpropagation (BP). Recent work has improved the training stability of deep PC networks. We study the infinite width and depth limits of PCNs.
arXiv Detail & Related papers (2026-02-07T20:47:32Z) - Tight Stability, Convergence, and Robustness Bounds for Predictive Coding Networks [60.3634789164648]
Energy-based learning algorithms, such as predictive coding (PC), have garnered significant attention in the machine learning community.
We rigorously analyze the stability, robustness, and convergence of PC through the lens of dynamical systems theory.
arXiv Detail & Related papers (2024-10-07T02:57:26Z) - Predictive Coding Networks and Inference Learning: Tutorial and Survey [0.7510165488300368]
Predictive coding networks (PCNs) are based on the neuroscientific framework of predictive coding.
Unlike traditional neural networks trained with backpropagation (BP), PCNs utilize inference learning (IL), a more biologically plausible algorithm.
As inherently probabilistic (graphical) latent variable models, PCNs provide a versatile framework for both supervised learning and unsupervised (generative) modeling.
arXiv Detail & Related papers (2024-07-04T18:39:20Z) - Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural-network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. The method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - Brain-inspired Computational Intelligence via Predictive Coding [73.42407863671565]
Predictive coding (PC) has shown promising properties that make it potentially valuable for the machine learning community. PC-like algorithms are beginning to appear across multiple sub-fields of machine learning and AI at large.
arXiv Detail & Related papers (2023-08-15T16:37:16Z) - Understanding Predictive Coding as an Adaptive Trust-Region Method [0.0]
We develop a theory of PC as an adaptive trust-region (TR) algorithm that uses second-order information.
We show that the learning dynamics of PC can be interpreted as interpolating between BP's loss gradient direction and a TR direction found by the PC inference dynamics.
arXiv Detail & Related papers (2023-05-29T16:25:55Z) - A Theoretical Framework for Inference and Learning in Predictive Coding Networks [41.58529335439799]
Predictive coding (PC) is an influential theory in computational neuroscience.
We provide a comprehensive theoretical analysis of the properties of PCNs trained with prospective configuration.
arXiv Detail & Related papers (2022-07-21T04:17:55Z) - Credit Assignment in Neural Networks through Deep Feedback Control [59.14935871979047]
Deep Feedback Control (DFC) is a new learning method that uses a feedback controller to drive a deep neural network to match a desired output target; the resulting control signal can be used for credit assignment.
The resulting learning rule is fully local in space and time and approximates Gauss-Newton optimization for a wide range of connectivity patterns.
To further underline its biological plausibility, we relate DFC to a multi-compartment model of cortical pyramidal neurons with a local voltage-dependent synaptic plasticity rule, consistent with recent theories of dendritic processing.
arXiv Detail & Related papers (2021-06-15T05:30:17Z) - Predictive Coding Can Do Exact Backpropagation on Convolutional and Recurrent Neural Networks [40.51949948934705]
Predictive coding networks (PCNs) are an influential model for information processing in the brain.
BP is commonly regarded as the most successful learning method in modern machine learning.
We show that a biologically plausible algorithm is able to exactly replicate the accuracy of BP on complex architectures.
arXiv Detail & Related papers (2021-03-05T14:57:01Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
arXiv Detail & Related papers (2020-06-25T12:07:06Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.