ePC: Overcoming Exponential Signal Decay in Deep Predictive Coding Networks
- URL: http://arxiv.org/abs/2505.20137v3
- Date: Mon, 29 Sep 2025 15:58:40 GMT
- Title: ePC: Overcoming Exponential Signal Decay in Deep Predictive Coding Networks
- Authors: Cédric Goemaere, Gaspard Oliviers, Rafal Bogacz, Thomas Demeester
- Abstract summary: Predictive Coding (PC) offers a biologically plausible alternative to backpropagation for neural network training. This paper identifies the root cause and provides a principled solution.
- Score: 9.400040788307223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predictive Coding (PC) offers a biologically plausible alternative to backpropagation for neural network training, yet struggles with deeper architectures. This paper identifies the root cause and provides a principled solution. We uncover that the canonical state-based formulation of PC (sPC) is, by design, deeply inefficient on digital hardware, due to an inherent signal decay problem that scales exponentially with depth. To address this fundamental limitation, we introduce a novel reparameterization of PC, named error-based PC (ePC), which does not suffer from signal decay. By optimizing over prediction errors rather than states, ePC enables signals to reach all layers simultaneously and unattenuated, converging orders of magnitude faster than sPC. Experiments across multiple architectures and datasets demonstrate that ePC matches backpropagation's performance even for deeper models where sPC struggles. Besides practical improvements, our work provides theoretical insight into PC dynamics and establishes a foundation for scaling bio-inspired learning to deeper architectures on digital hardware and beyond.
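The decay described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a scalar linear chain with a shared weight `w`, a squared-error energy, feedforward initialisation, and an output clamped to a target `y`. The function names `spc_first_arrival` and `epc_first_step` are hypothetical. It compares how large the first update is at each hidden layer when inference runs over states (sPC) versus over errors (ePC).

```python
def spc_first_arrival(depth=5, w=1.0, lr=0.1, x=1.0, y=3.0):
    """State-based PC on a scalar chain with predictions w * s_{i-1}.

    Assumed energy: E = sum_i 0.5 * (s_i - w * s_{i-1})**2, output clamped to y.
    Returns, for hidden layers 1..depth-1, the size of the state update on
    the first inference step at which that layer moves at all.
    """
    # Feedforward initialisation, then clamp the output state to the target.
    s = [x]
    for _ in range(depth):
        s.append(w * s[-1])
    s[depth] = y

    first = {}
    for _ in range(depth):
        # Prediction errors e_i = s_i - w * s_{i-1} (index 0 is unused).
        e = [0.0] + [s[i] - w * s[i - 1] for i in range(1, depth + 1)]
        # dE/ds_i = e_i - w * e_{i+1} for the free (hidden) states.
        grad = {i: e[i] - w * e[i + 1] for i in range(1, depth)}
        for i, g in grad.items():
            if i not in first and abs(g) > 0:
                first[i] = abs(lr * g)
            s[i] -= lr * g  # all gradients were computed before any update
    return [first[i] for i in range(1, depth)]


def epc_first_step(depth=5, w=1.0, lr=0.1, x=1.0, y=3.0):
    """Error-based reparameterisation s_i = w * s_{i-1} + e_i on the same chain.

    Returns the size of the first gradient step on each hidden error e_j.
    """
    e = [0.0] * depth  # hidden errors start at zero (feedforward init)
    s = [x]
    for i in range(1, depth):
        s.append(w * s[-1] + e[i])
    e_out = y - w * s[depth - 1]  # output error against the clamped target
    # dE/de_j = e_j - w**(depth - j) * e_out, available for every j at once.
    return [abs(lr * (e[j] - w ** (depth - j) * e_out)) for j in range(1, depth)]


if __name__ == "__main__":
    print("sPC first update per hidden layer:", spc_first_arrival())
    print("ePC first update per hidden layer:", epc_first_step())
```

With the default settings (output error 2, step size 0.1, `w = 1`), the sPC updates shrink by a factor of the step size for every layer further from the output, roughly 0.2, 0.02, 0.002, 0.0002 going from the last hidden layer to the first, and each layer only starts moving one step later than the layer above it. In the error-based version every hidden layer receives an update of about 0.2 on the very first step. For `w` other than 1 the ePC gradient still carries the usual product of weights along the chain, as backpropagation does; what disappears is the extra per-layer factor of the inference step size.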
Related papers
- Accelerated Predictive Coding Networks via Direct Kolen-Pollack Feedback Alignment [7.328567184271344]
Predictive coding (PC) is a biologically inspired algorithm for training neural networks that relies only on local updates. We propose direct Kolen-Pollack predictive coding (DKP-PC). It simultaneously addresses both feedback delay and exponential decay, yielding a more efficient and scalable variant of PC.
arXiv Detail & Related papers (2026-02-17T13:29:14Z) - Efficient Online Learning with Predictive Coding Networks: Exploiting Temporal Correlations [26.073347035678342]
The Predictive Coding (PC) framework offers a biologically plausible alternative with local, Hebbian-like update rules. We present Predictive Coding Network with Temporal Amortization (PCN-TA), which preserves latent states across temporal frames. Experiments on the COIL-20 robotic perception dataset demonstrate that PCN-TA achieves 10% fewer weight updates compared to backpropagation.
arXiv Detail & Related papers (2025-10-29T22:09:53Z) - Towards Scaling Deep Neural Networks with Predictive Coding: Theory and Practice [1.2691047660244335]
Backpropagation (BP) is the standard algorithm for training the deep neural networks that power modern artificial intelligence. This thesis studies an alternative, potentially more efficient brain-inspired algorithm called predictive coding (PC).
arXiv Detail & Related papers (2025-10-24T14:47:49Z) - Optimal Depth of Neural Networks [2.1756081703276]
This paper introduces a formal theoretical framework for determining the optimal depth of a neural network. We model the layer-by-layer evolution of hidden representations as a sequential decision process. We propose a novel and practical regularization term, $\mathcal{L}_{\rm depth}$, that encourages the network to learn representations amenable to efficient, early exiting.
arXiv Detail & Related papers (2025-06-20T09:26:01Z) - Neural Collapse is Globally Optimal in Deep Regularized ResNets and Transformers [33.441694038617506]
We prove that global optima of deep regularized transformers and residual networks (ResNets) with LayerNorm trained with cross entropy or mean squared error loss are approximately collapsed. Our theoretical results are supported by experiments on computer vision and language datasets showing that, as the depth grows, neural collapse indeed becomes more prominent.
arXiv Detail & Related papers (2025-05-21T08:16:03Z) - Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation [8.35644084613785]
We introduce the maximal update parameterization ($\mu$P) in the infinite-width limit for two representative designs of local targets. By analyzing deep linear networks, we found that PC's gradients interpolate between first-order and Gauss-Newton-like gradients. We demonstrate that, in specific standard settings, PC in the infinite-width limit behaves more similarly to the first-order gradient.
arXiv Detail & Related papers (2024-11-04T11:38:27Z) - Tight Stability, Convergence, and Robustness Bounds for Predictive Coding Networks [60.3634789164648]
Energy-based learning algorithms, such as predictive coding (PC), have garnered significant attention in the machine learning community.
We rigorously analyze the stability, robustness, and convergence of PC through the lens of dynamical systems theory.
arXiv Detail & Related papers (2024-10-07T02:57:26Z) - Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
arXiv Detail & Related papers (2024-09-17T04:48:45Z) - Understanding Predictive Coding as an Adaptive Trust-Region Method [0.0]
We develop a theory of PC as an adaptive trust-region (TR) algorithm that uses second-order information.
We show that the learning dynamics of PC can be interpreted as interpolating between BP's loss gradient direction and a TR direction found by the PC inference dynamics.
arXiv Detail & Related papers (2023-05-29T16:25:55Z) - ETLP: Event-based Three-factor Local Plasticity for online learning with neuromorphic hardware [105.54048699217668]
We show competitive accuracy with a clear advantage in computational complexity for Event-Based Three-factor Local Plasticity (ETLP).
We also show that when using local plasticity, threshold adaptation in spiking neurons and a recurrent topology are necessary to learn temporal patterns with a rich temporal structure.
arXiv Detail & Related papers (2023-01-19T19:45:42Z) - Biologically Plausible Learning on Neuromorphic Hardware Architectures [27.138481022472]
Neuromorphic computing is an emerging paradigm that confronts this imbalance by performing computations directly in analog memories.
This work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa.
arXiv Detail & Related papers (2022-12-29T15:10:59Z) - Chordal Sparsity for SDP-based Neural Network Verification [1.9556053645976446]
We focus on improving semidefinite programming (SDP) based techniques for neural network verification.
By leveraging chordal sparsity, we can decompose the primary computational bottleneck of DeepSDP into an equivalent collection of smaller LMIs.
We show that additional analysis of Chordal-DeepSDP allows us to further rewrite its collection of LMIs in a second level of decomposition.
arXiv Detail & Related papers (2022-06-07T17:57:53Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point, exit point, and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Consistency Training of Multi-exit Architectures for Sensor Data [0.07614628596146598]
We present a novel and architecture-agnostic approach for robust training of multi-exit architectures termed consistent exit training.
We leverage weak supervision to align model output with consistency training and jointly optimize dual-losses in a multi-task learning fashion over the exits in a network.
arXiv Detail & Related papers (2021-09-27T17:11:25Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Credit Assignment in Neural Networks through Deep Feedback Control [59.14935871979047]
Deep Feedback Control (DFC) is a new learning method that uses a feedback controller to drive a deep neural network to match a desired output target and whose control signal can be used for credit assignment.
The resulting learning rule is fully local in space and time and approximates Gauss-Newton optimization for a wide range of connectivity patterns.
To further underline its biological plausibility, we relate DFC to a multi-compartment model of cortical pyramidal neurons with a local voltage-dependent synaptic plasticity rule, consistent with recent theories of dendritic processing.
arXiv Detail & Related papers (2021-06-15T05:30:17Z) - GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training [59.160154997555956]
We present GradInit, an automated and architecture-agnostic method for initializing neural networks.
It is based on a simple heuristic: the variance of each network layer is adjusted so that a single step of SGD or Adam results in the smallest possible loss value.
It also enables training the original Post-LN Transformer for machine translation without learning rate warmup.
arXiv Detail & Related papers (2021-02-16T11:45:35Z) - Phase Retrieval using Expectation Consistent Signal Recovery Algorithm based on Hypernetwork [73.94896986868146]
Phase retrieval is an important component in modern computational imaging systems.
Recent advances in deep learning have opened up a new possibility for robust and fast PR.
We develop a novel framework for deep unfolding to overcome the existing limitations.
arXiv Detail & Related papers (2021-01-12T08:36:23Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
arXiv Detail & Related papers (2020-06-25T12:07:06Z) - Predictive Coding Approximates Backprop along Arbitrary Computation Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
arXiv Detail & Related papers (2020-06-07T15:35:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.