Related papers: Generalising E-prop to Deep Networks

URL: http://arxiv.org/abs/2512.24506v1
Date: Tue, 30 Dec 2025 23:10:12 GMT
Title: Generalising E-prop to Deep Networks
Authors: Beren Millidge
Abstract summary: Recurrent networks are typically trained with backpropagation through time (BPTT). BPTT requires storing the history of all states in the network and then replaying them sequentially backwards in time. RTRL proposes a mathematically equivalent alternative where gradient information is propagated forwards in time locally alongside the regular forward pass. E-prop proposes an approximation of RTRL which reduces its complexity to the level of BPTT.
Score: 10.891416812981495
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recurrent networks are typically trained with backpropagation through time (BPTT). However, BPTT requires storing the history of all states in the network and then replaying them sequentially backwards in time. This computation appears extremely implausible for the brain to implement. Real Time Recurrent Learning (RTRL) proposes a mathematically equivalent alternative where gradient information is propagated forwards in time locally alongside the regular forward pass. However, it has significantly greater computational complexity than BPTT, which renders it impractical for large networks. E-prop proposes an approximation of RTRL which reduces its complexity to the level of BPTT while maintaining a purely online forward update which can be implemented by an eligibility trace at each synapse. Works on RTRL and E-prop, however, almost universally investigate learning in a single layer with recurrent dynamics, whereas learning in the brain spans multiple layers and involves hierarchical dynamics in both depth and time. In this mathematical note, we extend the E-prop framework to handle arbitrarily deep networks, deriving a novel recursion relationship across depth which extends the eligibility traces of E-prop to deeper layers. Our results thus demonstrate that an online learning algorithm can perform accurate credit assignment across both time and depth simultaneously, allowing the training of deep recurrent networks without backpropagation through time.
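
To make the idea of an eligibility trace at each synapse concrete, here is a minimal NumPy sketch of an e-prop-style online gradient estimate for a single recurrent layer. It is an illustration under stated assumptions, not the paper's derivation: the leaky tanh rate model, the squared-error loss, and the names `eprop_recurrent_grad`, `W_rec`, `W_in`, `W_out`, and `alpha` are all hypothetical choices made here, and the sketch covers only the single-layer case, not the depth recursion the abstract describes.

```python
import numpy as np


def eprop_recurrent_grad(xs, targets, W_rec, W_in, W_out, alpha=0.9):
    """Online e-prop-style estimate of dLoss/dW_rec for a leaky tanh RNN.

    Assumed toy model (not taken from the paper):
        v_t  = alpha * v_{t-1} + W_rec @ z_{t-1} + W_in @ x_t
        z_t  = tanh(v_t)
        y_t  = W_out @ z_t
        loss = 0.5 * sum_t ||y_t - target_t||^2
    """
    n_hidden = W_rec.shape[0]
    v = np.zeros(n_hidden)
    z = np.zeros(n_hidden)
    eps = np.zeros_like(W_rec)   # per-synapse eligibility vector, index (j, i)
    grad = np.zeros_like(W_rec)
    loss = 0.0
    for x, target in zip(xs, targets):
        # Forward-in-time trace update. Only the local leak path
        # dv_j,t / dv_j,t-1 = alpha is kept; paths through other neurons
        # are dropped. Here z still holds z_{t-1}.
        eps = alpha * eps + z[None, :]
        v = alpha * v + W_rec @ z + W_in @ x
        z = np.tanh(v)
        err = W_out @ z - target
        loss += 0.5 * float(err @ err)
        learning_signal = W_out.T @ err        # dLoss_t / dz_t
        trace = (1.0 - z**2)[:, None] * eps    # eligibility trace e_ji,t
        grad += learning_signal[:, None] * trace
    return grad, loss
```

No state history is stored: the trace `eps` and the running `grad` are updated alongside the forward pass. In this toy model the estimate matches the true gradient when `W_rec` is zero, because no gradient then flows between neurons; for nonzero `W_rec` it drops the cross-neuron recurrent terms and is only an approximation.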

Related papers

When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training [58.25341036646294]
We analytically examine why learning recurrent poles does not provide tangible benefits in data and empirically offer real-time learning scenarios. We show that fixed-pole networks achieve superior performance with lower training complexity, making them more suitable for online real-time tasks.
arXiv Detail & Related papers (2026-02-25T00:15:13Z)
ANCRe: Adaptive Neural Connection Reassignment for Efficient Depth Scaling [57.91760520589592]
Scaling network depth has been a central driver behind the success of modern foundation models. This paper revisits the default mechanism for deepening neural networks, namely residual connections. We introduce adaptive neural connection reassignment (ANCRe), a principled and lightweight framework that parameterizes and learns residual connectivities from the data.
arXiv Detail & Related papers (2026-02-09T18:54:18Z)
Auto-Compressing Networks [51.221103189527014]
We introduce Auto-Compressing Networks (ACNs), an architectural variant where long feedforward connections from each layer replace traditional short residual connections. We show that ACNs exhibit enhanced noise robustness compared to residual networks, superior performance in low-data settings, and mitigate catastrophic forgetting. These findings establish ACNs as a practical approach to developing efficient neural architectures.
arXiv Detail & Related papers (2025-06-11T13:26:09Z)
Fast Training of Recurrent Neural Networks with Stationary State Feedbacks [48.22082789438538]
Recurrent neural networks (RNNs) have recently demonstrated strong performance and faster inference than Transformers. We propose a novel method that replaces BPTT with a fixed gradient feedback mechanism.
arXiv Detail & Related papers (2025-03-29T14:45:52Z)
Real-Time Recurrent Reinforcement Learning [7.737685867200335]
We introduce a biologically plausible RL framework for solving tasks in partially observable Markov decision processes (POMDPs). The proposed algorithm combines three integral parts: (1) a Meta-RL architecture, resembling the mammalian basal ganglia; (2) a biologically plausible reinforcement learning algorithm, exploiting temporal difference learning and eligibility traces to train the policy and the value-function; and (3) an online automatic differentiation algorithm for computing the gradients with respect to parameters of a shared recurrent network backbone.
arXiv Detail & Related papers (2023-11-08T16:56:16Z)
Efficient Real Time Recurrent Learning through combined activity and parameter sparsity [0.5076419064097732]
Backpropagation through time (BPTT) is the standard algorithm for training recurrent neural networks (RNNs). BPTT is unsuited for online learning and presents a challenge for implementation on low-resource real-time systems. We show that recurrent networks exhibiting high activity sparsity can reduce the computational cost of Real-Time Recurrent Learning (RTRL).
arXiv Detail & Related papers (2023-03-10T01:09:04Z)
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks [19.248060562241296]
We propose two constraints that make real-time recurrent learning scalable. We show that by either decomposing the network into independent modules or learning the network in stages, we can make RTRL scale linearly with the number of parameters. We demonstrate the effectiveness of our approach over Truncated-BPTT on a prediction benchmark inspired by animal learning and by doing policy evaluation of pre-trained policies for Atari 2600 games.
arXiv Detail & Related papers (2023-01-20T23:17:48Z)
Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models. Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency. We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
arXiv Detail & Related papers (2022-10-09T07:47:56Z)
Accurate online training of dynamical spiking neural networks through Forward Propagation Through Time [1.8515971640245998]
We show how a recently developed alternative to BPTT can be applied in spiking neural networks. FPTT attempts to minimize an ongoing dynamically regularized risk on the loss. We show that SNNs trained with FPTT outperform online BPTT approximations, and approach or exceed offline BPTT accuracy on temporal classification tasks.
arXiv Detail & Related papers (2021-12-20T13:44:20Z)
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time. We show that convergence to a global minimum is guaranteed for networks with quadratic widths in the sample size and linear in their depth at a time logarithmic in both. Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
Hybrid Backpropagation Parallel Reservoir Networks [8.944918753413827]
We propose a novel hybrid network, which combines the effectiveness of learning random temporal features of reservoirs with the readout power of a deep neural network with batch normalization. We demonstrate that our new network outperforms LSTMs and GRUs, including multi-layer "deep" versions of these networks. We show also that the inclusion of a novel meta-ring structure, which we call HBP-ESN M-Ring, achieves similar performance to one large reservoir while decreasing the memory required by an order of magnitude.
arXiv Detail & Related papers (2020-10-27T21:03:35Z)
Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources. Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.