Scalable Equilibrium Propagation via Intermediate Error Signals for Deep Convolutional CRNNs
- URL: http://arxiv.org/abs/2508.15989v1
- Date: Thu, 21 Aug 2025 22:19:30 GMT
- Title: Scalable Equilibrium Propagation via Intermediate Error Signals for Deep Convolutional CRNNs
- Authors: Jiaqi Lin, Malyaban Bal, Abhronil Sengupta
- Abstract summary: Equilibrium Propagation (EP) is a biologically inspired local learning rule first proposed for convergent recurrent neural networks (CRNNs). EP estimates gradients that closely align with those computed by Backpropagation Through Time (BPTT) while significantly reducing computational demands. We propose a novel EP framework that incorporates intermediate error signals to enhance information flow and convergence of neuron dynamics.
- Score: 17.067785532606724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equilibrium Propagation (EP) is a biologically inspired local learning rule first proposed for convergent recurrent neural networks (CRNNs), in which synaptic updates depend only on neuron states from two distinct phases. EP estimates gradients that closely align with those computed by Backpropagation Through Time (BPTT) while significantly reducing computational demands, positioning it as a potential candidate for on-chip training in neuromorphic architectures. However, prior studies on EP have been constrained to shallow architectures, as deeper networks suffer from the vanishing gradient problem, leading to convergence difficulties in both energy minimization and gradient computation. To address the vanishing gradient problem in deep EP networks, we propose a novel EP framework that incorporates intermediate error signals to enhance information flow and convergence of neuron dynamics. This is the first work to integrate knowledge distillation and local error signals into EP, enabling the training of significantly deeper architectures. Our proposed approach achieves state-of-the-art performance on the CIFAR-10 and CIFAR-100 datasets, showcasing its scalability on deep VGG architectures. These results represent a significant advancement in the scalability of EP, paving the way for its application in real-world systems.
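For readers unfamiliar with the two-phase mechanics the abstract describes, the sketch below is a minimal toy illustration of EP's local contrastive update (NumPy, one hidden layer, prototypical discrete-time dynamics). It is not the paper's deep convolutional framework; the hard-sigmoid activation, layer sizes, nudging strength `beta`, and learning rate are all illustrative assumptions.

```python
# Minimal two-phase Equilibrium Propagation sketch (toy, NumPy).
# Hypothetical throughout: one hidden layer, hard-sigmoid dynamics,
# and all hyperparameters. Not the paper's VGG-scale architecture.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 20, 64, 5
W1 = rng.normal(0.0, 0.1, (n_hid, n_in))
W2 = rng.normal(0.0, 0.1, (n_out, n_hid))

def rho(s):
    # Hard-sigmoid activation, a common choice in EP models.
    return np.clip(s, 0.0, 1.0)

def relax(x, y_target, beta, steps=60):
    """Iterate the neuron dynamics to an approximate fixed point.
    beta = 0 is the free phase; beta > 0 weakly nudges the output
    toward the target (the second phase)."""
    h = np.zeros(n_hid)
    y = np.zeros(n_out)
    for _ in range(steps):
        h = rho(W1 @ x + W2.T @ y)
        y = rho(W2 @ h + beta * (y_target - y))
    return h, y

def ep_step(x, y_target, beta=0.5, lr=0.05):
    """One EP update: purely local, contrasting pre/post activity
    products between the two equilibria (no backpropagated errors)."""
    global W1, W2
    h_free, y_free = relax(x, y_target, beta=0.0)   # phase 1: free
    h_ndg, y_ndg = relax(x, y_target, beta=beta)    # phase 2: nudged
    W2 += (lr / beta) * (np.outer(y_ndg, h_ndg) - np.outer(y_free, h_free))
    W1 += (lr / beta) * (np.outer(h_ndg, x) - np.outer(h_free, x))

# Toy usage: one update toward a one-hot target.
x = rng.normal(size=n_in)
ep_step(x, np.eye(n_out)[2])
```

In a deep stack of such layers, the nudging signal injected at the output decays as it propagates backward through the dynamics, which is the vanishing-gradient issue the abstract describes. The paper's remedy is to inject intermediate error signals at hidden layers via knowledge distillation and local losses; their exact formulation is given in the paper itself.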
Related papers
- Architecture-Optimization Co-Design for Physics-Informed Neural Networks Via Attentive Representations and Conflict-Resolved Gradients [5.447935819547941]
We study PINN training from a unified architecture-optimization perspective. We propose a layer-wise dynamic attention mechanism to enhance representational flexibility. We then reformulate PINN training as a multi-task learning problem and introduce a conflict-resolved gradient update strategy.
arXiv Detail & Related papers (2026-01-19T11:32:25Z)
- Cannistraci-Hebb Training on Ultra-Sparse Spiking Neural Networks [10.30800655748035]
Spiking neural networks (SNNs) inherently possess temporal activation sparsity. Existing methods fail to achieve ultra-sparse network structures without significant performance loss. We propose the Cannistraci-Hebb Spiking Neural Network (CH-SNN), a novel and generalizable dynamic sparse training framework for SNNs.
arXiv Detail & Related papers (2025-11-05T07:59:19Z)
- Geminet: Learning the Duality-based Iterative Process for Lightweight Traffic Engineering in Changing Topologies [53.38648279089736]
Geminet is a lightweight and scalable ML-based TE framework that can handle changing topologies. Its neural network size is only 0.04% to 7% of existing schemes. When trained on large-scale topologies, Geminet consumes under 10 GiB of memory, more than eight times less than the 80-plus GiB required by HARP.
arXiv Detail & Related papers (2025-06-30T09:09:50Z)
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
- Residual resampling-based physics-informed neural network for neutron diffusion equations [7.105073499157097]
The neutron diffusion equation plays a pivotal role in the analysis of nuclear reactors.
Traditional PINN approaches often utilize fully connected network (FCN) architectures.
The proposed R2-PINN (residual resampling-based PINN) effectively overcomes the limitations inherent in current methods, providing more accurate and robust solutions for neutron diffusion equations.
arXiv Detail & Related papers (2024-06-23T13:49:31Z)
- Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation [70.43845294145714]
Relieving the reliance of neural network training on global back-propagation (BP) has emerged as a notable research topic.
We propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules.
Our method can be integrated into both local-BP and BP-free settings.
arXiv Detail & Related papers (2024-06-07T19:10:31Z)
- Neural variational Data Assimilation with Uncertainty Quantification using SPDE priors [28.804041716140194]
Recent advances in the deep learning community make it possible to address data assimilation through a neural architecture embedded in a variational framework. In this work we use the theory of Stochastic Partial Differential Equations (SPDE) and Gaussian Processes (GP) to estimate both the space and time covariance of the state.
arXiv Detail & Related papers (2024-02-02T19:18:12Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- Holomorphic Equilibrium Propagation Computes Exact Gradients Through Finite Size Oscillations [5.279475826661643]
Equilibrium propagation (EP) is an alternative to backpropagation (BP) that allows the training of deep neural networks with local learning rules.
We show analytically that extending EP to holomorphic networks naturally leads to exact gradients even for finite-amplitude teaching signals.
We establish the first benchmark for EP on the ImageNet 32x32 dataset and show that it matches the performance of an equivalent network trained with BP.
arXiv Detail & Related papers (2022-09-01T15:23:49Z)
- Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing its Gradient Estimator Bias [62.43908463620527]
In practice, EP does not scale to visual tasks harder than MNIST.
We show that a bias in the gradient estimate of EP, inherent in the use of finite nudging, is responsible for this phenomenon.
These results highlight EP as a scalable approach to compute error gradients in deep neural networks, thereby motivating its hardware implementation.
arXiv Detail & Related papers (2021-01-14T10:23:40Z)
- Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing its Gradient Estimator Bias [65.13042449121411]
In practice, training a network with the gradient estimates provided by EP does not scale to visual tasks harder than MNIST.
We show that a bias in the gradient estimate of EP, inherent in the use of finite nudging, is responsible for this phenomenon.
We apply these bias-reduction techniques to train an architecture with asymmetric forward and backward connections, yielding a 13.2% test error.
arXiv Detail & Related papers (2020-06-06T09:36:07Z)
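For context on the "finite nudging" bias discussed in the two entries above: the standard one-sided EP estimator contrasts the free equilibrium s^0 with a single nudged equilibrium s^beta, while the bias-reduced variant contrasts two oppositely nudged equilibria. The notation below is schematic and assumed (F the total energy, theta the parameters), not copied from those papers:

```latex
\hat{\nabla}_\theta^{\text{one-sided}}
  = \frac{1}{\beta}\left(
      \frac{\partial F}{\partial \theta}\bigg|_{s^{\beta}}
    - \frac{\partial F}{\partial \theta}\bigg|_{s^{0}}\right),
\qquad
\hat{\nabla}_\theta^{\text{symmetric}}
  = \frac{1}{2\beta}\left(
      \frac{\partial F}{\partial \theta}\bigg|_{s^{\beta}}
    - \frac{\partial F}{\partial \theta}\bigg|_{s^{-\beta}}\right)
```

A Taylor expansion in beta shows the symmetric difference cancels the O(beta) term of the one-sided estimator's bias, leaving O(beta^2), which is the reduction those papers credit for scaling EP beyond MNIST.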