Dependence of Equilibrium Propagation Training Success on Network Architecture
- URL: http://arxiv.org/abs/2601.21945v1
- Date: Thu, 29 Jan 2026 16:29:31 GMT
- Title: Dependence of Equilibrium Propagation Training Success on Network Architecture
- Authors: Qingshan Wang, Clara C. Wanjura, Florian Marquardt
- Abstract summary: We investigate the performance of the widespread physics-based training method of equilibrium propagation for locally connected lattices. Our results show that sparse networks with only local connections can achieve performance comparable to dense networks.
- Score: 2.0123602216287027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid rise of artificial intelligence has led to an unsustainable growth in energy consumption. This has motivated progress in neuromorphic computing and physics-based training of learning machines as alternatives to digital neural networks. Many theoretical studies focus on simple architectures like all-to-all or densely connected layered networks. However, these may be challenging to realize experimentally, e.g. due to connectivity constraints. In this work, we investigate the performance of the widespread physics-based training method of equilibrium propagation for more realistic architectural choices, specifically, locally connected lattices. We train an XY model and explore the influence of architecture on various benchmark tasks, tracking the evolution of spatially distributed responses and couplings during training. Our results show that sparse networks with only local connections can achieve performance comparable to dense networks. Our findings provide guidelines for further scaling up architectures based on equilibrium propagation in realistic settings.
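The training scheme the abstract describes is simple enough to sketch end to end. Below is a minimal, hypothetical illustration (not the authors' code) of equilibrium propagation on an XY model with nearest-neighbour couplings: a free relaxation, a weakly nudged relaxation, and a purely local coupling update proportional to the contrast between the two phases. The lattice size, nudging strength beta, and quadratic output cost are assumptions.
```python
# Hypothetical sketch of equilibrium propagation (EP) on an XY spin lattice
# with nearest-neighbour couplings only; not the authors' code.
import numpy as np

rng = np.random.default_rng(0)
side = 4
N = side * side                           # spins on a 4x4 lattice, flattened

# Symmetric nearest-neighbour coupling matrix J (local connectivity only).
J = np.zeros((N, N))
for i in range(N):
    r, c = divmod(i, side)
    for dr, dc in ((0, 1), (1, 0)):
        if r + dr < side and c + dc < side:
            j = (r + dr) * side + (c + dc)
            J[i, j] = J[j, i] = rng.normal(scale=0.1)

inputs, outputs = [0, 1], [N - 2, N - 1]  # clamped inputs, read-out spins

def energy_grad(theta):
    """dE/dtheta_i for the XY energy E = -sum_ij J_ij cos(theta_i - theta_j)."""
    return (J * np.sin(theta[:, None] - theta[None, :])).sum(axis=1)

def relax(theta, target=None, beta=0.0, steps=200, dt=0.05):
    """Let the physical system settle: gradient flow on the (nudged) energy."""
    theta = theta.copy()
    for _ in range(steps):
        force = -energy_grad(theta)
        if beta > 0.0:                    # nudge outputs toward the target
            force[outputs] -= beta * (theta[outputs] - target)
        force[inputs] = 0.0               # inputs stay clamped
        theta += dt * force
    return theta

def ep_step(theta0, target, beta=0.1, lr=0.05):
    """Free phase, nudged phase, then a local contrastive coupling update."""
    free = relax(theta0)
    nudged = relax(free, target, beta)
    # dE/dJ_ij = -cos(theta_i - theta_j), so the EP update only needs the
    # two spins at each existing edge -- it is local, like the hardware.
    contrast = np.cos(nudged[:, None] - nudged[None, :]) \
             - np.cos(free[:, None] - free[None, :])
    return (lr / beta) * contrast * (J != 0)

theta0 = rng.uniform(-np.pi, np.pi, N)
theta0[inputs] = [0.5, -0.5]              # encode a toy input
target = np.array([1.0, -1.0])            # toy target for the output spins
for _ in range(50):
    J += ep_step(theta0, target)
print("output phases after training:", relax(theta0)[outputs])
```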
Related papers
- Machine Learning and Control: Foundations, Advances, and Perspectives [0.0]
We show that concepts such as simultaneous and ensemble controllability offer new insights into the classification and representation properties of deep neural networks. We also explore the relationship between dynamic and static neural networks, where depth is traded for width. We describe how classical properties of diffusion processes, long established in the context of partial differential equations, contribute to explaining the success of modern generative artificial intelligence.
arXiv Detail & Related papers (2025-09-30T10:47:26Z)
- Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning [57.3885832382455]
We show that introducing static network sparsity alone can unlock further scaling potential beyond dense counterparts with state-of-the-art architectures. Our analysis reveals that, in contrast to naively scaling up dense DRL networks, such sparse networks achieve higher parameter efficiency for network expressivity.
arXiv Detail & Related papers (2025-06-20T17:54:24Z)
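The static-sparsity recipe summarised above can be stated concretely. A minimal sketch, assuming a fixed random mask drawn once at initialisation (the 90% sparsity level and layer sizes are illustrative, not from the paper):
```python
# Sketch of static network sparsity: a binary mask is drawn once and
# re-applied after every update, so pruned weights never revive.
import numpy as np

rng = np.random.default_rng(1)

def sparse_layer(n_in, n_out, sparsity=0.9):
    w = rng.normal(scale=np.sqrt(2.0 / n_in), size=(n_in, n_out))
    mask = (rng.random((n_in, n_out)) > sparsity).astype(w.dtype)
    return w * mask, mask

def apply_update(w, grad, mask, lr=1e-3):
    # Gradients on pruned weights are discarded: the topology stays static.
    return (w - lr * grad) * mask

w, mask = sparse_layer(256, 256)
w = apply_update(w, rng.normal(size=w.shape), mask)
print(f"fraction of active weights: {(w != 0).mean():.2f}")  # ~0.10
```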
- Auto-Compressing Networks [51.221103189527014]
We introduce Auto-Compressing Networks (ACNs), an architectural variant where long feedforward connections from each layer replace traditional short residual connections. We show that ACNs exhibit enhanced noise robustness compared to residual networks, superior performance in low-data settings, and mitigation of catastrophic forgetting. These findings establish ACNs as a practical approach to developing efficient neural architectures.
arXiv Detail & Related papers (2025-06-11T13:26:09Z)
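The architectural change described in this entry is small enough to sketch. A hedged illustration, where the summation read-out, layer sizes, and tanh body are assumptions rather than the paper's specification:
```python
# Sketch contrasting short residual skips with ACN-style long feedforward
# connections that send every layer's activation straight to the output.
import numpy as np

rng = np.random.default_rng(2)
depth, width = 4, 32
Ws = [rng.normal(scale=width ** -0.5, size=(width, width)) for _ in range(depth)]

def residual_forward(x):
    for W in Ws:
        x = x + np.tanh(x @ W)            # short skip: layer k feeds layer k+1
    return x

def acn_style_forward(x):
    h, out = x, np.zeros_like(x)
    for W in Ws:
        h = np.tanh(h @ W)                # plain feedforward body
        out = out + h                     # long skip: every layer feeds the output
    return out

x = rng.normal(size=(1, width))
print(residual_forward(x).shape, acn_style_forward(x).shape)
```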
- Neuromorphic on-chip reservoir computing with spiking neural network architectures [0.562479170374811]
Reservoir computing is a promising approach for harnessing the computational power of recurrent neural networks.
This paper investigates the application of integrate-and-fire neurons within reservoir computing frameworks for two distinct tasks.
We study reservoir computing performance using a custom integrate-and-fire code, Intel's Lava neuromorphic computing software framework, and an on-chip implementation on Loihi.
arXiv Detail & Related papers (2024-07-30T05:05:09Z)
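A minimal sketch of the integrate-and-fire reservoir setup this entry describes, assuming leaky integrate-and-fire dynamics and a low-pass-filtered spike-rate state; network size, time constants, and input are illustrative:
```python
# Sketch of a leaky integrate-and-fire (LIF) reservoir: a fixed random
# recurrent spiking network; only a linear read-out on its state is trained.
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 200
W_rec = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))  # fixed, untrained
W_in = rng.normal(scale=0.5, size=N)

def run_reservoir(u, v_th=1.0, tau=20.0, dt=1.0):
    v, rate = np.zeros(N), np.zeros(N)
    for t in range(T):
        spikes = (v >= v_th).astype(float)
        v = np.where(spikes > 0, 0.0, v)            # reset after a spike
        v = v + dt / tau * (-v) + W_rec @ spikes + W_in * u[t] * dt
        rate += (spikes - rate) * dt / tau          # low-pass filtered spikes
    return rate                                     # reservoir state

u = np.sin(np.linspace(0.0, 8.0 * np.pi, T))
state = run_reservoir(u)   # a ridge regression on `state` would be the read-out
```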
- Dynamical stability and chaos in artificial neural network trajectories along training [3.379574469735166]
We study the dynamical properties of the training process by analyzing, through the lens of dynamical systems theory, the network trajectories of a shallow neural network.
We find hints of regular and chaotic behavior depending on the learning rate regime.
This work also contributes to the cross-fertilization of ideas between dynamical systems theory, network theory and machine learning.
arXiv Detail & Related papers (2024-04-08T17:33:11Z)
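One standard way to probe the regular-versus-chaotic distinction this entry mentions is to track the divergence of nearby training trajectories. A hedged sketch (the toy task, network, and learning rate are assumptions, not the paper's setup):
```python
# Sketch of one way to probe regular vs. chaotic training dynamics: track the
# separation of two SGD runs whose initial weights differ by 1e-8. Roughly
# exponential growth of the gap hints at a chaotic learning-rate regime.
import numpy as np

rng = np.random.default_rng(4)
X, y = rng.normal(size=(64, 8)), rng.normal(size=(64, 1))

def sgd_step(W, lr):
    h = np.tanh(X @ W)
    grad = X.T @ ((h - y) * (1.0 - h ** 2)) / len(X)  # dMSE/dW, tanh output
    return W - lr * grad

W1 = rng.normal(scale=0.5, size=(8, 1))
W2 = W1 + 1e-8 * rng.normal(size=W1.shape)
for step in range(501):
    W1, W2 = sgd_step(W1, lr=0.5), sgd_step(W2, lr=0.5)
    if step % 100 == 0:
        print(step, np.log(np.linalg.norm(W1 - W2)))  # slope ~ Lyapunov exponent
```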
- Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
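A paraphrased sketch of the general idea summarised above: a scalar reward is redistributed onto neurons in proportion to their contributions (an LRP-style decomposition), and each weight is nudged by the relevance it carried. This is not the paper's exact rule; the redistribution scheme, sizes, and learning rate are all assumptions.
```python
# Paraphrased sketch of the LFP idea: decompose a scalar reward onto neurons
# in proportion to their contributions, then nudge each weight by the
# relevance it carried. Not the paper's exact rule.
import numpy as np

rng = np.random.default_rng(5)
W1 = rng.normal(scale=0.3, size=(8, 16))
W2 = rng.normal(scale=0.3, size=(16, 1))

def lfp_step(x, reward, lr=0.1, eps=1e-9):
    h = np.maximum(x @ W1, 0.0)                   # ReLU hidden layer
    c2 = h * W2[:, 0]                             # each hidden unit's share of y
    R2 = reward * c2 / (c2.sum() + eps)           # reward decomposed onto neurons
    c1 = x[:, None] * W1                          # each weight's share of h
    R1 = R2[None, :] * c1 / (c1.sum(axis=0, keepdims=True) + eps)
    W2[:, 0] += lr * R2                           # greedy: strengthen helpful,
    W1[:, :] += lr * R1                           # weaken harmful (negative R)
    return h @ W2

x = rng.normal(size=8)
for _ in range(10):
    y = lfp_step(x, reward=1.0)                   # no backprop of a loss gradient
```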
- The Neural Race Reduction: Dynamics of Abstraction in Gated Networks [12.130628846129973]
We introduce the Gated Deep Linear Network framework that schematizes how pathways of information flow impact learning dynamics.
We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning.
Our work gives rise to general hypotheses relating neural architecture to learning and provides a mathematical approach towards understanding the design of more complex architectures.
arXiv Detail & Related papers (2022-07-21T12:01:03Z)
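The framework named in this entry is easy to illustrate at the forward-pass level. A hedged sketch, where the fixed binary gates and layer sizes are assumptions:
```python
# Sketch of a gated deep linear network: units are linear, and fixed binary
# gates route information along context-dependent pathways, which is what
# makes the learning dynamics analytically tractable.
import numpy as np

rng = np.random.default_rng(6)
W1 = rng.normal(scale=0.1, size=(16, 8))
W2 = rng.normal(scale=0.1, size=(4, 16))

def gdln_forward(x, g_hidden, g_out):
    h = g_hidden * (W1 @ x)               # gating is the only nonlinearity
    return g_out * (W2 @ h)

# Two contexts route the same input through complementary linear pathways.
g_a = (rng.random(16) < 0.5).astype(float)
g_b = 1.0 - g_a
x = rng.normal(size=8)
print(gdln_forward(x, g_a, np.ones(4)))
print(gdln_forward(x, g_b, np.ones(4)))
```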
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we show that dynamically adapting network architectures tailored to each domain task, together with weight finetuning, brings benefits in both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
- Deep learning of contagion dynamics on complex networks [0.0]
We propose a complementary approach based on deep learning to build effective models of contagion dynamics on networks.
By allowing simulations on arbitrary network structures, our approach makes it possible to explore the properties of the learned dynamics beyond the training data.
Our results demonstrate how deep learning offers a new and complementary perspective to build effective models of contagion dynamics on networks.
arXiv Detail & Related papers (2020-06-09T17:18:34Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
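A hedged sketch of the kind of predictor this entry describes: an architecture is treated as a DAG whose nodes are operations, and a graph convolution followed by a mean read-out regresses a scalar performance estimate. The propagation rule is the standard normalised-adjacency GCN; the sizes and toy DAG are illustrative.
```python
# Sketch of a GCN-style performance predictor for neural architectures.
import numpy as np

rng = np.random.default_rng(7)
n_ops, d = 5, 8                            # 5 candidate operation types
W_gc = rng.normal(scale=0.3, size=(n_ops, d))
w_out = rng.normal(scale=0.3, size=d)

def predict_accuracy(adj, op_ids):
    A = adj + adj.T + np.eye(len(adj))     # undirected view with self-loops
    deg = A.sum(axis=1)
    A_hat = A / np.sqrt(np.outer(deg, deg))          # symmetric normalisation
    X = np.eye(n_ops)[op_ids]              # one-hot operation features
    H = np.maximum(A_hat @ X @ W_gc, 0.0)  # one GCN layer with ReLU
    return float(H.mean(axis=0) @ w_out)   # graph read-out -> predicted score

adj = np.array([[0, 1, 1, 0],              # toy 4-node architecture DAG
                [0, 0, 0, 1],
                [0, 0, 0, 1],
                [0, 0, 0, 0]], dtype=float)
print(predict_accuracy(adj, op_ids=[0, 2, 3, 1]))
```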
- Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)