Beyond Backpropagation: Exploring Innovative Algorithms for Energy-Efficient Deep Neural Network Training
- URL: http://arxiv.org/abs/2509.19063v1
- Date: Tue, 23 Sep 2025 14:27:44 GMT
- Title: Beyond Backpropagation: Exploring Innovative Algorithms for Energy-Efficient Deep Neural Network Training
- Authors: Przemysław Spyra
- Abstract summary: The paper rigorously investigates three BP-free training methods: the Forward-Forward (FF), Cascaded-Forward (CaFo), and Mono-Forward (MF) algorithms. MF reduces energy consumption by up to 41% and shortens training time by up to 34%, translating to a measurably smaller carbon footprint as estimated by CodeCarbon.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rising computational and energy demands of deep neural networks (DNNs), driven largely by backpropagation (BP), challenge sustainable AI development. This paper rigorously investigates three BP-free training methods: the Forward-Forward (FF), Cascaded-Forward (CaFo), and Mono-Forward (MF) algorithms, tracing their progression from foundational concepts to a demonstrably superior solution. A robust comparative framework was established: each algorithm was implemented on its native architecture (MLPs for FF and MF, a CNN for CaFo) and benchmarked against an equivalent BP-trained model. Hyperparameters were optimized with Optuna, and consistent early stopping criteria were applied based on validation performance, ensuring all models were optimally tuned before comparison. Results show that MF not only competes with but consistently surpasses BP in classification accuracy on its native MLPs. Its superior generalization stems from converging to a more favorable minimum in the validation loss landscape, challenging the assumption that global optimization is required for state-of-the-art results. Measured at the hardware level using the NVIDIA Management Library (NVML) API, MF reduces energy consumption by up to 41% and shortens training time by up to 34%, translating to a measurably smaller carbon footprint as estimated by CodeCarbon. Beyond this primary result, we present a hardware-level analysis that explains the efficiency gains: exposing FF's architectural inefficiencies, validating MF's computationally lean design, and challenging the assumption that all BP-free methods are inherently more memory-efficient. By documenting the evolution from FF's conceptual groundwork to MF's synthesis of accuracy and sustainability, this work offers a clear, data-driven roadmap for future energy-efficient deep learning.
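The measurement methodology described in the abstract (NVML for GPU energy, CodeCarbon for emissions) can be reproduced in outline. Below is a minimal sketch of such a harness, assuming a single NVIDIA GPU (index 0) whose driver supports the NVML total-energy counter; `train_one_epoch` is a hypothetical placeholder, and the paper's actual instrumentation may differ.

```python
# Minimal sketch of hardware-level energy measurement, assuming a single
# NVIDIA GPU that supports nvmlDeviceGetTotalEnergyConsumption (Volta+).
# `train_one_epoch` is a hypothetical placeholder for the real training loop.
import pynvml
from codecarbon import EmissionsTracker

def train_one_epoch():
    pass  # stand-in for FF/CaFo/MF or BP training code

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

tracker = EmissionsTracker()  # CodeCarbon: estimates CO2-equivalent emissions
tracker.start()
start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)  # millijoules

for epoch in range(10):
    train_one_epoch()

end_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
emissions_kg = tracker.stop()
pynvml.nvmlShutdown()

print(f"GPU energy: {(end_mj - start_mj) / 1e3:.1f} J")
print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```

Reading the cumulative energy counter before and after each run yields per-algorithm figures from which relative savings such as the reported 41% can be computed.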
Related papers
- Energy-Efficient Deep Learning Without Backpropagation: A Rigorous Evaluation of Forward-Only Algorithms
We present evidence that the Mono-Forward algorithm consistently surpasses an optimally tuned BP baseline in classification accuracy. This superior generalization is achieved alongside substantial efficiency gains, including up to 41% less energy consumption and up to 34% faster training.
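As a rough illustration of the forward-only idea behind Mono-Forward, the sketch below trains each MLP layer against a purely local objective: a per-layer projection matrix maps that layer's activations to class scores, and cross-entropy is minimized without gradients ever crossing layer boundaries. This is one simplified reading of the scheme, with made-up layer sizes, not the authors' reference implementation.

```python
# Sketch of layer-local, forward-only training in the spirit of Mono-Forward:
# each layer owns a projection matrix to class scores and is trained with a
# local cross-entropy loss; detach() prevents gradients crossing layers.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalLayer(nn.Module):
    def __init__(self, d_in, d_out, n_classes):
        super().__init__()
        self.fc = nn.Linear(d_in, d_out)
        self.proj = nn.Linear(d_out, n_classes, bias=False)  # local projection

    def forward(self, x):
        h = F.relu(self.fc(x))
        return h, self.proj(h)  # activations + local class scores

layers = [LocalLayer(784, 256, 10), LocalLayer(256, 256, 10)]
opts = [torch.optim.Adam(l.parameters(), lr=1e-3) for l in layers]

def train_step(x, y):
    for layer, opt in zip(layers, opts):
        h, logits = layer(x)
        loss = F.cross_entropy(logits, y)  # purely local objective
        opt.zero_grad()
        loss.backward()                    # gradients stay inside this layer
        opt.step()
        x = h.detach()                     # block gradient flow to next layer

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
train_step(x, y)
```

At inference, predictions would be read from a layer's local class scores; no backward pass ever spans the network, which is where the training-time and energy savings originate.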
arXiv Detail & Related papers (2025-11-02T19:48:44Z)
- Adaptive Spatial Goodness Encoding: Advancing and Scaling Forward-Forward Learning Without Backpropagation
We propose a new Forward-Forward (FF)-based training framework tailored for convolutional neural networks (CNNs). ASGE uses feature maps to compute spatially-aware goodness representations at each layer, enabling layer-wise supervision. We present the first successful application of FF-based training to the ImageNet dataset, with Top-1 and Top-5 accuracies of 26.21% and 47.49%.
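To make "spatial goodness" concrete, here is a toy computation assuming the standard FF definition of goodness (sum of squared activations), evaluated per spatial location of a CNN feature map; ASGE's actual encoding may differ in detail.

```python
# Toy sketch: spatially aware goodness from a CNN feature map, assuming
# FF-style goodness = sum of squared activations, computed per location.
import torch

feats = torch.randn(8, 64, 14, 14)           # (batch, channels, H, W)
spatial_goodness = (feats ** 2).sum(dim=1)   # (batch, H, W): score per location
global_goodness = spatial_goodness.mean(dim=(1, 2))  # (batch,): layer-level score
print(spatial_goodness.shape, global_goodness.shape)
```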
arXiv Detail & Related papers (2025-09-15T19:38:32Z)
- Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node
Fast feedforward networks (FFFs) exploit the observation that different regions of the input space activate distinct subsets of neurons in wide networks.
We propose the incorporation of load balancing and Master Leaf techniques into the FFF architecture to improve performance and simplify the training process.
arXiv Detail & Related papers (2024-05-27T05:06:24Z)
- Chasing Fairness in Graphs: A GNN Architecture Perspective
We propose Fair Message Passing (FMP), designed within a unified optimization framework for graph neural networks (GNNs).
In FMP, aggregation is first applied to utilize neighbors' information, and a bias mitigation step then explicitly pushes the representation centers of demographic groups together.
Experiments on node classification tasks demonstrate that the proposed FMP outperforms several baselines in terms of fairness and accuracy on three real-world datasets.
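The bias mitigation step can be illustrated with a simple penalty on the distance between demographic groups' representation centers; this is an illustrative proxy rather than the exact FMP update.

```python
# Illustrative fairness penalty: distance between the mean node embeddings
# (representation centers) of two demographic groups. Not the exact FMP rule.
import torch

emb = torch.randn(100, 16)               # node embeddings after aggregation
group = torch.randint(0, 2, (100,))      # binary demographic attribute
center0 = emb[group == 0].mean(dim=0)
center1 = emb[group == 1].mean(dim=0)
fairness_penalty = (center0 - center1).norm()  # minimizing pushes centers together
```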
arXiv Detail & Related papers (2023-12-19T18:00:15Z)
- FedNAR: Federated Optimization with Normalized Annealing Regularization
We explore choices of weight decay and find that the weight decay value appreciably influences the convergence of existing FL algorithms.
We develop Federated optimization with Normalized Annealing Regularization (FedNAR), a plug-in that can be seamlessly integrated into any existing FL algorithm.
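As a generic illustration of normalized, annealed regularization (not necessarily FedNAR's exact rule), one can anneal the weight-decay coefficient over communication rounds and norm-clip the combined gradient-plus-decay update:

```python
# Generic sketch of a normalized, annealed weight-decay update; the annealing
# schedule and clipping bound here are assumptions for illustration only.
import torch

def nar_update(param, grad, lr, base_wd, round_idx, total_rounds, max_norm=1.0):
    wd = base_wd * (1 - round_idx / total_rounds)   # assumed annealing schedule
    update = grad + wd * param                      # gradient plus weight decay
    norm = update.norm()
    if norm > max_norm:                             # normalize the combined update
        update = update * (max_norm / norm)
    return param - lr * update

w = torch.randn(10)
g = torch.randn(10)
w = nar_update(w, g, lr=0.1, base_wd=1e-2, round_idx=3, total_rounds=100)
```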
arXiv Detail & Related papers (2023-10-04T21:11:40Z)
- ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
Large Vision-Language Models (LVLMs) can understand the world comprehensively by integrating rich information from different modalities.
LVLMs are, however, costly to train and deploy due to their massive computational/energy demands and the resulting carbon emissions.
We propose Efficient Coarse-to-Fine Layer-Wise Pruning (ECoFLaP), a two-stage coarse-to-fine weight pruning approach for LVLMs.
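The two-stage idea can be sketched generically: a coarse stage allocates per-layer sparsity ratios from a cheap global saliency proxy, and a fine stage prunes each layer by weight magnitude. The saliency proxy below is assumed for illustration and is not ECoFLaP's actual criterion.

```python
# Schematic coarse-to-fine pruning: coarse stage picks per-layer sparsity
# ratios from a global saliency proxy; fine stage prunes by magnitude.
import torch

layers = {"fc1": torch.randn(256, 784), "fc2": torch.randn(10, 256)}

# Coarse: allocate sparsity inversely to mean |weight| (assumed proxy).
saliency = {k: w.abs().mean() for k, w in layers.items()}
total = sum(saliency.values())
ratios = {k: 0.5 * (1 - s / total) for k, s in saliency.items()}  # target sparsity

# Fine: layer-wise magnitude pruning at the allocated ratio.
for name, w in layers.items():
    k = int(ratios[name] * w.numel())
    if k > 0:
        thresh = w.abs().flatten().kthvalue(k).values
        layers[name] = torch.where(w.abs() > thresh, w, torch.zeros_like(w))
```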
arXiv Detail & Related papers (2023-10-04T17:34:00Z)
- The Cascaded Forward Algorithm for Neural Network Training
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like FF, does not rely on BP optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block and does not require the generation of additional negative samples.
In our framework, each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
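Read literally, this description maps to a simple structure: each cascaded block is paired with its own predictor head that outputs a label distribution and is trained independently of downstream blocks. A minimal sketch of one such reading, with hypothetical layer sizes:

```python
# Minimal CaFo-style sketch: each cascaded block has its own predictor head
# outputting a label distribution, trained independently (inputs detached).
import torch
import torch.nn as nn
import torch.nn.functional as F

blocks = nn.ModuleList([
    nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
])
heads = nn.ModuleList([
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10)),
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10)),
])
opts = [torch.optim.Adam(list(b.parameters()) + list(h.parameters()), lr=1e-3)
        for b, h in zip(blocks, heads)]

def train_step(x, y):
    for block, head, opt in zip(blocks, heads, opts):
        feats = block(x)
        loss = F.cross_entropy(head(feats), y)  # per-block label-distribution loss
        opt.zero_grad()
        loss.backward()
        opt.step()
        x = feats.detach()  # next block trains on detached features

train_step(torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,)))
```

Because each block's loss depends only on its own parameters, the blocks can in principle be trained on separate devices, which is the parallelization opportunity the abstract alludes to.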
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Why Batch Normalization Damage Federated Learning on Non-IID Data?
Federated learning (FL) involves training deep neural network (DNN) models at the network edge while protecting the privacy of the edge clients.
Batch normalization (BN) has been regarded as a simple and effective means to accelerate training and improve generalization capability.
Recent findings indicate that BN can significantly impair the performance of FL in the presence of non-i.i.d. data.
We present the first convergence analysis showing that, under non-i.i.d. data, the mismatch between the local and global statistical parameters in BN causes a gradient deviation between the local and global models.
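The statistics mismatch is easy to see numerically: under non-i.i.d. clients, per-client means and variances differ from those of the pooled data, so BN layers normalized with local statistics disagree with what the aggregated global model implicitly assumes. A small self-contained demonstration:

```python
# Numeric illustration of the BN statistics mismatch under non-i.i.d. data:
# per-client means/variances differ from those of the pooled (global) data.
import numpy as np

rng = np.random.default_rng(0)
client_a = rng.normal(loc=-2.0, scale=1.0, size=1000)  # skewed client data
client_b = rng.normal(loc=+3.0, scale=0.5, size=1000)
pooled = np.concatenate([client_a, client_b])

for name, data in [("client A", client_a), ("client B", client_b), ("global", pooled)]:
    print(f"{name}: mean={data.mean():+.2f}, var={data.var():.2f}")
# Local BN layers normalize with client statistics, while the aggregated model
# implicitly assumes the global ones, producing the gradient deviation above.
```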
arXiv Detail & Related papers (2023-01-08T05:24:12Z)
- Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection
Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF).
We propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results.
In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency.
arXiv Detail & Related papers (2022-07-14T01:45:03Z)
- Data-driven Optimal Power Flow: A Physics-Informed Machine Learning Approach
This paper proposes a data-driven approach for optimal power flow (OPF) based on the stacked extreme learning machine (SELM) framework.
A data-driven OPF regression framework is developed that decomposes the OPF model features into three stages.
Numerical results carried out on IEEE and Polish benchmark systems demonstrate that the proposed method outperforms other alternatives.
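A single extreme learning machine stage, the building block of SELM, is simple to sketch: hidden weights are drawn at random and only the output weights are fit in closed form by least squares. The toy regression below shows one generic ELM stage, not the paper's three-stage OPF decomposition.

```python
# One generic extreme learning machine (ELM) stage: random hidden weights,
# output weights fit by least squares. SELM stacks such stages; the paper's
# three-stage OPF feature decomposition is not reproduced here.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))            # toy stand-in for load/injection features
y = X @ rng.normal(size=(8, 2)) + 0.1 * rng.normal(size=(200, 2))  # toy targets

W = rng.normal(size=(8, 64))             # random, untrained hidden weights
b = rng.normal(size=64)
H = np.tanh(X @ W + b)                   # hidden activations
beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # closed-form output weights

pred = H @ beta
print("train MSE:", float(((pred - y) ** 2).mean()))
```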
arXiv Detail & Related papers (2020-05-31T15:41:24Z)