Related papers: Adaptive Spatial Goodness Encoding: Advancing and Scaling Forward-Forward Learning Without Backpropagation

URL: http://arxiv.org/abs/2509.12394v1
Date: Mon, 15 Sep 2025 19:38:32 GMT
Title: Adaptive Spatial Goodness Encoding: Advancing and Scaling Forward-Forward Learning Without Backpropagation
Authors: Qingchun Gong, Robert Bogdan Staszewski, Kai Xu
Abstract summary: We propose a new Forward-Forward (FF)-based training framework tailored for convolutional neural networks (CNNs). ASGE leverages feature maps to compute spatially-aware goodness representations at each layer, enabling layer-wise supervision. We present the first successful application of FF-based training to ImageNet, with Top-1 and Top-5 accuracies of 26.21% and 47.49%.
Score: 5.092009068303438
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Forward-Forward (FF) algorithm offers a promising alternative to backpropagation (BP). Despite advancements in recent FF-based extensions, which have enhanced the original algorithm and adapted it to convolutional neural networks (CNNs), they often suffer from limited representational capacity and poor scalability to large-scale datasets, primarily due to exploding channel dimensionality. In this work, we propose adaptive spatial goodness encoding (ASGE), a new FF-based training framework tailored for CNNs. ASGE leverages feature maps to compute spatially-aware goodness representations at each layer, enabling layer-wise supervision. Crucially, this approach decouples classification complexity from channel dimensionality, thereby addressing the issue of channel explosion and achieving competitive performance compared to other BP-free methods. ASGE outperforms all other FF-based approaches across multiple benchmarks, delivering test accuracies of 99.65% on MNIST, 93.41% on FashionMNIST, 90.62% on CIFAR-10, and 65.42% on CIFAR-100. Moreover, we present the first successful application of FF-based training to ImageNet, with Top-1 and Top-5 accuracies of 26.21% and 47.49%. By entirely eliminating BP and significantly narrowing the performance gap with BP-trained models, the ASGE framework establishes a viable foundation toward scalable BP-free CNN training.

Related papers

Beyond Backpropagation: Exploring Innovative Algorithms for Energy-Efficient Deep Neural Network Training [0.0]
The paper rigorously investigates three BP-free training methods: the Forward-Forward (FF), Cascaded-Forward (CaFo), and Mono-Forward (MF). MF reduces energy consumption by up to 41% and shortens training time by up to 34%, translating to a measurably smaller carbon footprint as estimated by CodeCarbon.
arXiv Detail & Related papers (2025-09-23T14:27:44Z)
DiffusionNFT: Online Diffusion Reinforcement with Forward Process [99.94852379720153]
Diffusion Negative-aware FineTuning (DiffusionNFT) is a new online RL paradigm that optimizes diffusion models directly on the forward process via flow matching. DiffusionNFT is up to $25\times$ more efficient than FlowGRPO in head-to-head comparisons, while being CFG-free.
arXiv Detail & Related papers (2025-09-19T16:09:33Z)
Boosted Training of Lightweight Early Exits for Optimizing CNN Image Classification Inference [47.027290803102666]
We introduce a sequential training approach that aligns branch training with inference-time data distributions. Experiments on the CINIC-10 dataset with a ResNet18 backbone demonstrate that BTS-EE consistently outperforms non-boosted training. These results offer practical efficiency gains for applications such as industrial inspection, embedded vision, and UAV-based monitoring.
arXiv Detail & Related papers (2025-09-10T06:47:49Z)
FFGAF-SNN: The Forward-Forward Based Gradient Approximation Free Training Framework for Spiking Neural Networks [7.310627646090302]
Spiking Neural Networks (SNNs) offer a biologically plausible framework for energy-efficient neuromorphic computing. Training SNNs efficiently is challenging because of their non-differentiability. We propose a Forward-Forward (FF) based gradient approximation-free training framework for Spiking Neural Networks.
arXiv Detail & Related papers (2025-07-31T15:22:23Z)
Distance-Forward Learning: Enhancing the Forward-Forward Algorithm Towards High-Performance On-Chip Learning [20.037634881772842]
The Forward-Forward (FF) algorithm was recently proposed as a local learning method to address the limitations of backpropagation (BP). We reformulate FF using distance metric learning and propose a distance-forward algorithm (DF) to improve FF performance in supervised vision tasks. Our method surpasses existing FF models and other advanced local learning approaches, with accuracies of 99.7% on MNIST, 88.2% on CIFAR-10, 59% on CIFAR-100, 95.9% on SVHN, and 82.5% on ImageNette.
arXiv Detail & Related papers (2024-08-27T10:01:43Z)
Convolutional Channel-wise Competitive Learning for the Forward-Forward Algorithm [5.1246638322893245]
The Forward-Forward (FF) algorithm has been proposed to alleviate the issues of backpropagation (BP), which is commonly used to train deep neural networks. We take the main ideas of FF and improve them by leveraging channel-wise competitive learning in the context of convolutional neural networks for image classification tasks. Our method outperforms recent FF-based models on image classification tasks, achieving testing errors of 0.58%, 7.69%, 21.89%, and 48.77% on MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100, respectively.
arXiv Detail & Related papers (2023-12-19T23:48:43Z)
The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like FF, does not rely on BP optimization. Unlike FF, our framework directly outputs label distributions at each cascaded block, so it does not require generating additional negative samples. In our framework, each block can be trained independently, so it can be easily deployed on parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
WSEBP: A Novel Width-depth Synchronous Extension-based Basis Pursuit Algorithm for Multi-Layer Convolutional Sparse Coding [4.521915878576165]
Multi-layer convolutional sparse coding (ML-CSC) can interpret convolutional neural networks (CNNs). Many current state-of-the-art (SOTA) pursuit algorithms require multiple iterations to optimize the solution of ML-CSC. We propose a novel width-depth synchronous extension-based basis pursuit (WSEBP) algorithm which solves the ML-CSC problem without the limitation of the number of iterations.
arXiv Detail & Related papers (2022-03-28T15:53:52Z)
Distillation Guided Residual Learning for Binary Convolutional Neural Networks [83.6169936912264]
It is challenging to bridge the performance gap between Binary CNN (BCNN) and Floating point CNN (FCNN). We observe that this performance gap leads to substantial residuals between the intermediate feature maps of BCNN and FCNN. To minimize the performance gap, we force the BCNN to produce intermediate feature maps similar to those of the FCNN. This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from the FCNN, leads to a more effective optimization of the BCNN.
arXiv Detail & Related papers (2020-07-10T07:55:39Z)
Belief Propagation Neural Networks [103.97004780313105]
We introduce belief propagation neural networks (BPNNs). BPNNs operate on factor graphs and generalize belief propagation (BP). We show that BPNNs converge 1.7x faster on Ising models while providing tighter bounds. On challenging model counting problems, BPNNs compute estimates hundreds of times faster than state-of-the-art handcrafted methods.
arXiv Detail & Related papers (2020-07-01T07:39:51Z)
Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification [58.20132466198622]
We propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix. In each training iteration, we choose the most descriptive regions based on the intermediate attention maps from a feature extractor. Our proposed method is simple yet effective, easy to implement, and can boost the baseline significantly.
arXiv Detail & Related papers (2020-03-29T15:01:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.