Energy-convergence trade off for the training of neural networks on bio-inspired hardware
- URL: http://arxiv.org/abs/2509.18121v1
- Date: Wed, 10 Sep 2025 15:40:00 GMT
- Title: Energy-convergence trade off for the training of neural networks on bio-inspired hardware
- Authors: Nikhil Garg, Paul Uriarte Vicandi, Yanming Zhang, Alexandre Baigol, Donato Francesco Falcone, Saketh Ram Mamidala, Bert Jan Offrein, Laura Bégon-Lours
- Abstract summary: Emerging memristive devices promise to accelerate neural network training by eliminating costly data transfers between compute and memory. We investigate ferroelectric synaptic devices based on HfO2/ZrO2 superlattices and feed their experimentally measured weight updates into hardware-aware neural network simulations. Across pulse widths from 20 ns to 0.2 ms, shorter pulses lower per-update energy but require more training epochs, while still reducing total energy without sacrificing accuracy.
- Score: 35.74007073601019
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The increasing deployment of wearable sensors and implantable devices is shifting AI processing demands to the extreme edge, necessitating ultra-low power for continuous operation. Inspired by the brain, emerging memristive devices promise to accelerate neural network training by eliminating costly data transfers between compute and memory. However, balancing performance and energy efficiency remains a challenge. We investigate ferroelectric synaptic devices based on HfO2/ZrO2 superlattices and feed their experimentally measured weight updates into hardware-aware neural network simulations. Across pulse widths from 20 ns to 0.2 ms, shorter pulses lower per-update energy but require more training epochs, while still reducing total energy without sacrificing accuracy. Classification accuracy using plain stochastic gradient descent (SGD) is diminished compared to mixed-precision SGD. We analyze the causes and propose a "symmetry point shifting" technique that addresses asymmetric updates and restores accuracy. These results highlight a trade-off among accuracy, convergence speed, and energy use, showing that short-pulse programming with tailored training significantly enhances on-chip learning efficiency.
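The trade-off described in the abstract can be sketched as simple arithmetic: total training energy is roughly the per-update energy times the number of updates per epoch times the number of epochs needed to converge. The snippet below is only an illustration of that relationship. The class `PulseConfig`, the function `total_training_energy`, and every numeric value other than the two pulse widths are hypothetical placeholders, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass
class PulseConfig:
    """Hypothetical programming configuration (energies and epochs are placeholders)."""
    pulse_width_s: float        # programming pulse width in seconds
    energy_per_update_j: float  # assumed energy of one weight update in joules
    epochs_to_converge: int     # assumed number of epochs to reach target accuracy


def total_training_energy(cfg: PulseConfig, updates_per_epoch: int) -> float:
    """Total energy = per-update energy x updates per epoch x epochs."""
    return cfg.energy_per_update_j * updates_per_epoch * cfg.epochs_to_converge


# Pulse widths match the range quoted in the abstract (20 ns and 0.2 ms).
# The energies and epoch counts are made up purely for illustration.
short_pulse = PulseConfig(pulse_width_s=20e-9, energy_per_update_j=1e-12, epochs_to_converge=30)
long_pulse = PulseConfig(pulse_width_s=0.2e-3, energy_per_update_j=1e-9, epochs_to_converge=10)

updates_per_epoch = 1_000_000  # assumed number of weight updates per epoch

e_short = total_training_energy(short_pulse, updates_per_epoch)
e_long = total_training_energy(long_pulse, updates_per_epoch)

print(f"short pulse: {e_short:.2e} J over {short_pulse.epochs_to_converge} epochs")
print(f"long pulse:  {e_long:.2e} J over {long_pulse.epochs_to_converge} epochs")
```

Under these assumed numbers the short-pulse configuration needs three times as many epochs yet still uses less total energy, because its per-update energy is lower by a much larger factor. Whether that holds for a real device depends on the measured values.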
Related papers
- An Exact Gradient Framework for Training Spiking Neural Networks [0.7366405857677227]
Spiking neural networks inherently rely on the precise timing of discrete spike events for information processing. We propose an event-driven learning framework that computes exact loss gradients with respect to synaptic weights and transmission delays. Experiments on multiple benchmarks demonstrate significant gains in accuracy (up to 7%), timing precision, and robustness compared to existing methods.
arXiv Detail & Related papers (2025-07-08T11:55:27Z) - Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
Neuromorphic computing uses spiking neural networks (SNNs) to perform inference tasks. Embedding a small payload within each spike exchanged between spiking neurons can enhance inference accuracy without increasing energy consumption. Split computing, where an SNN is partitioned across two devices, is a promising solution. This paper presents the first comprehensive study of a neuromorphic wireless split computing architecture that employs multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z) - Low-power event-based face detection with asynchronous neuromorphic hardware [2.0774873363739985]
We present the first instance of an on-chip spiking neural network for event-based face detection deployed on the SynSense Speck neuromorphic chip.
We show how to reduce precision discrepancies between off-chip clock-driven simulation used for training and on-chip event-driven inference.
We achieve an on-chip face detection mAP[0.5] of 0.6 while consuming only 20 mW.
arXiv Detail & Related papers (2023-12-21T19:23:02Z) - Gradual Optimization Learning for Conformational Energy Minimization [69.36925478047682]
Gradual Optimization Learning Framework (GOLF) for energy minimization with neural networks significantly reduces the required additional data.
Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules.
arXiv Detail & Related papers (2023-11-05T11:48:08Z) - Evaluating Spiking Neural Network On Neuromorphic Platform For Human Activity Recognition [2.710807780228189]
Energy efficiency and low latency are crucial requirements for wearable AI-empowered human activity recognition systems.
A spike-based workout recognition system can achieve accuracy comparable to the popular milliwatt RISC-V based multi-core processor GAP8 with a traditional neural network.
arXiv Detail & Related papers (2023-08-01T18:59:06Z) - Block-Wise Dynamic-Precision Neural Network Training Acceleration via Online Quantization Sensitivity Analytics [8.373265629267257]
We propose DYNASTY, a block-wise dynamic-precision neural network training framework.
DYNASTY provides accurate data sensitivity information through fast online analytics, and maintains stable training convergence with an adaptive bit-width map generator.
Compared to an 8-bit quantization baseline, DYNASTY brings up to $5.1\times$ speedup and $4.7\times$ energy consumption reduction with no accuracy drop and negligible hardware overhead.
arXiv Detail & Related papers (2022-10-31T03:54:16Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - SOUL: An Energy-Efficient Unsupervised Online Learning Seizure Detection Classifier [68.8204255655161]
Implantable devices that record neural activity and detect seizures have been adopted to issue warnings or trigger neurostimulation to suppress seizures.
For an implantable seizure detection system, a low power, at-the-edge, online learning algorithm can be employed to dynamically adapt to neural signal drifts.
SOUL was fabricated in TSMC's 28 nm process, occupying 0.1 mm², and achieves 1.5 nJ/classification energy efficiency, which is at least 24x more efficient than the state of the art.
arXiv Detail & Related papers (2021-10-01T23:01:20Z) - Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
arXiv Detail & Related papers (2021-06-26T00:32:17Z) - Surrogate gradients for analog neuromorphic computing [2.6475944316982942]
We show that learning self-corrects for device mismatch resulting in competitive spiking network performance on vision and speech benchmarks.
Our work sets several new benchmarks for low-energy spiking network processing on analog neuromorphic hardware.
arXiv Detail & Related papers (2020-06-12T14:45:12Z)