Enabling Incremental Training with Forward Pass for Edge Devices
- URL: http://arxiv.org/abs/2103.14007v1
- Date: Thu, 25 Mar 2021 17:43:04 GMT
- Title: Enabling Incremental Training with Forward Pass for Edge Devices
- Authors: Dana AbdulQader, Shoba Krishnan, Claudionor N. Coelho Jr
- Abstract summary: We introduce a method using an evolutionary strategy (ES) that can partially retrain the network, enabling it to adapt to changes and recover after an error has occurred.
This technique enables training on inference-only hardware without the need for backpropagation and with minimal resource overhead.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are commonly deployed on end devices that exist
in constantly changing environments. In order for the system to maintain its
accuracy, it is critical that it can adapt to changes and recover by
retraining parts of the network. However, end devices have limited resources,
making it challenging to train on the same device. Moreover, training deep
neural networks is both memory and compute intensive due to the backpropagation
algorithm. In this paper, we introduce a method using an evolutionary strategy
(ES) that can partially retrain the network, enabling it to adapt to changes and
recover after an error has occurred. This technique enables training on
inference-only hardware without the need for backpropagation and with
minimal resource overhead. We demonstrate the ability of our technique to
retrain a quantized MNIST neural network after injecting noise into the input.
Furthermore, we present the micro-architecture required to enable training on
HLS4ML (an inference hardware architecture) and implement it in Verilog. We
synthesize our implementation for a Xilinx Kintex Ultrascale Field Programmable
Gate Array (FPGA), resulting in less than 1% resource utilization to implement
the incremental training.
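The abstract describes partially retraining a network with an evolutionary strategy that relies only on forward passes. As a rough illustration of that general idea (not the paper's actual algorithm, quantization scheme, or HLS4ML micro-architecture), the sketch below perturbs the weights of a single layer with Gaussian noise and keeps the best-scoring candidate; the function names, parameters, and (1+lambda)-style search loop are illustrative assumptions.

```python
# Minimal sketch of forward-pass-only partial retraining with a simple
# evolutionary strategy. Only the last layer (W2) is retrained; W1 is frozen,
# mimicking partial retraining on an inference-only device. All names and
# hyperparameters here are assumptions, not the paper's implementation.
import numpy as np

def forward(x, W1, W2):
    """Two-layer network: ReLU hidden layer followed by a linear readout."""
    h = np.maximum(0.0, x @ W1)   # inference-only forward pass
    return h @ W2

def accuracy(W2, W1, x, y):
    """Score a candidate readout matrix using forward passes only."""
    preds = forward(x, W1, W2).argmax(axis=1)
    return (preds == y).mean()

def es_retrain_layer(W1, W2, x, y, sigma=0.05, population=20, steps=100, seed=0):
    """Perturb the readout weights with Gaussian noise and keep the best
    candidate each generation -- no gradients or backpropagation required."""
    rng = np.random.default_rng(seed)
    best_W2, best_acc = W2.copy(), accuracy(W2, W1, x, y)
    for _ in range(steps):
        for _ in range(population):
            candidate = best_W2 + sigma * rng.standard_normal(best_W2.shape)
            acc = accuracy(candidate, W1, x, y)   # evaluated by forward pass
            if acc > best_acc:
                best_W2, best_acc = candidate, acc
    return best_W2, best_acc
```

Because the search loop only scores candidate weights with forward passes, it maps naturally onto inference-only accelerators; the paper's quantized, Verilog-implemented version will differ in the details of candidate generation and selection.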
Related papers
- Self-training superconducting neuromorphic circuits using reinforcement learning rules [0.0]
This paper describes a set of reinforcement learning-based local weight update rules and their implementation in superconducting hardware.
We implement a small-scale neural network with a learning time on the order of one nanosecond.
The adjustment of weights is based on a global reinforcement signal that obviates the need for circuitry to back-propagate errors.
arXiv Detail & Related papers (2024-04-29T15:09:00Z)
- Synaptic metaplasticity with multi-level memristive devices [1.5598974049838272]
We propose a memristor-based hardware solution for implementing metaplasticity during both inference and training.
We show that a two-layer perceptron achieves 97% and 86% accuracy on consecutive training of MNIST and Fashion-MNIST.
Our architecture is compatible with the limited endurance of memristors and achieves a 15x reduction in memory.
arXiv Detail & Related papers (2023-06-21T09:40:25Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- CorrectNet: Robustness Enhancement of Analog In-Memory Computing for Neural Networks by Error Suppression and Compensation [4.570841222958966]
We propose a framework to enhance the robustness of neural networks under variations and noise.
We show that inference accuracy of neural networks can be recovered from as low as 1.69% under variations and noise.
arXiv Detail & Related papers (2022-11-27T19:13:33Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- On-Device Training Under 256KB Memory [62.95579393237751]
We propose an algorithm-system co-design framework to make on-device training possible with only 256KB of memory.
Our framework is the first solution to enable tiny on-device training of convolutional neural networks under 256KB and 1MB Flash.
arXiv Detail & Related papers (2022-06-30T17:59:08Z)
- Learning in Feedback-driven Recurrent Spiking Neural Networks using full-FORCE Training [4.124948554183487]
We propose a supervised training procedure for RSNNs, where a second network is introduced only during the training.
The proposed training procedure consists of generating targets for both recurrent and readout layers.
We demonstrate the improved performance and noise robustness of the proposed full-FORCE training procedure on modeling eight dynamical systems.
arXiv Detail & Related papers (2022-05-26T19:01:19Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
- Lossless Compression of Deep Neural Networks [17.753357839478575]
Deep neural networks have been successful in many predictive modeling tasks, such as image and language recognition.
It is challenging to deploy these networks under limited computational resources, such as in mobile devices.
We introduce an algorithm that removes units and layers of a neural network without changing the output it produces.
arXiv Detail & Related papers (2020-01-01T15:04:43Z)