ProAct: Progressive Training for Hybrid Clipped Activation Function to Enhance Resilience of DNNs
- URL: http://arxiv.org/abs/2406.06313v1
- Date: Mon, 10 Jun 2024 14:31:38 GMT
- Title: ProAct: Progressive Training for Hybrid Clipped Activation Function to Enhance Resilience of DNNs
- Authors: Seyedhamidreza Mousavi, Mohammad Hasan Ahmadilivani, Jaan Raik, Maksim Jenihhin, Masoud Daneshtalab,
- Abstract summary: State-of-the-art methods offer either neuron-wise or layer-wise clipped activation functions.
Layer-wise clipped activation functions cannot preserve DNN resilience at high bit error rates.
We propose a hybrid clipped activation function that integrates the neuron-wise and layer-wise methods, applying neuron-wise clipping only in the last layer of DNNs.
- Score: 0.4660328753262075
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep Neural Networks (DNNs) are extensively employed in safety-critical applications where ensuring hardware reliability is a primary concern. To enhance the reliability of DNNs against hardware faults, activation restriction techniques significantly mitigate fault effects at the DNN structure level, irrespective of the accelerator architecture. State-of-the-art methods offer either neuron-wise or layer-wise clipped activation functions and attempt to determine optimal clipping thresholds using heuristic or learning-based approaches. Layer-wise clipped activation functions cannot preserve DNN resilience at high bit error rates, while neuron-wise clipped activation functions introduce considerable memory overhead through their added per-neuron parameters, which in turn increases vulnerability to faults. Moreover, heuristic-based optimization demands numerous fault injections during the search process, resulting in time-consuming threshold identification, whereas learning-based techniques that train the thresholds of all layers concurrently often yield sub-optimal results. In this work, we first demonstrate that it is not essential to incorporate neuron-wise activation functions throughout all layers of a DNN. We then propose a hybrid clipped activation function that integrates the neuron-wise and layer-wise methods, applying neuron-wise clipping only in the last layer of the DNN. Additionally, to attain optimal thresholds for the clipped activation functions, we introduce ProAct, a progressive training methodology that iteratively trains the thresholds on a layer-by-layer basis, aiming to obtain the optimal threshold values for each layer separately.
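To make the hybrid clipping idea and the progressive threshold training more concrete, the PyTorch sketch below shows one way the abstract's description could be realized. It is a minimal, hypothetical example based only on the abstract: the class and function names (LayerwiseClippedReLU, NeuronwiseClippedReLU, train_thresholds_progressively) and all hyperparameters are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LayerwiseClippedReLU(nn.Module):
    """ReLU clipped by a single learnable threshold shared by the whole layer."""
    def __init__(self, init_threshold: float = 6.0):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(init_threshold))

    def forward(self, x):
        # Clamp activations so a bit flip cannot propagate an arbitrarily large value.
        return torch.minimum(torch.relu(x), torch.relu(self.threshold))


class NeuronwiseClippedReLU(nn.Module):
    """ReLU clipped by a learnable per-neuron threshold (used only in the last layer)."""
    def __init__(self, num_features: int, init_threshold: float = 6.0):
        super().__init__()
        self.thresholds = nn.Parameter(torch.full((num_features,), init_threshold))

    def forward(self, x):
        # Broadcasting applies each neuron's own threshold along the feature dimension.
        return torch.minimum(torch.relu(x), torch.relu(self.thresholds))


def train_thresholds_progressively(model, loss_fn, data_loader,
                                   epochs_per_layer: int = 1, lr: float = 1e-2):
    """Tune clipping thresholds one layer at a time while the network weights stay frozen.

    A rough rendering of the progressive idea: only the thresholds of the layer
    currently being visited receive gradient updates.
    """
    clipped_layers = [m for m in model.modules()
                      if isinstance(m, (LayerwiseClippedReLU, NeuronwiseClippedReLU))]
    for p in model.parameters():            # freeze all weights and thresholds
        p.requires_grad = False

    for layer in clipped_layers:             # visit the layers one by one
        for p in layer.parameters():
            p.requires_grad = True
        optimizer = torch.optim.Adam(layer.parameters(), lr=lr)
        for _ in range(epochs_per_layer):
            for inputs, targets in data_loader:
                optimizer.zero_grad()
                loss = loss_fn(model(inputs), targets)
                loss.backward()
                optimizer.step()
        for p in layer.parameters():          # re-freeze before moving on
            p.requires_grad = False
```

In this reading, a DNN would place LayerwiseClippedReLU after every hidden layer and a single NeuronwiseClippedReLU after the last layer, so the per-neuron parameter overhead is confined to one layer; the specific loss used to fit the thresholds and the training schedule are detailed in the paper.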
Related papers
- Activation-wise Propagation: A Universal Strategy to Break Timestep Constraints in Spiking Neural Networks for 3D Data Processing [29.279985043923386]
We introduce Activation-wise Membrane Potential Propagation (AMP2), a novel state update mechanism for spiking neurons.
Inspired by skip connections in deep networks, AMP2 incorporates the membrane potential of neurons into the network, eliminating the need for iterative updates.
Our method achieves significant improvements across various 3D modalities, including 3D point clouds and event streams.
arXiv Detail & Related papers (2025-02-18T11:52:25Z)
- ResQuNNs: Towards Enabling Deep Learning in Quantum Convolution Neural Networks [4.348591076994875]
We present a novel framework for enhancing the performance of Quanvolutional Neural Networks (QuNNs) by introducing trainable quanvolutional layers.
Our research overcomes the limitation of conventionally static, non-trainable quanvolutional layers by enabling training within these layers, significantly increasing the flexibility and potential of QuNNs.
We propose a novel architecture, Residual Quanvolutional Neural Networks (ResQuNNs), leveraging the concept of residual learning.
arXiv Detail & Related papers (2024-02-14T12:55:28Z)
- Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning [51.386945803485084]
We focus on tasks where the agent needs to learn multi-dimensional deterministic policies for continuous control.
Most existing spike-based RL methods take the firing rate as the output of SNNs and convert it to represent the continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs suffer from training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs and improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Neuro-Inspired Deep Neural Networks with Sparse, Strong Activations [11.707981310045742]
End-to-end training of Deep Neural Networks (DNNs) yields state-of-the-art performance in an increasing array of applications.
We report here on a promising neuro-inspired approach that improves robustness to perturbations through sparser and stronger activations.
arXiv Detail & Related papers (2022-02-26T06:19:05Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to activate and select only sparse sets of neurons for learning current and past tasks at any stage of training.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- FitAct: Error Resilient Deep Neural Networks via Fine-Grained Post-Trainable Activation Functions [0.05249805590164901]
Deep neural networks (DNNs) are increasingly being deployed in safety-critical systems such as personal healthcare devices and self-driving cars.
In this paper, we propose FitAct, a low-cost approach to enhance the error resilience of DNNs by deploying fine-grained post-trainable activation functions.
arXiv Detail & Related papers (2021-12-27T07:07:50Z)
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
- Temporal Spike Sequence Learning via Backpropagation for Deep Spiking Neural Networks [14.992756670960008]
Spiking neural networks (SNNs) are well suited for computation and implementations on energy-efficient event-driven neuromorphic processors.
We present a novel Temporal Spike Sequence Learning Backpropagation (TSSL-BP) method for training deep SNNs.
arXiv Detail & Related papers (2020-02-24T05:49:37Z)