Variability-Aware Training and Self-Tuning of Highly Quantized DNNs for Analog PIM
- URL: http://arxiv.org/abs/2111.06457v1
- Date: Thu, 11 Nov 2021 20:55:02 GMT
- Title: Variability-Aware Training and Self-Tuning of Highly Quantized DNNs for Analog PIM
- Authors: Zihao Deng and Michael Orshansky
- Abstract summary: We develop a new joint variability- and quantization-aware DNN training algorithm for highly quantized analog PIM-based models.
For low-bitwidth models and high variation, the gain in accuracy is up to 35.7% for ResNet-18 over the best alternative.
- Score: 0.15229257192293197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: DNNs deployed on analog processing in memory (PIM) architectures are subject
to fabrication-time variability. We developed a new joint variability- and
quantization-aware DNN training algorithm for highly quantized analog PIM-based
models that is significantly more effective than prior work. It outperforms
variability-oblivious and post-training quantized models on multiple computer
vision datasets/models. For low-bitwidth models and high variation, the gain in
accuracy is up to 35.7% for ResNet-18 over the best alternative.
We demonstrate that, under a realistic pattern of within- and between-chip
components of variability, training alone is unable to prevent large DNN
accuracy loss (of up to 54% on CIFAR-100/ResNet-18). We introduce a self-tuning
DNN architecture that dynamically adjusts layer-wise activations during
inference and is effective in reducing accuracy loss to below 10%.
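As a rough, hedged illustration of the training idea (not the authors' exact algorithm), the sketch below combines fake-quantization of the weights with a multiplicative log-normal noise term injected at every training step, so the learned low-bitwidth weights remain accurate under conductance variation; the variation model, the `n_bits` and `sigma` parameters, and the layer sizes in the usage example are assumptions made for illustration. The self-tuning component can be pictured analogously as a per-layer activation scale calibrated at inference time to compensate the variation realized on a given chip.

```python
# Hedged sketch of variability-aware training of a quantized layer.
# The log-normal variation model and all hyperparameters are illustrative
# assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(w: torch.Tensor, n_bits: int) -> torch.Tensor:
    # Uniform symmetric fake-quantization with a straight-through estimator.
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().max().clamp_min(1e-8) / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    return w + (w_q - w).detach()  # forward: quantized; backward: identity


class VariabilityAwareConv2d(nn.Conv2d):
    """Conv layer whose quantized weights are perturbed by random conductance
    variation during training, so the network learns to tolerate analog-PIM noise."""

    def __init__(self, *args, n_bits: int = 2, sigma: float = 0.1, **kwargs):
        super().__init__(*args, **kwargs)
        self.n_bits, self.sigma = n_bits, sigma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q = fake_quantize(self.weight, self.n_bits)
        if self.training:
            # Multiplicative log-normal perturbation, resampled every step,
            # standing in for fabrication-time conductance variability.
            w_q = w_q * torch.exp(self.sigma * torch.randn_like(w_q))
        return F.conv2d(x, w_q, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


# Example usage with hypothetical layer sizes:
layer = VariabilityAwareConv2d(64, 64, kernel_size=3, padding=1,
                               n_bits=2, sigma=0.1)
y = layer(torch.randn(8, 64, 32, 32))
```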
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN), we reduce the computational time and space complexities from cubic and quadratic in the sequence length, respectively, to linear.
Extensive experiments demonstrate that the resulting Scalable MNN (S-MNN) matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z)
- Enhancing Deep Neural Network Training Efficiency and Performance through Linear Prediction [0.0]
Deep neural networks (DNN) have achieved remarkable success in various fields, including computer vision and natural language processing.
This paper proposes a method to optimize the training effectiveness of DNNs and thereby improve model performance.
arXiv Detail & Related papers (2023-10-17T03:11:30Z)
- Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators [11.832487701641723]
Non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference.
We propose a novel training concept, Negative Feedback Training (NFT), which leverages multi-scale noisy information captured from the network.
Our methods outperform existing state-of-the-art methods with up to a 46.71% improvement in inference accuracy.
arXiv Detail & Related papers (2023-05-23T22:56:26Z)
- Dual adaptive training of photonic neural networks [30.86507809437016]
A photonic neural network (PNN) computes with photons instead of electrons, offering low latency, high energy efficiency, and high parallelism.
Existing training approaches cannot address the extensive accumulation of systematic errors in large-scale PNNs.
We propose dual adaptive training (DAT), which allows the PNN model to adapt to substantial systematic errors.
arXiv Detail & Related papers (2022-12-09T05:03:45Z)
- Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks [1.131071436917293]
Quantizing parameters and operations to lower bit-precision offers substantial memory and energy savings for neural network inference.
This paper proposes a new quantization approach for mixed-precision convolutional neural networks (CNNs) targeting edge computing.
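To make the mixed-precision idea concrete, here is a minimal, hedged sketch (not the paper's method) of differentiable bit-width selection: each weight tensor carries trainable logits over candidate bit-widths, and its effective value is a softmax-weighted mix of the tensor fake-quantized at each candidate precision, so the precision assignment can be learned by gradient descent. The candidate bit-widths and the softmax relaxation are illustrative assumptions.

```python
# Hedged sketch of differentiable mixed-precision selection.
import torch
import torch.nn as nn


def fake_quantize(w: torch.Tensor, n_bits: int) -> torch.Tensor:
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().max().clamp_min(1e-8) / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    return w + (w_q - w).detach()  # straight-through estimator


class MixedPrecisionWeight(nn.Module):
    """One weight tensor plus trainable logits over candidate bit-widths;
    the effective weight is a softmax-weighted mix of the tensor
    fake-quantized at each candidate precision."""

    def __init__(self, shape, candidate_bits=(2, 4, 8)):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(shape))
        self.candidate_bits = candidate_bits
        self.alpha = nn.Parameter(torch.zeros(len(candidate_bits)))

    def forward(self) -> torch.Tensor:
        probs = torch.softmax(self.alpha, dim=0)
        return sum(p * fake_quantize(self.weight, b)
                   for p, b in zip(probs, self.candidate_bits))
```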
arXiv Detail & Related papers (2022-06-15T18:11:37Z)
- Fault-Aware Design and Training to Enhance DNNs Reliability with Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults may dramatically corrupt the model's predictions.
In this work, we propose to tackle the reliability issue both at training and model design time.
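One common way to realize fault-aware training, shown in the hedged sketch below, is to inject random transient faults into the weights during the forward pass so the network learns to tolerate them; the fault model and rate here are illustrative assumptions, not necessarily the paper's.

```python
# Hedged sketch of fault injection during training; the extreme-value fault
# model and fault_prob are illustrative assumptions.
import torch


def inject_faults(w: torch.Tensor, fault_prob: float = 1e-4) -> torch.Tensor:
    """Randomly drive a small fraction of weight entries to an extreme value,
    a crude stand-in for transient faults corrupting stored parameters."""
    mask = torch.rand_like(w) < fault_prob
    sign = torch.where(torch.rand_like(w) < 0.5,
                       torch.ones_like(w), -torch.ones_like(w))
    return torch.where(mask, sign * w.abs().max(), w)


# During training, the faulty view of the weights would be used in the forward
# pass, e.g. y = torch.nn.functional.linear(x, inject_faults(layer.weight),
# layer.bias), so the network learns to keep its predictions stable.
```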
arXiv Detail & Related papers (2022-05-28T13:09:30Z)
- Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
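As a rough, hedged illustration of the multiplicative-update idea (not LNS-Madam itself): if a weight is stored as a sign and a log-magnitude, an additive step on the log-magnitude is a multiplicative step on the weight, which maps naturally onto a logarithmic number representation. The `LogDomainWeight` class and its update rule below are assumptions for illustration.

```python
# Hedged sketch of a multiplicative weight update in the log domain.
import torch


class LogDomainWeight:
    """Weight stored as a sign and a log-magnitude: an additive step on the
    log-magnitude is a multiplicative step on the weight itself."""

    def __init__(self, w_init: torch.Tensor):
        self.sign = torch.sign(w_init)
        self.log_mag = torch.log(w_init.abs().clamp_min(1e-8))

    def value(self) -> torch.Tensor:
        return self.sign * torch.exp(self.log_mag)

    def step(self, grad: torch.Tensor, lr: float = 0.01) -> None:
        # Sign-based step on log|w|; the real LNS-Madam normalizes the
        # gradient differently, so treat this only as the shape of the rule.
        self.log_mag -= lr * torch.sign(self.sign * grad)
```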
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models, of statistical models combined with the roofline model, and of a refined roofline model.
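A minimal, hedged sketch of the layer-wise estimation idea (the framework's benchmark-derived models are richer than this): fit a simple per-layer-type latency model from measured benchmarks, then estimate a whole network's execution time by summing per-layer predictions. The layer features and the linear model are illustrative assumptions.

```python
# Hedged sketch: network latency estimated by stacking per-layer models
# fitted to benchmark measurements.
from dataclasses import dataclass
from typing import Dict, List, Tuple

import numpy as np


@dataclass
class Layer:
    kind: str         # e.g. "conv", "fc"
    macs: float       # multiply-accumulate count
    mem_bytes: float  # bytes moved


def fit_latency_models(benchmarks: Dict[str, List[Tuple[Layer, float]]]):
    """Least-squares fit of latency ~ a*MACs + b*bytes + c per layer type."""
    models = {}
    for kind, samples in benchmarks.items():
        X = np.array([[l.macs, l.mem_bytes, 1.0] for l, _ in samples])
        y = np.array([t for _, t in samples])
        models[kind], *_ = np.linalg.lstsq(X, y, rcond=None)
    return models


def estimate_network_latency(layers: List[Layer], models) -> float:
    # Whole-network estimate = sum of per-layer predictions.
    return sum(float(np.dot([l.macs, l.mem_bytes, 1.0], models[l.kind]))
               for l in layers)
```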
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training [81.85361544720885]
We propose FracTrain, which integrates progressive fractional quantization that gradually increases the precision of activations, weights, and gradients.
FracTrain reduces the computational cost and hardware-quantified energy/latency of DNN training while achieving comparable or better (-0.12% to +1.87%) accuracy.
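To make "progressive fractional quantization" concrete, here is a hedged sketch of a precision schedule that starts training at low bit-width and raises the precision of weights, activations, and gradients as training proceeds; the breakpoints, bit-widths, and the hypothetical `train_one_epoch` call are illustrative assumptions, not FracTrain's actual schedule.

```python
# Hedged sketch of a progressive precision schedule.
def precision_schedule(epoch: int, total_epochs: int) -> dict:
    """Bit-widths to use at a given epoch; values are illustrative only."""
    frac = epoch / max(total_epochs - 1, 1)
    if frac < 0.25:
        bits = 3
    elif frac < 0.5:
        bits = 4
    elif frac < 0.75:
        bits = 6
    else:
        bits = 8
    return {"weights": bits, "activations": bits, "gradients": bits}


# Inside the training loop, the quantizers would be reconfigured each epoch:
for epoch in range(160):
    cfg = precision_schedule(epoch, total_epochs=160)
    # train_one_epoch(model, quant_bits=cfg)  # hypothetical training call
```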
arXiv Detail & Related papers (2020-12-24T05:24:10Z)
- EMPIR: Ensembles of Mixed Precision Deep Networks for Increased Robustness against Adversarial Attacks [18.241639570479563]
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks in which small input perturbations can produce catastrophic misclassifications.
We propose EMPIR, ensembles of quantized DNN models with different numerical precisions, as a new approach to increase robustness against adversarial attacks.
Our results indicate that EMPIR boosts the average adversarial accuracies by 42.6%, 15.2% and 10.5% for the DNN models trained on the MNIST, CIFAR-10 and ImageNet datasets respectively.
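A minimal, hedged sketch of the ensembling idea: run the same input through models quantized to different precisions and combine their softmax outputs, here by simple averaging; EMPIR's actual combination scheme may differ.

```python
# Hedged sketch: ensemble of mixed-precision models via averaged softmax
# probabilities; averaging is an illustrative choice.
import torch


@torch.no_grad()
def ensemble_predict(models, x: torch.Tensor) -> torch.Tensor:
    """`models` might hold, e.g., a 2-bit, a 4-bit, and a full-precision
    network trained on the same task."""
    probs = [torch.softmax(m(x), dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0).argmax(dim=-1)
```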
arXiv Detail & Related papers (2020-04-21T17:17:09Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting the features of the original full-precision networks onto high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.