Uncertainty Modeling of Emerging Device-based Computing-in-Memory Neural
Accelerators with Application to Neural Architecture Search
- URL: http://arxiv.org/abs/2107.06871v1
- Date: Tue, 6 Jul 2021 23:29:36 GMT
- Title: Uncertainty Modeling of Emerging Device-based Computing-in-Memory Neural
Accelerators with Application to Neural Architecture Search
- Authors: Zheyu Yan, Da-Cheng Juan, Xiaobo Sharon Hu, Yiyu Shi
- Abstract summary: Emerging device-based Computing-in-memory (CiM) has been proved to be a promising candidate for high-energy efficiency deep neural network (DNN) computations.
Most emerging devices suffer uncertainty issues, resulting in a difference between actual data stored and the weight value it is designed to be.
This leads to an accuracy drop from trained models to actually deployed platforms.
- Score: 25.841113960607334
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Emerging device-based Computing-in-memory (CiM) has been proved to be a
promising candidate for high-energy efficiency deep neural network (DNN)
computations. However, most emerging devices suffer uncertainty issues,
resulting in a difference between actual data stored and the weight value it is
designed to be. This leads to an accuracy drop from trained models to actually
deployed platforms. In this work, we offer a thorough analysis of the effect of
such uncertainties-induced changes in DNN models. To reduce the impact of
device uncertainties, we propose UAE, an uncertainty-aware Neural Architecture
Search scheme to identify a DNN model that is both accurate and robust against
device uncertainties.
Related papers
- Designing DNNs for a trade-off between robustness and processing performance in embedded devices [1.474723404975345]
Machine learning-based embedded systems need to be robust against soft errors.
This paper investigates the suitability of using bounded AFs to improve model robustness against perturbations.
We analyze encoder-decoder fully convolutional models aimed at performing semantic segmentation tasks on hyperspectral images for scene understanding in autonomous driving.
arXiv Detail & Related papers (2024-12-04T19:34:33Z) - Evaluating Single Event Upsets in Deep Neural Networks for Semantic Segmentation: an embedded system perspective [1.474723404975345]
This paper delves into the robustness assessment in embedded Deep Neural Networks (DNNs)
By scrutinizing the layer-by-layer and bit-by-bit sensitivity of various encoder-decoder models to soft errors, this study thoroughly investigates the vulnerability of segmentation DNNs to SEUs.
We propose a set of practical lightweight error mitigation techniques with no memory or computational cost suitable for resource-constrained deployments.
arXiv Detail & Related papers (2024-12-04T18:28:38Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
arXiv Detail & Related papers (2024-10-24T18:15:48Z) - Compute-in-Memory based Neural Network Accelerators for Safety-Critical
Systems: Worst-Case Scenarios and Protections [8.813981342105151]
We study the problem of pinpointing the worst-case performance of CiM accelerators affected by device variations.
We propose a novel worst-case-aware training technique named A-TRICE that efficiently combines adversarial training and noise-injection training.
Our experimental results demonstrate that A-TRICE improves the worst-case accuracy under device variations by up to 33%.
arXiv Detail & Related papers (2023-12-11T05:56:00Z) - Uncertainty-aware deep learning for digital twin-driven monitoring:
Application to fault detection in power lines [0.0]
Deep neural networks (DNNs) are often coupled with physics-based models or data-driven surrogate models to perform fault detection and health monitoring of systems in the low data regime.
These models can exhibit parametric uncertainty that propagates to the generated data.
In this article, we quantify the impact of both these sources of uncertainty on the performance of the DNN.
arXiv Detail & Related papers (2023-03-20T09:27:58Z) - Fault-Aware Design and Training to Enhance DNNs Reliability with
Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide series of technological advancements.
Recent findings indicate that transient hardware faults may corrupt the models prediction dramatically.
In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked
Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro- kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - On the benefits of robust models in modulation recognition [53.391095789289736]
Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications.
In other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations.
We propose a novel framework to test the robustness of current state-of-the-art models.
arXiv Detail & Related papers (2021-03-27T19:58:06Z) - Contextual-Bandit Anomaly Detection for IoT Data in Distributed
Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural networks (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delay.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.