AVAC: A Machine Learning based Adaptive RRAM Variability-Aware
Controller for Edge Devices
- URL: http://arxiv.org/abs/2005.03077v1
- Date: Wed, 6 May 2020 19:06:51 GMT
- Title: AVAC: A Machine Learning based Adaptive RRAM Variability-Aware
Controller for Edge Devices
- Authors: Shikhar Tuli and Shreshth Tuli
- Abstract summary: We propose an Adaptive RRAM Variability-Aware Controller, AVAC, which periodically updates Wait Buffer and batch sizes.
AVAC allows Edge devices to adapt to different applications and their stages, to improve performance and reduce energy consumption.
- Score: 3.7346292069282643
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the Edge Computing paradigm has gained significant popularity both
in industry and academia. Researchers increasingly aim to improve the
performance and reduce the energy consumption of such devices. Some recent efforts
focus on using emerging RRAM technologies for improving energy efficiency,
thanks to their negligible leakage and high integration density. As the
complexity and dynamism of applications supported by such devices escalate, it
has become difficult to maintain ideal performance by static RRAM controllers.
Machine Learning provides a promising solution for this, and hence, this work
focuses on extending such controllers to allow dynamic parameter updates. In
this work we propose an Adaptive RRAM Variability-Aware Controller, AVAC, which
periodically updates Wait Buffer and batch sizes using on-the-fly learning
models and gradient ascent. AVAC allows Edge devices to adapt to different
applications and their stages, to improve computation performance and reduce
energy consumption. Simulations demonstrate that the proposed model can provide
up to 29% increase in performance and 19% decrease in energy, compared to
static controllers, using traces of real-life healthcare applications on a
Raspberry-Pi based Edge deployment.
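The abstract describes periodic updates of the Wait Buffer and batch sizes via on-the-fly learning and gradient ascent, but does not give the update rule. The following is a minimal, hypothetical sketch of that kind of periodic finite-difference gradient-ascent loop; the knob names, `toy_reward`, and all constants are illustrative assumptions, not AVAC's actual implementation.

```python
# Hypothetical sketch of a periodic gradient-ascent controller in the
# spirit of AVAC. The real controller's update rule, metrics, and
# parameter names are not specified in the abstract; `toy_reward` is a
# stand-in for a measured performance signal.

def controller_step(params, reward_fn, lr=0.5, eps=1):
    """One finite-difference gradient-ascent update over integer knobs
    such as a wait-buffer size and a batch size."""
    base = reward_fn(params)
    new_params = dict(params)
    for key, value in params.items():
        probe = dict(params)
        probe[key] = value + eps                       # perturb one knob
        grad = (reward_fn(probe) - base) / eps         # finite-difference slope
        new_params[key] = max(1, round(value + lr * grad))  # ascend, keep >= 1
    return new_params

# Toy reward surface peaking at wait_buffer=8, batch=32 (illustrative only).
def toy_reward(p):
    return -(p["wait_buffer"] - 8) ** 2 - (p["batch"] - 32) ** 2

params = {"wait_buffer": 4, "batch": 16}
for _ in range(20):                                    # periodic update epochs
    params = controller_step(params, toy_reward)
print(params)  # → {'wait_buffer': 8, 'batch': 32}
```

In a real deployment, `toy_reward` would be replaced by measurements collected over the previous control interval, and one `controller_step` would run per interval rather than in a tight loop.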
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this challenge by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Enhancing User Experience in On-Device Machine Learning with Gated Compression Layers [0.0]
On-device machine learning (ODML) enables powerful edge applications, but power consumption remains a key challenge for resource-constrained devices.
This work focuses on the use of Gated Compression (GC) layer to enhance ODML model performance while conserving power.
GC layers dynamically regulate data flow by selectively gating activations of neurons within the neural network and effectively filtering out non-essential inputs.
arXiv Detail & Related papers (2024-05-02T21:18:06Z)
- Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment [0.0]
This study examines the effects of quantization, memory demands, and energy consumption on the performance of various ASR model inference on the NVIDIA Jetson Orin Nano.
We found that changing precision from fp32 to fp16 halves the energy consumption for audio transcription across different models, with minimal performance degradation.
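As a rough illustration of the precision change mentioned above (not the study's actual measurement setup, which profiled ASR inference on a Jetson Orin Nano), this NumPy sketch shows the halved memory footprint of fp16 weights, one contributor to such energy savings:

```python
import numpy as np

# Illustrative only: casting a weight matrix from fp32 to fp16 halves
# its memory footprint. Energy savings on real hardware additionally
# depend on the device's fp16 datapaths and memory traffic.
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal((1024, 1024)).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4194304 bytes (4 MiB)
print(weights_fp16.nbytes)  # 2097152 bytes (2 MiB)

# Quantization error introduced by the cast is small for typical weights.
max_err = float(np.abs(weights_fp32 - weights_fp16.astype(np.float32)).max())
```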
arXiv Detail & Related papers (2024-05-02T05:09:07Z)
- Simulating Battery-Powered TinyML Systems Optimised using Reinforcement Learning in Image-Based Anomaly Detection [0.0]
This work extends and contributes to TinyML research by optimising battery-powered image-based anomaly detection Internet of Things (IoT) systems.
The proposed solution can be deployed to resource-constrained hardware, given its low memory footprint of 800 B.
This further facilitates the real-world deployment of such systems, including key sectors such as smart agriculture.
arXiv Detail & Related papers (2024-03-08T07:09:56Z)
- etuner: A Redundancy-Aware Framework for Efficient Continual Learning Application on Edge Devices [47.365775210055396]
We propose ETuner, an efficient edge continual learning framework that optimizes inference accuracy, fine-tuning execution time, and energy efficiency.
Experimental results show that, on average, ETuner reduces overall fine-tuning execution time by 64%, energy consumption by 56%, and improves average inference accuracy by 1.75% over the immediate model fine-tuning approach.
arXiv Detail & Related papers (2024-01-30T02:41:05Z)
- TransCODE: Co-design of Transformers and Accelerators for Efficient Training and Inference [6.0093441900032465]
We propose a framework that simulates transformer inference and training on a design space of accelerators.
We use this simulator in conjunction with the proposed co-design technique, called TransCODE, to obtain the best-performing models.
The obtained transformer-accelerator pair achieves 0.3% higher accuracy than the state-of-the-art pair.
arXiv Detail & Related papers (2023-03-27T02:45:18Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- LEAF + AIO: Edge-Assisted Energy-Aware Object Detection for Mobile Augmented Reality [77.00418462388525]
Mobile augmented reality (MAR) applications are significantly energy-guzzling.
We design an edge-based energy-aware MAR system that enables MAR devices to dynamically change their configurations.
Our proposed dynamic MAR configuration adaptations can minimize the per frame energy consumption of multiple MAR clients.
arXiv Detail & Related papers (2022-05-27T06:11:50Z)
- Hardware-Robust In-RRAM-Computing for Object Detection [0.15113576014047125]
In-RRAM computing suffers from large device variation and numerous nonideal effects in hardware.
This paper proposes a joint hardware and software optimization strategy to design a hardware-robust IRC macro for object detection.
The proposed approach has been successfully applied to a complex object detection task with only 3.85% mAP drop.
arXiv Detail & Related papers (2022-05-09T01:46:24Z)
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
- SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, leading to external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.