Intermittent Inference with Nonuniformly Compressed Multi-Exit Neural
Network for Energy Harvesting Powered Devices
- URL: http://arxiv.org/abs/2004.11293v2
- Date: Thu, 23 Jul 2020 17:18:03 GMT
- Title: Intermittent Inference with Nonuniformly Compressed Multi-Exit Neural
Network for Energy Harvesting Powered Devices
- Authors: Yawen Wu, Zhepeng Wang, Zhenge Jia, Yiyu Shi, Jingtong Hu
- Abstract summary: This work aims to enable persistent, event-driven sensing and decision capabilities for energy-harvesting (EH)-powered devices.
We developed a power trace-aware and exit-guided network compression algorithm to compress and deploy multi-exit neural networks to EH-powered microcontrollers.
Experiments show superior accuracy and latency compared with state-of-the-art techniques.
- Score: 17.165614326127287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work aims to enable persistent, event-driven sensing and decision
capabilities for energy-harvesting (EH)-powered devices by deploying
lightweight DNNs onto EH-powered devices. However, harvested energy is usually
weak and unpredictable and even lightweight DNNs take multiple power cycles to
finish one inference. To eliminate the indefinitely long wait to accumulate
energy for one inference and to optimize the accuracy, we developed a power
trace-aware and exit-guided network compression algorithm to compress and
deploy multi-exit neural networks to EH-powered microcontrollers (MCUs) and
select exits during execution according to available energy. The experimental
results show superior accuracy and latency compared with state-of-the-art
techniques.
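The idea of selecting an exit according to available energy can be illustrated with a minimal sketch. This is not the paper's selection policy: it is a simple greedy rule that picks the most accurate exit whose cumulative energy cost fits the current budget. The `Exit` type, the `select_exit` function, and all energy and accuracy figures are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Exit:
    """One exit of a multi-exit network (all values are hypothetical)."""
    name: str
    energy_mj: float  # assumed cumulative energy needed to reach this exit
    accuracy: float   # assumed validation accuracy when stopping here


def select_exit(exits: Sequence[Exit], available_mj: float) -> Optional[Exit]:
    """Greedy rule: return the most accurate exit that fits the energy budget.

    Returns None when no exit is affordable, meaning the device would have
    to keep harvesting before it can produce any result.
    """
    affordable = [e for e in exits if e.energy_mj <= available_mj]
    if not affordable:
        return None
    return max(affordable, key=lambda e: e.accuracy)


# Illustrative exits: deeper exits cost more energy and are more accurate.
EXITS = [
    Exit("exit1", energy_mj=0.5, accuracy=0.62),
    Exit("exit2", energy_mj=1.2, accuracy=0.71),
    Exit("exit3", energy_mj=2.0, accuracy=0.76),
]

if __name__ == "__main__":
    for budget in (0.1, 1.5, 5.0):
        chosen = select_exit(EXITS, budget)
        print(budget, chosen.name if chosen else "wait for more energy")
```

A greedy rule like this ignores future energy arrivals; a policy that accounts for the power trace could instead choose to wait or to resume from an earlier exit.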
Related papers
- Energy-Aware Dynamic Neural Inference [39.04688735618206]
We introduce an on-device adaptive inference system equipped with an energy-harvester and finite-capacity energy storage.
We show that, as the rate of the ambient energy increases, energy- and confidence-aware control schemes show approximately 5% improvement in accuracy.
We derive a principled policy with theoretical guarantees for confidence-aware and -agnostic controllers.
arXiv Detail & Related papers (2024-11-04T16:51:22Z) - Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision
Quantization [1.0235078178220354]
We propose an automated framework to compress Deep Neural Networks (DNNs) in a hardware-aware manner by jointly employing pruning and quantization.
Our framework achieves 39% average energy reduction at 1.7% average accuracy loss across the evaluated datasets and significantly outperforms the state-of-the-art approaches.
arXiv Detail & Related papers (2023-12-23T18:50:13Z) - PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices [10.01838504586422]
The continuous operation of ML-powered systems leads to significant energy use during inference.
This paper investigates how the configuration of on-device hardware elements, such as GPU, memory, and CPU frequency, affects energy consumption for NN inference with regular fine-tuning.
We propose PolyThrottle, a solution that optimizes configurations across individual hardware components using Constrained Bayesian Optimization in an energy-conserving manner.
arXiv Detail & Related papers (2023-10-30T20:19:41Z) - Multiagent Reinforcement Learning with an Attention Mechanism for
Improving Energy Efficiency in LoRa Networks [52.96907334080273]
As the network scale increases, the energy efficiency of LoRa networks decreases sharply due to severe packet collisions.
We propose a transmission parameter allocation algorithm based on multiagent reinforcement learning (MALoRa).
Simulation results demonstrate that MALoRa significantly improves the system EE compared with baseline algorithms.
arXiv Detail & Related papers (2023-09-16T11:37:23Z) - Multi-Objective Optimization for UAV Swarm-Assisted IoT with Virtual
Antenna Arrays [55.736718475856726]
Unmanned aerial vehicle (UAV) networks are a promising technology for assisting the Internet-of-Things (IoT).
Existing UAV-assisted data harvesting and dissemination schemes require UAVs to frequently fly between the IoTs and access points.
We introduce collaborative beamforming into IoTs and UAVs simultaneously to achieve energy and time-efficient data harvesting and dissemination.
arXiv Detail & Related papers (2023-08-03T02:49:50Z) - PowerPruning: Selecting Weights and Activations for Power-Efficient
Neural Network Acceleration [8.72556779535502]
We propose a novel method to reduce power consumption in digital neural network accelerators by selecting weights that lead to less power consumption in MAC operations.
Together with retraining, the proposed method can reduce power consumption of DNNs on hardware by up to 78.3% with only a slight accuracy loss.
arXiv Detail & Related papers (2023-03-24T13:52:07Z) - Distributed Energy Management and Demand Response in Smart Grids: A
Multi-Agent Deep Reinforcement Learning Framework [53.97223237572147]
This paper presents a multi-agent Deep Reinforcement Learning (DRL) framework for autonomous control and integration of renewable energy resources into smart power grid systems.
In particular, the proposed framework jointly considers demand response (DR) and distributed energy management (DEM) for residential end-users.
arXiv Detail & Related papers (2022-11-29T01:18:58Z) - Enabling Super-Fast Deep Learning on Tiny Energy-Harvesting IoT Devices [3.070669432211866]
Energy harvesting devices operate intermittently without batteries.
Implementing memory-intensive algorithms on EH devices is extremely difficult due to limited resources and intermittent power supply.
This paper proposes a methodology that enables super-fast deep learning with low-energy accelerators for tiny energy harvesting devices.
arXiv Detail & Related papers (2021-11-28T04:55:41Z) - Energy-Efficient Model Compression and Splitting for Collaborative
Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and CO$_2$ emission compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z) - InstantNet: Automated Generation and Deployment of Instantaneously
Switchable-Precision Networks [65.78061366594106]
We propose InstantNet to automatically generate and deploy instantaneously switchable-precision networks which operate at variable bit-widths.
In experiments, the proposed InstantNet consistently outperforms state-of-the-art designs.
arXiv Detail & Related papers (2021-04-22T04:07:43Z) - EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware
Multi-Task NLP Inference [82.1584439276834]
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
We present EdgeBERT, an in-depth algorithm-hardware co-design for latency-aware energy optimization for multi-task NLP.
arXiv Detail & Related papers (2020-11-28T19:21:47Z)