The Effects of Partitioning Strategies on Energy Consumption in
Distributed CNN Inference at The Edge
- URL: http://arxiv.org/abs/2210.08392v1
- Date: Sat, 15 Oct 2022 22:54:02 GMT
- Authors: Erqian Tang, Xiaotian Guo, Todor Stefanov
- Abstract summary: Many AI applications require Convolutional Neural Network (CNN) inference on a distributed system at the edge.
There are four main partitioning strategies that can be utilized to partition a large CNN model and perform distributed CNN inference on multiple devices at the edge.
In this paper, we investigate and compare the per-device energy consumption of CNN model inference at the edge on a distributed system when the four partitioning strategies are utilized.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Nowadays, many AI applications utilizing resource-constrained edge devices
(e.g., small mobile robots, tiny IoT devices, etc.) require Convolutional
Neural Network (CNN) inference on a distributed system at the edge due to
limited resources of a single edge device to accommodate and execute a large
CNN. There are four main partitioning strategies that can be utilized to
partition a large CNN model and perform distributed CNN inference on multiple
devices at the edge. However, to the best of our knowledge, no research has
been conducted to investigate how these four partitioning strategies affect the
energy consumption per edge device. Such an investigation is important because
it will reveal the potential of these partitioning strategies to be used
effectively for reduction of the per-device energy consumption when a large CNN
model is deployed for distributed inference at the edge. Therefore, in this
paper, we investigate and compare the per-device energy consumption of CNN
model inference at the edge on a distributed system when the four partitioning
strategies are utilized. The goal of our investigation and comparison is to
find out which partitioning strategies (and under what conditions) have the
highest potential to decrease the energy consumption per edge device when CNN
inference is performed at the edge on a distributed system.
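The abstract does not enumerate the four partitioning strategies, but one common building block they share is splitting a CNN's layers into contiguous blocks assigned to different edge devices. The following is a minimal illustrative sketch of such a layer-wise split; the layer count and device count are assumptions for demonstration, not the configuration studied in the paper:

```python
# Illustrative sketch: sequential layer-wise partitioning of a CNN
# across edge devices. Layer and device counts are made-up examples.

def partition_layers(num_layers, num_devices):
    """Assign each device a contiguous block of layers,
    balancing block sizes as evenly as possible."""
    base, extra = divmod(num_layers, num_devices)
    partitions, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        partitions.append(list(range(start, start + size)))
        start += size
    return partitions

# Example: an 18-layer CNN distributed over 4 edge devices.
print(partition_layers(18, 4))
# -> [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13], [14, 15, 16, 17]]
# Each device runs its block and forwards the intermediate
# activations to the next device in the pipeline.
```

In a real deployment, the choice of split points would also account for per-layer compute cost and activation sizes, since those drive both the per-device energy and the communication overhead the paper compares.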
Related papers
- Reducing Inference Energy Consumption Using Dual Complementary CNNs [13.783950035836593]
We propose a novel approach to reduce the energy requirements of inference of CNNs.
We employ two small Complementary CNNs that collaborate with each other by covering each other's "weaknesses" in predictions.
Our experiments on a Jetson Nano computer demonstrate an energy reduction of up to 85.8% achieved on modified datasets where each sample was duplicated once.
arXiv Detail & Related papers (2024-12-02T01:46:07Z)
- I-SplitEE: Image classification in Split Computing DNNs with Early Exits [5.402030962296633]
Large size of Deep Neural Networks (DNNs) hinders deploying them on resource-constrained devices like edge, mobile, and IoT platforms.
Our work presents an innovative unified approach merging early exits and split computing.
I-SplitEE is an online unsupervised algorithm ideal for scenarios lacking ground truths and with sequential data.
arXiv Detail & Related papers (2024-01-19T07:44:32Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing network.
We propose a novel collaborative edge computing using fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
arXiv Detail & Related papers (2022-05-23T12:35:18Z)
- SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments [13.864161788250856]
This work proposes an AI-driven online policy, SplitPlace, that uses Multi-Armed-Bandits to intelligently decide between layer and semantic splitting strategies.
SplitPlace places such neural network split fragments on mobile edge devices using decision-aware reinforcement learning.
Our experiments show that SplitPlace can significantly improve on the state of the art in average response time, deadline violation rate, inference accuracy, and total reward by up to 46%, 69%, 3%, and 12%, respectively.
arXiv Detail & Related papers (2022-05-21T16:24:47Z)
- Survey on Large Scale Neural Network Training [48.424512364338746]
Modern Deep Neural Networks (DNNs) require significant memory to store weights, activations, and other intermediate tensors during training.
This survey provides a systematic overview of the approaches that enable more efficient DNNs training.
arXiv Detail & Related papers (2022-02-21T18:48:02Z)
- Communication-Efficient Separable Neural Network for Distributed Inference on Edge Devices [2.28438857884398]
We propose a novel method of exploiting model parallelism to separate a neural network for distributed inferences.
Under proper specifications of devices and configurations of models, our experiments show that the inference of large neural networks on edge clusters can be distributed and accelerated.
arXiv Detail & Related papers (2021-11-03T19:30:28Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and $CO_2$ emissions compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
- CoEdge: Cooperative DNN Inference with Adaptive Workload Partitioning over Heterogeneous Edge Devices [39.09319776243573]
CoEdge is a distributed Deep Neural Network (DNN) computing system that orchestrates cooperative inference over heterogeneous edge devices.
CoEdge saves energy while keeping inference latency close to the baseline, achieving a 25.5% to 66.9% energy reduction for four widely adopted CNN models.
arXiv Detail & Related papers (2020-12-06T13:15:52Z)
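Several of the entries above (Dynamic Split Computing, the model compression and splitting work) choose a split point based on the state of the communication channel. A minimal sketch of that idea follows; the cost model, function name, and all numeric values are illustrative assumptions, not the methods or measurements of those papers:

```python
# Illustrative sketch of dynamic split-point selection: run the first
# `split` layers on the device, transmit the activation, and run the
# rest on the server. All numbers below are made-up examples.

def best_split(device_time, server_time, activation_bits, data_rate_bps):
    """Return the split index (0..n) that minimizes estimated
    end-to-end latency: on-device compute + transfer + server compute.
    Times are in seconds; activation_bits[i] is the size of the
    tensor entering layer i (split == n means fully on-device)."""
    n = len(device_time)
    best_idx, best_latency = 0, float("inf")
    for split in range(n + 1):
        latency = (
            sum(device_time[:split])                                # on-device layers
            + (activation_bits[split] / data_rate_bps if split < n else 0.0)
            + sum(server_time[split:])                              # offloaded layers
        )
        if latency < best_latency:
            best_idx, best_latency = split, latency
    return best_idx, best_latency

dev = [0.005, 0.005, 0.005]   # seconds per layer on the edge device
srv = [0.001, 0.001, 0.001]   # seconds per layer on the server
act = [8e6, 4e6, 1e6]         # bits entering each layer
print(best_split(dev, srv, act, data_rate_bps=1e9))  # fast link: offload early
print(best_split(dev, srv, act, data_rate_bps=1e6))  # slow link: stay on-device
```

The design choice is simply that a fast channel makes early offloading cheap, while a slow channel makes the transfer term dominate, pushing the split toward fully on-device execution.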
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.