The Effects of Partitioning Strategies on Energy Consumption in
Distributed CNN Inference at The Edge
- URL: http://arxiv.org/abs/2210.08392v1
- Date: Sat, 15 Oct 2022 22:54:02 GMT
- Authors: Erqian Tang, Xiaotian Guo, Todor Stefanov
- Abstract summary: Many AI applications require Convolutional Neural Network (CNN) inference on a distributed system at the edge.
There are four main partitioning strategies that can be utilized to partition a large CNN model and perform distributed CNN inference on multiple devices at the edge.
In this paper, we investigate and compare the per-device energy consumption of CNN model inference at the edge on a distributed system when the four partitioning strategies are utilized.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Nowadays, many AI applications utilizing resource-constrained edge devices
(e.g., small mobile robots, tiny IoT devices, etc.) require Convolutional
Neural Network (CNN) inference on a distributed system at the edge due to
limited resources of a single edge device to accommodate and execute a large
CNN. There are four main partitioning strategies that can be utilized to
partition a large CNN model and perform distributed CNN inference on multiple
devices at the edge. However, to the best of our knowledge, no research has
been conducted to investigate how these four partitioning strategies affect the
energy consumption per edge device. Such an investigation is important because
it will reveal the potential of these partitioning strategies to be used
effectively for reduction of the per-device energy consumption when a large CNN
model is deployed for distributed inference at the edge. Therefore, in this
paper, we investigate and compare the per-device energy consumption of CNN
model inference at the edge on a distributed system when the four partitioning
strategies are utilized. The goal of our investigation and comparison is to
find out which partitioning strategies (and under what conditions) have the
highest potential to decrease the energy consumption per edge device when CNN
inference is performed at the edge on a distributed system.
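The abstract does not enumerate the four partitioning strategies, but one common building block they share is splitting a CNN's layers into contiguous blocks assigned to different edge devices. The following is a minimal illustrative sketch of such a layer-wise split; the layer count and device count are assumptions for demonstration, not the configuration studied in the paper:

```python
# Illustrative sketch: sequential layer-wise partitioning of a CNN
# across edge devices. Layer and device counts are made-up examples.

def partition_layers(num_layers, num_devices):
    """Assign each device a contiguous block of layers,
    balancing block sizes as evenly as possible."""
    base, extra = divmod(num_layers, num_devices)
    partitions, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        partitions.append(list(range(start, start + size)))
        start += size
    return partitions

# Example: an 18-layer CNN distributed over 4 edge devices.
print(partition_layers(18, 4))
# -> [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13], [14, 15, 16, 17]]
# Each device runs its block and forwards the intermediate
# activations to the next device in the pipeline.
```

In a real deployment, the choice of split points would also account for per-layer compute cost and activation sizes, since those drive both the per-device energy and the communication overhead the paper compares.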
Related papers
- Reducing Inference Energy Consumption Using Dual Complementary CNNs [13.783950035836593]
We propose a novel approach to reduce the energy requirements of inference of CNNs.
We employ two small Complementary CNNs that collaborate with each other by covering each other's "weaknesses" in predictions.
Our experiments on a Jetson Nano computer demonstrate an energy reduction of up to 85.8% achieved on modified datasets where each sample was duplicated once.
arXiv Detail & Related papers (2024-12-02T01:46:07Z)
- I-SplitEE: Image classification in Split Computing DNNs with Early Exits [5.402030962296633]
Large size of Deep Neural Networks (DNNs) hinders deploying them on resource-constrained devices like edge, mobile, and IoT platforms.
Our work presents an innovative unified approach merging early exits and split computing.
I-SplitEE is an online unsupervised algorithm ideal for scenarios lacking ground truths and with sequential data.
arXiv Detail & Related papers (2024-01-19T07:44:32Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing network.
We propose a novel collaborative edge computing using fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
arXiv Detail & Related papers (2022-05-23T12:35:18Z)
- SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments [13.864161788250856]
This work proposes an AI-driven online policy, SplitPlace, that uses Multi-Armed-Bandits to intelligently decide between layer and semantic splitting strategies.
SplitPlace places such neural network split fragments on mobile edge devices using decision-aware reinforcement learning.
Our experiments show that SplitPlace can significantly improve on the state of the art in average response time, deadline violation rate, inference accuracy, and total reward by up to 46%, 69%, 3%, and 12%, respectively.
arXiv Detail & Related papers (2022-05-21T16:24:47Z)
- Survey on Large Scale Neural Network Training [48.424512364338746]
Modern Deep Neural Networks (DNNs) require significant memory to store weights, activations, and other intermediate tensors during training.
This survey provides a systematic overview of the approaches that enable more efficient DNNs training.
arXiv Detail & Related papers (2022-02-21T18:48:02Z)
- Communication-Efficient Separable Neural Network for Distributed Inference on Edge Devices [2.28438857884398]
We propose a novel method of exploiting model parallelism to separate a neural network for distributed inferences.
Under proper specifications of devices and configurations of models, our experiments show that the inference of large neural networks on edge clusters can be distributed and accelerated.
arXiv Detail & Related papers (2021-11-03T19:30:28Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and $CO_2$ emissions compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
- CoEdge: Cooperative DNN Inference with Adaptive Workload Partitioning over Heterogeneous Edge Devices [39.09319776243573]
CoEdge is a distributed Deep Neural Network (DNN) computing system that orchestrates cooperative inference over heterogeneous edge devices.
CoEdge saves energy while keeping inference latency close to the baseline, achieving a 25.5% to 66.9% energy reduction for four widely adopted CNN models.
arXiv Detail & Related papers (2020-12-06T13:15:52Z)
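Several of the entries above (Dynamic Split Computing, the model compression and splitting work) choose a split point based on the state of the communication channel. A minimal sketch of that idea follows; the cost model, function name, and all numeric values are illustrative assumptions, not the methods or measurements of those papers:

```python
# Illustrative sketch of dynamic split-point selection: run the first
# `split` layers on the device, transmit the activation, and run the
# rest on the server. All numbers below are made-up examples.

def best_split(device_time, server_time, activation_bits, data_rate_bps):
    """Return the split index (0..n) that minimizes estimated
    end-to-end latency: on-device compute + transfer + server compute.
    Times are in seconds; activation_bits[i] is the size of the
    tensor entering layer i (split == n means fully on-device)."""
    n = len(device_time)
    best_idx, best_latency = 0, float("inf")
    for split in range(n + 1):
        latency = (
            sum(device_time[:split])                                # on-device layers
            + (activation_bits[split] / data_rate_bps if split < n else 0.0)
            + sum(server_time[split:])                              # offloaded layers
        )
        if latency < best_latency:
            best_idx, best_latency = split, latency
    return best_idx, best_latency

dev = [0.005, 0.005, 0.005]   # seconds per layer on the edge device
srv = [0.001, 0.001, 0.001]   # seconds per layer on the server
act = [8e6, 4e6, 1e6]         # bits entering each layer
print(best_split(dev, srv, act, data_rate_bps=1e9))  # fast link: offload early
print(best_split(dev, srv, act, data_rate_bps=1e6))  # slow link: stay on-device
```

The design choice is simply that a fast channel makes early offloading cheap, while a slow channel makes the transfer term dominate, pushing the split toward fully on-device execution.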
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.