AutoDiCE: Fully Automated Distributed CNN Inference at the Edge
- URL: http://arxiv.org/abs/2207.12113v1
- Date: Wed, 20 Jul 2022 15:08:52 GMT
- Title: AutoDiCE: Fully Automated Distributed CNN Inference at the Edge
- Authors: Xiaotian Guo and Andy D. Pimentel and Todor Stefanov
- Abstract summary: We propose a novel framework, called AutoDiCE, for automated splitting of a CNN model into a set of sub-models.
Our experimental results show that AutoDiCE can deliver distributed CNN inference with reduced energy consumption and memory usage per edge device.
- Score: 0.9883261192383613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning approaches based on Convolutional Neural Networks (CNNs) are
extensively utilized and very successful in a wide range of application areas,
including image classification and speech recognition. For the execution of
trained CNNs, i.e. model inference, we nowadays witness a shift from the Cloud
to the Edge. Unfortunately, deploying and inferring large, compute and memory
intensive CNNs on edge devices is challenging because these devices typically
have limited power budgets and compute/memory resources. One approach to
address this challenge is to leverage all available resources across multiple
edge devices to deploy and execute a large CNN by properly partitioning the CNN
and running each CNN partition on a separate edge device. Although such
distribution, deployment, and execution of large CNNs on multiple edge devices
is a desirable and beneficial approach, there currently does not exist a design
and programming framework that takes a trained CNN model, together with a CNN
partitioning specification, and fully automates the CNN model splitting and
deployment on multiple edge devices to facilitate distributed CNN inference at
the Edge. Therefore, in this paper, we propose a novel framework, called
AutoDiCE, for automated splitting of a CNN model into a set of sub-models and
automated code generation for distributed and collaborative execution of these
sub-models on multiple, possibly heterogeneous, edge devices, while supporting
the exploitation of parallelism among and within the edge devices. Our
experimental results show that AutoDiCE can deliver distributed CNN inference
with reduced energy consumption and memory usage per edge device, and improved
overall system throughput at the same time.
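To make the splitting step concrete: given a trained CNN in ONNX format, the model can be cut at an intermediate tensor into two sub-models using standard ONNX utilities. The sketch below only illustrates the partitioning idea and is not AutoDiCE itself; the model file, tensor names, and output paths (alexnet.onnx, data, pool5_out, prob) are hypothetical.

```python
# Illustrative sketch: split a trained ONNX CNN at a chosen intermediate
# tensor into two sub-models, one per edge device. All names are hypothetical.
import onnx.utils

MODEL = "alexnet.onnx"   # hypothetical trained CNN in ONNX format
CUT = "pool5_out"        # hypothetical intermediate tensor at the cut point

# Sub-model 1: from the network input up to the cut tensor.
onnx.utils.extract_model(MODEL, "part1.onnx",
                         input_names=["data"], output_names=[CUT])

# Sub-model 2: from the cut tensor to the final classification output.
onnx.utils.extract_model(MODEL, "part2.onnx",
                         input_names=[CUT], output_names=["prob"])

# At run time, device A executes part1.onnx and ships the tensor named CUT
# over the network to device B, which feeds it into part2.onnx, forming a
# two-stage inference pipeline across edge devices.
```

AutoDiCE automates not only this splitting step (driven by a partitioning specification) but also the generation of the per-device inference and communication code that the snippet leaves out.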
Related papers
- Model Parallel Training and Transfer Learning for Convolutional Neural Networks by Domain Decomposition [0.0]
Deep convolutional neural networks (CNNs) have been shown to be very successful in a wide range of image processing applications.
Due to their growing number of model parameters and the increasing availability of large amounts of training data, efficient parallelization strategies for training complex CNNs are necessary.
arXiv Detail & Related papers (2024-08-26T17:35:01Z)
- OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve.
We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap.
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal computational cost.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
- Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs to compress intermediate data by selecting the most informative features (a minimal sketch of this channel-selection idea appears after this list).
In the semantic decoder, we design a lightweight decoder that reconstructs the intermediate data by learning from the received compressed data, improving accuracy.
arXiv Detail & Related papers (2024-01-19T15:19:47Z)
- DietCNN: Multiplication-free Inference for Quantized CNNs [9.295702629926025]
This paper proposes a new method for replacing the multiplications in a CNN with table look-ups (see the toy sketch after this list).
It is shown that the proposed multiplication-free CNN, based on a single activation codebook, can achieve 4.7x, 5.6x, and 3.5x reductions in energy per inference.
arXiv Detail & Related papers (2023-05-09T08:54:54Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- Towards a General Purpose CNN for Long Range Dependencies in $\mathrm{N}$D [49.57261544331683]
We propose a single CNN architecture equipped with continuous convolutional kernels for tasks on arbitrary resolution, dimensionality and length without structural changes.
We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential (1$\mathrm{D}$) and visual data (2$\mathrm{D}$).
Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.
arXiv Detail & Related papers (2022-06-07T15:48:02Z)
- EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- MGIC: Multigrid-in-Channels Neural Network Architectures [8.459177309094688]
We present a multigrid-in-channels approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs.
arXiv Detail & Related papers (2020-11-17T11:29:10Z)
- How Secure is Distributed Convolutional Neural Network on IoT Edge Devices? [0.0]
We propose Trojan attacks on CNNs deployed across different nodes of a distributed edge network.
These attacks are tested on deep learning models (LeNet, AlexNet).
arXiv Detail & Related papers (2020-06-16T16:10:09Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
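As promised in the AECNN entries above, here is a minimal PyTorch sketch of compressing intermediate feature maps by channel attention: score every channel, then transmit only the top-k most informative ones. The squeeze-and-excitation-style scorer and the fixed top-k rule are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ChannelAttentionCompressor(nn.Module):
    """Scores channels with a squeeze-and-excitation-style attention block
    and keeps only the top-k most informative ones (illustrative sketch of
    the AECNN-style semantic encoder, not the authors' code)."""
    def __init__(self, channels: int, keep: int, reduction: int = 16):
        super().__init__()
        self.keep = keep
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),             # squeeze: B x C x 1 x 1
            nn.Flatten(),                        # -> B x C
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                        # per-channel importance in (0, 1)
        )

    def forward(self, x: torch.Tensor):
        w = self.score(x)                        # B x C importance scores
        idx = w.topk(self.keep, dim=1).indices   # top-k channel indices
        # Keep only the selected channels; (compressed, idx) is what would be
        # sent over the network instead of the full feature map.
        compressed = torch.gather(
            x, 1, idx[:, :, None, None].expand(-1, -1, *x.shape[2:]))
        return compressed, idx

feats = torch.randn(1, 256, 14, 14)              # dummy intermediate features
enc = ChannelAttentionCompressor(channels=256, keep=64)
small, idx = enc(feats)                          # 4x fewer channels to transmit
```

A matching lightweight decoder on the edge server would scatter the received channels back into a zero tensor (using idx) and learn to reconstruct the missing ones.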
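The DietCNN entry replaces the multiplications in a CNN with look-ups into a product table precomputed from a single activation codebook. A toy NumPy sketch of the core idea follows; the uniform 16-entry codebook and the reuse of one codebook for weights are invented for illustration.

```python
import numpy as np

# Toy sketch of multiplication-free inference via table look-ups
# (the DietCNN idea in miniature; codebook details are invented).
K = 16
codebook = np.linspace(-1.0, 1.0, K).astype(np.float32)  # activation codebook

def quantize(x: np.ndarray) -> np.ndarray:
    """Map each value to the index of its nearest codebook entry."""
    return np.abs(x[..., None] - codebook).argmin(axis=-1)

# Precompute, once, the product of every pair of codebook entries:
# a K x K look-up table that replaces multiplication at inference time.
product_lut = np.outer(codebook, codebook)

def lut_dot(act_idx: np.ndarray, w_idx: np.ndarray) -> float:
    """Dot product computed purely with table look-ups and additions."""
    return float(product_lut[act_idx, w_idx].sum())

x = np.random.randn(32).astype(np.float32) * 0.5
w = np.random.randn(32).astype(np.float32) * 0.5
print("LUT dot:", lut_dot(quantize(x), quantize(w)))
print("exact dot:", float(x @ w))
```

Because each product is fetched from the table rather than computed, the multiplier circuitry sits idle at inference time, which is the source of the energy savings the entry reports.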