HierTrain: Fast Hierarchical Edge AI Learning with Hybrid Parallelism in
Mobile-Edge-Cloud Computing
- URL: http://arxiv.org/abs/2003.09876v1
- Date: Sun, 22 Mar 2020 12:40:06 GMT
- Title: HierTrain: Fast Hierarchical Edge AI Learning with Hybrid Parallelism in
Mobile-Edge-Cloud Computing
- Authors: Deyin Liu and Xu Chen and Zhi Zhou and Qing Ling
- Abstract summary: We propose HierTrain, a hierarchical edge AI learning framework, which efficiently deploys the DNN training task over the hierarchical MECC architecture.
We show that HierTrain can achieve up to 6.9x speedup compared to the cloud-based hierarchical training approach.
- Score: 36.40138484917463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, deep neural networks (DNNs) are the core enablers for many emerging
edge AI applications. Conventional approaches to training DNNs are generally
implemented at central servers or cloud centers for centralized learning, which
is typically time-consuming and resource-demanding due to the transmission of a
large number of data samples from the device to the remote cloud. To overcome
these disadvantages, we consider accelerating the learning process of DNNs on
the Mobile-Edge-Cloud Computing (MECC) paradigm. In this paper, we propose
HierTrain, a hierarchical edge AI learning framework, which efficiently deploys
the DNN training task over the hierarchical MECC architecture. We develop a
novel hybrid parallelism method, which is the key to HierTrain, to
adaptively assign the DNN model layers and the data samples across the three
levels of edge device, edge server and cloud center. We then formulate the
problem of scheduling the DNN training tasks at both layer-granularity and
sample-granularity. Solving this optimization problem enables us to achieve the
minimum training time. We further implement a hardware prototype consisting of
an edge device, an edge server and a cloud server, and conduct extensive
experiments on it. Experimental results demonstrate that HierTrain can achieve
up to 6.9x speedup compared to the cloud-based hierarchical training approach.
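The abstract's two scheduling ideas, layer granularity and sample granularity, can be illustrated with a toy search. Below is a minimal Python sketch, not the authors' implementation: it assumes a made-up three-stage latency model (per-tier layer speeds, uplink costs, and a replicated front stage whose mini-batch is split between device and edge) and brute-forces the layer split points and sample share that minimize the estimated per-iteration training time. All constants and names are illustrative assumptions; the paper instead formulates and solves a formal scheduling optimization.

```python
# Toy sketch only -- NOT the HierTrain implementation. It illustrates
# searching over a layer-granularity model split (k1, k2) and a
# sample-granularity batch split (dev_share) to minimize an estimated
# per-iteration training time. All constants are made-up placeholders.

N_LAYERS, BATCH = 8, 64

# Assumed per-sample, per-layer compute cost (ms) on each tier.
SPEED = {"device": 4.0, "edge": 1.0, "cloud": 0.25}
# Assumed per-sample transfer cost (ms) across each uplink.
LINK = {("device", "edge"): 0.5, ("edge", "cloud"): 2.0}

def iteration_time(k1, k2, dev_share):
    """Estimated time (ms) of one iteration when layers [0, k1) are
    replicated on device and edge (each handling part of the batch),
    layers [k1, k2) run on the edge, and layers [k2, N_LAYERS) run on
    the cloud. Data is assumed to originate on the device."""
    n_dev = int(BATCH * dev_share)
    n_edge = BATCH - n_dev
    up = LINK[("device", "edge")]
    # Front stage (sample granularity): the device computes its share
    # of the front layers and uploads activations, while the remaining
    # samples' raw inputs are uploaded and computed on the edge. The
    # two paths overlap, so the stage takes the slower of the two.
    front = max(n_dev * k1 * SPEED["device"] + n_dev * up,
                n_edge * up + n_edge * k1 * SPEED["edge"])
    # Middle stage (layer granularity): the edge processes the whole
    # batch through layers [k1, k2).
    middle = BATCH * (k2 - k1) * SPEED["edge"]
    # Back stage: ship activations to the cloud, finish the rest there.
    back = BATCH * LINK[("edge", "cloud")] \
         + BATCH * (N_LAYERS - k2) * SPEED["cloud"]
    return front + middle + back

# Brute-force the schedule that minimizes the estimated iteration time.
candidates = ((k1, k2, s / 10)
              for k1 in range(N_LAYERS + 1)
              for k2 in range(k1, N_LAYERS + 1)
              for s in range(11))
best = min(candidates, key=lambda c: iteration_time(*c))
print("best (k1, k2, device share):", best,
      "-> %.1f ms/iter" % iteration_time(*best))
```

Brute force suffices here because the toy search space is tiny (two split points and a handful of sample shares); the paper's contribution lies in formulating this scheduling problem for real DNNs with measured device, edge, and cloud costs and solving it to obtain the minimum training time.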
Related papers
- AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud
Registration [69.21282992341007]
AutoSynth automatically generates 3D training data for point cloud registration.
We replace the point cloud registration network with a much smaller surrogate network, leading to a 4056.43x speedup.
Our results on TUD-L, LINEMOD and Occluded-LINEMOD show that a neural network trained on our searched dataset yields consistently better performance than the same network trained on the widely used ModelNet40 dataset.
arXiv Detail & Related papers (2023-09-20T09:29:44Z) - Transferability of Convolutional Neural Networks in Stationary Learning
Tasks [96.00428692404354]
We introduce a novel framework for efficient training of convolutional neural networks (CNNs) for large-scale spatial problems.
We show that a CNN trained on small windows of such signals achieves nearly the same performance on much larger windows without retraining.
Our results show that the CNN is able to tackle problems with many hundreds of agents after being trained with fewer than ten.
arXiv Detail & Related papers (2023-07-21T13:51:45Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z) - Designing and Training of Lightweight Neural Networks on Edge Devices
using Early Halting in Knowledge Distillation [16.74710649245842]
This paper presents a novel approach for designing and training lightweight Deep Neural Networks (DNNs) on edge devices.
The approach considers the available storage, processing speed, and allowable maximum processing time.
We introduce a novel early halting technique, which preserves network resources.
arXiv Detail & Related papers (2022-09-30T16:18:24Z) - EffCNet: An Efficient CondenseNet for Image Classification on NXP
BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z) - 3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low BitwidthQuantization,
and Ultra-Low Latency Acceleration [8.419854797930668]
Deep neural network (DNN) based AI applications on the edge require both low-cost computing platforms and high-quality services.
This paper emphasizes the importance of training, quantization and accelerator design, and calls for more research breakthroughs in the area for AI on the edge.
arXiv Detail & Related papers (2021-05-11T03:22:30Z) - CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity
Edge Devices [3.812706195714961]
We build a prototype distributed system of Raspberry Pis communicating via WiFi running NeuroEvolutionary (NE) learning and inference.
We evaluate the performance of such a collaborative system and detail the compute/communication characteristics of different arrangements of the system.
arXiv Detail & Related papers (2020-08-27T01:49:21Z) - Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G
Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)