JMSNAS: Joint Model Split and Neural Architecture Search for Learning over Mobile Edge Networks
- URL: http://arxiv.org/abs/2111.08206v1
- Date: Tue, 16 Nov 2021 03:10:23 GMT
- Title: JMSNAS: Joint Model Split and Neural Architecture Search for Learning over Mobile Edge Networks
- Authors: Yuqing Tian, Zhaoyang Zhang, Zhaohui Yang, Qianqian Yang
- Abstract summary: Joint model split and neural architecture search (JMSNAS) framework is proposed to automatically generate and deploy a DNN model over a mobile edge network.
Considering both the computing and communication resource constraints, a computational graph search problem is formulated.
Experimental results confirm the superiority of the proposed framework over the state-of-the-art split machine learning design methods.
- Score: 23.230079759174902
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The main challenge in deploying a deep neural network (DNN) over a mobile edge network is how to split the DNN model so as to match the network architecture as well as every node's computation and communication capacity. This essentially involves two highly coupled procedures: model generation and model splitting. In this paper, a joint model split and neural architecture search (JMSNAS) framework is proposed to automatically generate and deploy a DNN model over a mobile edge network. Considering both the computing and communication resource constraints, a computational graph search problem is formulated to find the multi-split points of the DNN model, and the model is then trained to meet the accuracy requirement. Moreover, the trade-off between model accuracy and completion latency is achieved through proper design of the objective function. The experimental results confirm the superiority of the proposed framework over state-of-the-art split machine learning design methods.
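Concretely, the search can be thought of as scoring each candidate multi-split of the computational graph by the completion latency it induces across the nodes and links, and folding that latency into the training objective. Below is a minimal, hypothetical sketch of such a score for a chain of nodes; the segment, node, and link parameters and the weighting factor `lam` are illustrative assumptions rather than the paper's actual formulation.

```python
# Minimal sketch (not the paper's formulation): score a candidate multi-split
# of a DNN over a chain of edge nodes by its completion latency, then fold
# that latency into a single accuracy-latency objective. All parameter names
# (seg_flops, node_flops, link_bps, lam) are illustrative assumptions.
from typing import List


def completion_latency(seg_flops: List[float],      # FLOPs of each model segment
                       seg_out_bits: List[float],   # bits forwarded after each segment
                       node_flops: List[float],     # compute speed of each node (FLOP/s)
                       link_bps: List[float]) -> float:  # rate of the link leaving each node (bit/s)
    """Latency of a pipeline where segment i runs on node i and forwards its
    output to node i+1; the last segment sends nothing."""
    assert len(seg_flops) == len(node_flops) == len(seg_out_bits)
    total = 0.0
    for i, flops in enumerate(seg_flops):
        total += flops / node_flops[i]                 # computation time on node i
        if i < len(seg_flops) - 1:
            total += seg_out_bits[i] / link_bps[i]     # transmission time to node i+1
    return total


def objective(task_loss: float, latency: float, lam: float = 0.1) -> float:
    """Accuracy-latency trade-off: lower is better; lam tunes the balance."""
    return task_loss + lam * latency


# Toy usage: a three-way split over device -> edge server -> cloud.
lat = completion_latency(
    seg_flops=[2e8, 8e8, 4e8],
    seg_out_bits=[1e6, 5e5, 0.0],
    node_flops=[5e9, 5e10, 1e11],
    link_bps=[1e7, 1e8],
)
print(f"completion latency ~ {lat:.3f} s, objective = {objective(0.35, lat):.3f}")
```

In the actual framework the search is over computational-graph split points under the stated constraints, so a latency term of this kind would be evaluated per candidate graph rather than for a single fixed split.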
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this challenge by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - An Attempt to Devise a Pairwise Ising-Type Maximum Entropy Model Integrated Cost Function for Optimizing SNN Deployment [0.0]
A spiking neural network (SNN) deployment process often involves partitioning the neural network onto processing units within the neuromorphic hardware.
Finding optimal deployment schemes is an NP-hard problem.
The deployment objectives require consideration of network dynamics shaped by neuron activity patterns.
Our approach focuses on network dynamics, which are hardware-independent and can be modeled separately from specific hardware configurations.
arXiv Detail & Related papers (2024-07-09T16:33:43Z) - Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z) - Split-Et-Impera: A Framework for the Design of Distributed Deep Learning
Applications [8.434224141580758]
Split-Et-Impera determines the set of best split points of a neural network based on deep network interpretability principles.
It performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements.
It suggests the best match between the quality-of-service requirements of the application and the performance in terms of accuracy and latency.
arXiv Detail & Related papers (2023-03-22T13:00:00Z) - Neural Architecture Search for Improving Latency-Accuracy Trade-off in
Split Computing [5.516431145236317]
Split computing is an emerging machine-learning inference technique that addresses the privacy and latency challenges of deploying deep learning in IoT systems.
In split computing, neural network models are separated and cooperatively processed using edge servers and IoT devices via networks.
This paper proposes a neural architecture search (NAS) method for split computing.
arXiv Detail & Related papers (2022-08-30T03:15:43Z) - Optimal Model Placement and Online Model Splitting for Device-Edge
Co-Inference [22.785214118527872]
Device-edge co-inference opens up new possibilities for resource-constrained wireless devices to execute deep neural network (DNN)-based applications.
We study the joint optimization of the model placement and online model splitting decisions to minimize the energy-and-time cost of device-edge co-inference.
arXiv Detail & Related papers (2021-05-28T06:55:04Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked
Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models and statistical models with the roofline model and a refined roofline model.
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper-parameters of state-of-the-art factored time delay neural networks (TDNNs).
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z) - Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)