JMSNAS: Joint Model Split and Neural Architecture Search for Learning
over Mobile Edge Networks
- URL: http://arxiv.org/abs/2111.08206v1
- Date: Tue, 16 Nov 2021 03:10:23 GMT
- Title: JMSNAS: Joint Model Split and Neural Architecture Search for Learning
over Mobile Edge Networks
- Authors: Yuqing Tian, Zhaoyang Zhang, Zhaohui Yang, Qianqian Yang
- Abstract summary: A joint model split and neural architecture search (JMSNAS) framework is proposed to automatically generate and deploy a DNN model over a mobile edge network.
Considering both the computing and communication resource constraints, a computational graph search problem is formulated.
Experiment results confirm the superiority of the proposed framework over the state-of-the-art split machine learning design methods.
- Score: 23.230079759174902
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The main challenge in deploying a deep neural network (DNN) over a mobile edge
network is how to split the DNN model so as to match the network architecture
as well as all the nodes' computation and communication capacity. This
essentially involves two highly coupled procedures: model generation and model
splitting. In this paper, a joint model split and neural architecture search
(JMSNAS) framework is proposed to automatically generate and deploy a DNN model
over a mobile edge network. Considering both the computing and communication
resource constraints, a computational graph search problem is formulated to
find the multi-split points of the DNN model, and then the model is trained to
meet some accuracy requirements. Moreover, the trade-off between model accuracy
and completion latency is achieved through the proper design of the objective
function. The experiment results confirm the superiority of the proposed
framework over the state-of-the-art split machine learning design methods.
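
The split-point search described in the abstract can be illustrated with a small sketch. This is not the JMSNAS algorithm itself: it assumes a purely sequential (chain) DNN, a fixed line of nodes (for example device, edge server, cloud), and a latency-only objective with no accuracy term. The names `best_splits`, `flops`, `out_bits`, `speed`, and `rate` are hypothetical and chosen only for this illustration.

```python
from itertools import accumulate


def best_splits(flops, out_bits, speed, rate):
    """Latency-optimal multi-split of a chain DNN over a line of nodes.

    Assumptions (illustrative, not taken from the paper):
      flops[i]    compute cost of layer i (L layers)
      out_bits[i] data size at boundary i; out_bits[0] is the model input
      speed[k]    compute speed of node k (K nodes, visited in order)
      rate[k]     link rate from node k to node k + 1 (K - 1 links)
    Returns (latency, splits), where splits[k] is the layer boundary at
    which activations move from node k to node k + 1.
    """
    L, K = len(flops), len(speed)
    prefix = [0.0] + list(accumulate(flops))  # prefix[i]: cost of first i layers
    inf = float("inf")
    # dp[k][i]: minimum latency with the first i layers done, data on node k
    dp = [[inf] * (L + 1) for _ in range(K)]
    cut = [[0] * (L + 1) for _ in range(K)]
    for i in range(L + 1):
        dp[0][i] = prefix[i] / speed[0]
    for k in range(1, K):
        for i in range(L + 1):
            for j in range(i + 1):  # node k runs layers j..i-1
                t = (dp[k - 1][j]
                     + out_bits[j] / rate[k - 1]             # transmission
                     + (prefix[i] - prefix[j]) / speed[k])   # computation
                if t < dp[k][i]:
                    dp[k][i], cut[k][i] = t, j
    # Inference may finish on any node; pick the one with the lowest latency.
    k_end = min(range(K), key=lambda k: dp[k][L])
    splits, i = [], L
    for k in range(k_end, 0, -1):
        i = cut[k][i]
        splits.append(i)
    return dp[k_end][L], splits[::-1]


# Three layers, a slow device and a 4x faster edge server joined by one link.
latency, splits = best_splits(
    flops=[4, 4, 8], out_bits=[16, 2, 2, 1], speed=[1, 4], rate=[2]
)
print(latency, splits)  # 8.0 [1]
```

In this toy instance the first layer runs on the device because it shrinks the data from 16 to 2 units before transmission. An accuracy-aware variant in the spirit of the abstract would add an accuracy term to the objective and search over architectures as well, which this sketch does not attempt.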
Related papers
- Exploring Neural Network Pruning with Screening Methods [3.443622476405787]
Modern deep learning models have tens of millions of parameters which makes the inference processes resource-intensive.
This paper proposes and evaluates a network pruning framework that eliminates non-essential parameters.
The proposed framework produces competitive lean networks compared to the original networks.
arXiv Detail & Related papers (2025-02-11T02:31:04Z)
- NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals [58.83169560132308]
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of very large neural networks.
NNsight is an open-source system that extends PyTorch to introduce deferred remote execution.
NDIF is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models.
arXiv Detail & Related papers (2024-07-18T17:59:01Z)
- An Attempt to Devise a Pairwise Ising-Type Maximum Entropy Model Integrated Cost Function for Optimizing SNN Deployment [0.0]
A spiking neural network (SNN) deployment process often involves partitioning the neural network onto processing units within the neuromorphic hardware.
Finding optimal deployment schemes is an NP-hard problem.
These objectives require consideration of network dynamics shaped by neuron activity patterns.
Our approach focuses on network dynamics, which are hardware-independent and can be modeled separately from specific hardware configurations.
arXiv Detail & Related papers (2024-07-09T16:33:43Z)
- Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z)
- Split-Et-Impera: A Framework for the Design of Distributed Deep Learning Applications [8.434224141580758]
Split-Et-Impera determines the set of the best-split points of a neural network based on deep network interpretability principles.
It performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements.
It suggests the best match between the quality of service requirements of the application and the performance in terms of accuracy and latency time.
arXiv Detail & Related papers (2023-03-22T13:00:00Z)
- Optimal Model Placement and Online Model Splitting for Device-Edge Co-Inference [22.785214118527872]
Device-edge co-inference opens up new possibilities for resource-constrained wireless devices to execute deep neural network (DNN)-based applications.
We study the joint optimization of the model placement and online model splitting decisions to minimize the energy-and-time cost of device-edge co-inference.
arXiv Detail & Related papers (2021-05-28T06:55:04Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper-parameters of state-of-the-art factored time delay neural networks (TDNNs).
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)