MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via
Automating Deep Neural Network Porting for Mobile Deployment
- URL: http://arxiv.org/abs/2402.13525v1
- Date: Wed, 21 Feb 2024 04:43:12 GMT
- Title: MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via
Automating Deep Neural Network Porting for Mobile Deployment
- Authors: Hongtao Huang, Xiaojun Chang, Wen Hu and Lina Yao
- Abstract summary: MatchNAS is a novel scheme for porting Deep Neural Networks to mobile devices.
We optimise a large network family using both labelled and unlabelled data.
We then automatically search for tailored networks for different hardware platforms.
- Score: 54.77943671991863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have seen the explosion of edge intelligence with powerful Deep
Neural Networks (DNNs). One popular scheme is training DNNs on powerful cloud
servers and subsequently porting them to mobile devices after they have been
made lightweight. Conventional approaches manually specialise DNNs for various
edge platforms and retrain them with real-world data. However, as the number of
platforms increases, these approaches become labour-intensive and
computationally prohibitive. Additionally, real-world data tends to be sparsely
labelled, further increasing the difficulty of training lightweight models. In this
paper, we propose MatchNAS, a novel scheme for porting DNNs to mobile devices.
Specifically, we simultaneously optimise a large network family using both
labelled and unlabelled data and then automatically search for tailored
networks for different hardware platforms. MatchNAS acts as an intermediary
that bridges the gap between cloud-based DNNs and edge-based DNNs.
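The abstract outlines a two-stage pipeline: semi-supervised training of a weight-sharing network family (a supernet), followed by a hardware-aware subnet search per platform. Below is a minimal sketch of that pipeline; the pseudo-labelling loss is one plausible reading of "using both labelled and unlabelled data", and sample_subnet, latency_fn, and eval_fn are hypothetical parameter contracts, not the paper's actual API.
```python
# Minimal sketch of the two-stage scheme from the abstract (not the authors'
# code): (1) optimise a weight-sharing network family with labelled and
# pseudo-labelled data, (2) search for a subnet under a per-platform latency
# budget. supernet.sample_subnet() is an assumed weight-sharing sampler.
import torch
import torch.nn.functional as F

def train_supernet(supernet, labelled_loader, unlabelled_loader, epochs=10):
    opt = torch.optim.SGD(supernet.parameters(), lr=0.01, momentum=0.9)
    for _ in range(epochs):
        for (x_l, y_l), x_u in zip(labelled_loader, unlabelled_loader):
            subnet = supernet.sample_subnet()          # random family member
            loss = F.cross_entropy(subnet(x_l), y_l)   # supervised term
            with torch.no_grad():
                pseudo = subnet(x_u).argmax(dim=1)     # pseudo-labels
            loss = loss + F.cross_entropy(subnet(x_u), pseudo)
            opt.zero_grad()
            loss.backward()
            opt.step()

def search_for_platform(supernet, latency_budget_ms, latency_fn, eval_fn,
                        trials=500):
    # latency_fn: per-platform latency estimator; eval_fn: validation accuracy
    best_arch, best_acc = None, 0.0
    for _ in range(trials):
        arch = supernet.sample_subnet()
        if latency_fn(arch) > latency_budget_ms:       # hardware-aware filter
            continue
        acc = eval_fn(arch)
        if acc > best_acc:
            best_arch, best_acc = arch, acc
    return best_arch                                   # tailored edge network
```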
Related papers
- NAS-BNN: Neural Architecture Search for Binary Neural Networks [55.058512316210056]
We propose a novel neural architecture search scheme for binary neural networks, named NAS-BNN.
Our discovered binary model family outperforms previous BNNs across a wide range of operation counts (OPs), from 20M to 200M.
In addition, we validate the transferability of the searched BNNs on the object detection task, and our binary detectors achieve a new state-of-the-art result, e.g., 31.6% mAP with 370M OPs, on the MS COCO dataset.
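For context, the sketch below shows the 1-bit primitive such a search space is built from: weight binarisation with a straight-through estimator (STE). It illustrates the building block only, not NAS-BNN's search space or training recipe.
```python
# Minimal sketch of weight binarisation with a straight-through estimator,
# the basic BNN primitive. NAS-BNN's actual architecture search is not shown.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)                       # 1-bit weights in forward

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # pass gradients straight through

class BinaryLinear(nn.Linear):
    def forward(self, x):
        return F.linear(x, BinarizeSTE.apply(self.weight), self.bias)
```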
arXiv Detail & Related papers (2024-08-28T02:17:58Z)
- A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge [4.11949030493552]
We present CBNet, a low-latency and energy-efficient deep neural network (DNN) inference framework tailored for edge devices.
It utilizes a "converting" autoencoder to efficiently transform hard images into easy ones.
CBNet achieves up to 4.8x speedup in inference latency and 79% reduction in energy usage compared to competing techniques.
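A hedged sketch of the converting-autoencoder idea follows: an encoder-decoder maps a hard input to an easier-to-classify reconstruction, which a lightweight classifier then handles. Layer sizes assume a 28x28 single-channel input and are illustrative, not CBNet's actual design.
```python
# Minimal sketch: a "converting" autoencoder in front of a cheap classifier.
# Sizes assume a 28x28 single-channel input (an assumption for illustration).
import torch.nn as nn

converter = nn.Sequential(                 # encoder-decoder "converter"
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
)
classifier = nn.Sequential(                # lightweight downstream classifier
    nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10),
)

def infer(x):
    return classifier(converter(x))        # "easy" image -> cheap classification
```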
arXiv Detail & Related papers (2024-03-11T08:13:42Z)
- SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget [18.63754969602021]
Deep neural networks (DNNs) on edge artificial intelligence (AI) devices enable various autonomous mobile computing applications.
Existing solutions, such as model compression or cloud offloading, reduce the memory footprint of DNN inference.
We develop SwapNet, an efficient block swapping ecosystem for edge AI devices.
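A minimal sketch of the block-swapping idea is below: only one block of weights lives in memory at a time, loaded from storage just before it runs. It assumes each block was saved whole with torch.save, and it ignores the prefetching and latency hiding a real system like SwapNet provides.
```python
# Naive block swapping: stream a DNN through memory one block at a time.
# Assumes blocks were saved as whole modules, e.g. torch.save(block, path).
import torch

def swapped_inference(x, block_files):
    for path in block_files:               # e.g. ["block0.pt", "block1.pt", ...]
        block = torch.load(path)           # swap the block in
        with torch.no_grad():
            x = block(x)                   # run it
        del block                          # swap it out, freeing memory
    return x
```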
arXiv Detail & Related papers (2024-01-30T05:29:49Z)
- Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing [8.915849482780631]
In split inference, a deep neural network (DNN) is partitioned to run the early part of the DNN at the edge and the later part of the DNN in the cloud.
This meets two key requirements for on-device machine learning: input privacy and computation efficiency.
We introduce Salted DNNs: a novel approach that enables clients at the edge, who run the early part of the DNN, to control the semantic interpretation of the DNN's outputs at inference time.
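As one hedged illustration of how a client-held "salt" could work in split inference, the sketch below permutes the output classes with a secret permutation, so only the edge client can interpret the cloud's logits. This is a plausible reading of the abstract, not the paper's exact construction.
```python
# Illustrative sketch only: a client-held secret permutation ("salt") over the
# output classes hides the semantic interpretation of the cloud's outputs.
import torch

g = torch.Generator().manual_seed(1234)    # client-held secret seed
salt = torch.randperm(10, generator=g)     # logit position i <-> class salt[i]

def edge_part(x, head):
    return head(x)                         # early layers run on the device

def cloud_part(z, tail):
    return tail(z)                         # later layers emit salted logits

def client_decode(salted_logits):
    return salt[salted_logits.argmax(dim=1)]  # only the client can undo the salt
```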
arXiv Detail & Related papers (2023-10-20T09:53:55Z)
- Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload on edge devices.
We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for the sparse multi-DNN scheduling.
Our proposed approach outperforms the state-of-the-art methods with up to 10% decrease in latency constraint violation rate and nearly 4X reduction in average normalized turnaround time.
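As a hedged illustration of what sparsity-aware scheduling might look like, the sketch below ranks jobs by slack, with latency estimates discounted by activation sparsity. The priority rule and cost model are assumptions for illustration, not Dysta's actual policy.
```python
# Toy sparsity-aware multi-DNN scheduler: sparsity shrinks the estimated work,
# and the job with the least slack to its deadline is dispatched first.
import heapq

def schedule(jobs):
    # jobs: list of (name, dense_latency_ms, sparsity, deadline_ms)
    queue = []
    for name, lat, sparsity, deadline in jobs:
        est = lat * (1.0 - sparsity)       # sparsity reduces effective compute
        slack = deadline - est             # least-slack-first priority
        heapq.heappush(queue, (slack, name, est))
    while queue:
        slack, name, est = heapq.heappop(queue)
        yield name, est                    # dispatch order

for name, est in schedule([("detector", 30.0, 0.6, 50.0),
                           ("asr", 20.0, 0.2, 25.0)]):
    print(name, est)
```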
arXiv Detail & Related papers (2023-10-17T09:25:17Z)
- Enabling Deep Learning on Edge Devices [2.741266294612776]
Deep neural networks (DNNs) have succeeded in many different perception tasks, e.g., computer vision, natural language processing, reinforcement learning, etc.
These high-performing DNNs, however, come at the cost of intensive resource consumption.
Recently, some new emerging intelligent applications, e.g., AR/VR, mobile assistants, Internet of Things, require us to deploy DNNs on resource-constrained edge devices.
In this dissertation, we studied four edge intelligence scenarios, i.e., Inference on Edge Devices, Adaptation on Edge Devices, Learning on Edge Devices, and Edge-Server Systems.
arXiv Detail & Related papers (2022-10-06T20:52:57Z)
- Training Graph Neural Networks with 1000 Layers [133.84813995275988]
We study reversible connections, group convolutions, weight tying, and equilibrium models to advance the memory and parameter efficiency of GNNs.
To the best of our knowledge, RevGNN-Deep is the deepest GNN in the literature by one order of magnitude.
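The memory saving comes from reversibility: a reversible block's inputs can be recomputed exactly from its outputs, so intermediate activations need not be cached for backpropagation. A minimal sketch of a reversible residual pair, independent of the graph setting, is below.
```python
# Minimal reversible residual block: inputs are recoverable from outputs, so
# activations need not be stored. (Illustrative; RevGNN pairs this with graph
# convolutions, grouping, and weight tying.)
import torch

def forward_rev(x1, x2, f, g):
    y1 = x1 + f(x2)            # y1, y2 suffice to reconstruct x1, x2
    y2 = x2 + g(y1)
    return y1, y2

def inverse_rev(y1, y2, f, g):
    x2 = y2 - g(y1)            # exact reconstruction of the inputs
    x1 = y1 - f(x2)
    return x1, x2

# quick check that the block really inverts
f = g = torch.nn.Linear(8, 8)
x1, x2 = torch.randn(2, 8), torch.randn(2, 8)
r1, r2 = inverse_rev(*forward_rev(x1, x2, f, g), f, g)
assert torch.allclose(r1, x1, atol=1e-5) and torch.allclose(r2, x2, atol=1e-5)
```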
arXiv Detail & Related papers (2021-06-14T15:03:00Z)
- Dynamic DNN Decomposition for Lossless Synergistic Inference [0.9549013615433989]
Deep neural networks (DNNs) sustain high performance in today's data processing applications.
We propose D3, a dynamic DNN decomposition system for synergistic inference without precision loss.
D3 outperforms the state-of-the-art counterparts up to 3.4 times in end-to-end DNN inference time and reduces backbone network communication overhead up to 3.68 times.
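A common core of synergistic edge-server inference is choosing the partition point. The sketch below picks the split that minimises edge compute plus the cost of shipping the intermediate tensor, under an assumed cost model; it is not D3's actual decomposition logic.
```python
# Toy partition-point search for edge-cloud split inference. The per-layer
# cost triples and the bandwidth model are assumptions for illustration.
def best_split(layer_costs, bandwidth_mbps):
    # layer_costs: list of (edge_ms, cloud_ms, output_mb) per layer
    best = (float("inf"), 0)
    for k in range(1, len(layer_costs) + 1):       # run layers[:k] on the edge
        edge = sum(c[0] for c in layer_costs[:k])
        cloud = sum(c[1] for c in layer_costs[k:])
        transfer_ms = layer_costs[k - 1][2] * 8.0 / bandwidth_mbps * 1000.0
        best = min(best, (edge + transfer_ms + cloud, k))
    return best[1]                                 # index of the best split

print(best_split([(2.0, 0.5, 1.5), (3.0, 0.8, 0.2), (4.0, 1.0, 0.1)], 100.0))
```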
arXiv Detail & Related papers (2021-01-15T03:18:53Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
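A time delay neural network is, in modern terms, a stack of dilated 1-D convolutions over frame context windows. A minimal sketch follows; the layer sizes are illustrative assumptions, and the paper's full data learning recipe is not shown.
```python
# Minimal TDNN sketch: dilated 1-D convolutions over temporal context windows.
# Feature dimensions (40 channels) are an assumption for illustration.
import torch.nn as nn

tdnn = nn.Sequential(
    nn.Conv1d(40, 256, kernel_size=5, dilation=1, padding=2),  # context +-2 frames
    nn.ReLU(),
    nn.Conv1d(256, 256, kernel_size=3, dilation=2, padding=2), # dilated context
    nn.ReLU(),
    nn.Conv1d(256, 40, kernel_size=3, dilation=3, padding=3),  # wider context
)
# input: (batch, 40 features, time frames) -> enhanced features, same length
```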
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
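As the simplest hedged illustration of such a symmetry prior, the sketch below makes a classifier invariant to 90-degree rotations by averaging predictions over the rotation group. The surveyed approaches (e.g. group-equivariant CNNs) instead build the symmetry into the layers themselves.
```python
# Test-time orbit averaging over the C4 rotation group: the wrapped model is
# rotation-invariant by construction. Illustrative of a geometric prior only.
import torch

def rotation_invariant(model, x):
    # x: (batch, channels, H, W); average logits over all 90-degree rotations
    logits = [model(torch.rot90(x, k, dims=(2, 3))) for k in range(4)]
    return torch.stack(logits).mean(dim=0)
```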
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.