InstantNet: Automated Generation and Deployment of Instantaneously
Switchable-Precision Networks
- URL: http://arxiv.org/abs/2104.10853v1
- Date: Thu, 22 Apr 2021 04:07:43 GMT
- Title: InstantNet: Automated Generation and Deployment of Instantaneously
Switchable-Precision Networks
- Authors: Yonggan Fu, Zhongzhi Yu, Yongan Zhang, Yifan Jiang, Chaojian Li,
Yongyuan Liang, Mingchao Jiang, Zhangyang Wang, Yingyan Lin
- Abstract summary: We propose InstantNet to automatically generate and deploy instantaneously switchable-precision networks which operate at variable bit-widths.
In experiments, the proposed InstantNet consistently outperforms state-of-the-art designs.
- Score: 65.78061366594106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The promise of Deep Neural Network (DNN) powered Internet of Things (IoT)
devices has motivated a tremendous demand for automated solutions to enable
fast development and deployment of efficient (1) DNNs equipped with
instantaneous accuracy-efficiency trade-off capability to accommodate the
time-varying resources at IoT devices and (2) dataflows to optimize DNNs'
execution efficiency on different devices. Therefore, we propose InstantNet to
automatically generate and deploy instantaneously switchable-precision networks
which operate at variable bit-widths. Extensive experiments show that the
proposed InstantNet consistently outperforms state-of-the-art designs.
Related papers
- Resource Efficient Asynchronous Federated Learning for Digital Twin Empowered IoT Network [29.895766751146155]
Digital twin (DT) can provide real-time status and dynamic topology mapping for Internet of Things (IoT) devices.
We develop a dynamic resource scheduling algorithm tailored for the asynchronous federated learning (FL)-based lightweight DT empowered IoT network.
Specifically, our approach aims to minimize a multi-objective function that encompasses both energy consumption and latency.
arXiv Detail & Related papers (2024-08-26T14:28:51Z)
- An Efficient Real-Time Object Detection Framework on Resource-Constricted Hardware Devices via Software and Hardware Co-design [11.857890662690448]
This paper proposes an efficient real-time object detection framework on resource-constrained hardware devices through hardware and software co-design.
Results show that the proposed method can significantly reduce the model size and improve the execution time.
arXiv Detail & Related papers (2024-08-02T18:47:11Z)
- Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"Smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
- Efficient Sparsely Activated Transformers [0.34410212782758054]
Transformer-based neural networks have achieved state-of-the-art task performance in a number of machine learning domains.
Recent work has explored the integration of dynamic behavior into these networks in the form of mixture-of-expert layers.
We introduce a novel system named PLANER that takes an existing Transformer-based network and a user-defined latency target.
arXiv Detail & Related papers (2022-08-31T00:44:27Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- 2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency [26.920864182619844]
We propose a 2-in-1 Accelerator aiming at winning both the adversarial robustness and efficiency of DNN accelerators.
Specifically, we first propose a Random Precision Switch (RPS) algorithm that can effectively defend DNNs against adversarial attacks.
Furthermore, we propose a new precision-scalable accelerator featuring (1) a new precision-scalable unit architecture.
arXiv Detail & Related papers (2021-09-11T08:51:01Z)
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
- Dynamic Slimmable Network [105.74546828182834]
We develop a dynamic network slimming regime named Dynamic Slimmable Network (DS-Net).
Our DS-Net is empowered with the ability of dynamic inference by the proposed double-headed dynamic gate.
It consistently outperforms its static counterparts as well as state-of-the-art static and dynamic model compression methods.
arXiv Detail & Related papers (2021-03-24T15:25:20Z)
- EmotionNet Nano: An Efficient Deep Convolutional Neural Network Design for Real-time Facial Expression Recognition [75.74756992992147]
This study proposes EmotionNet Nano, an efficient deep convolutional neural network created through a human-machine collaborative design strategy.
Two different variants of EmotionNet Nano are presented, each with a different trade-off between architectural and computational complexity and accuracy.
We demonstrate that the proposed EmotionNet Nano networks achieved real-time inference speeds (e.g. $>25$ FPS and $>70$ FPS at 15W and 30W, respectively) and high energy efficiency.
arXiv Detail & Related papers (2020-06-29T00:48:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.