InstantNet: Automated Generation and Deployment of Instantaneously
Switchable-Precision Networks
- URL: http://arxiv.org/abs/2104.10853v1
- Date: Thu, 22 Apr 2021 04:07:43 GMT
- Authors: Yonggan Fu, Zhongzhi Yu, Yongan Zhang, Yifan Jiang, Chaojian Li,
Yongyuan Liang, Mingchao Jiang, Zhangyang Wang, Yingyan Lin
- Abstract summary: We propose InstantNet to automatically generate and deploy instantaneously switchable-precision networks which operate at variable bit-widths.
In experiments, the proposed InstantNet consistently outperforms state-of-the-art designs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The promise of Deep Neural Network (DNN)-powered Internet of Things (IoT)
devices has motivated a tremendous demand for automated solutions to enable
fast development and deployment of efficient (1) DNNs equipped with
instantaneous accuracy-efficiency trade-off capability to accommodate the
time-varying resources at IoT devices and (2) dataflows to optimize DNNs'
execution efficiency on different devices. Therefore, we propose InstantNet to
automatically generate and deploy instantaneously switchable-precision networks
which operate at variable bit-widths. Extensive experiments show that the
proposed InstantNet consistently outperforms state-of-the-art designs.
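The core mechanism behind a switchable-precision network can be illustrated with a minimal sketch: a layer caches uniformly quantized copies of its weights at several bit-widths and simply picks one at inference time. This is an illustrative simplification, not InstantNet's actual implementation; all names here are hypothetical.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization of a tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1           # e.g. 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax     # map the largest magnitude to qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale                     # dequantize back to float

class SwitchablePrecisionLayer:
    """A linear layer that caches quantized copies of its weights so the
    bit-width can be switched instantaneously at inference time."""

    def __init__(self, weight, bit_widths=(4, 8, 16)):
        self.weight = weight
        self.cache = {b: quantize(weight, b) for b in bit_widths}

    def forward(self, x, bits):
        # Switching precision is just a dictionary lookup.
        return x @ self.cache[bits]

rng = np.random.default_rng(0)
layer = SwitchablePrecisionLayer(rng.standard_normal((16, 8)))
x = rng.standard_normal((1, 16))
y8 = layer.forward(x, bits=8)  # closer to the full-precision output
y4 = layer.forward(x, bits=4)  # coarser, but cheaper on real hardware
```

Caching one copy per bit-width trades a little memory for zero-latency switching, which is what "instantaneous" accuracy-efficiency trade-off requires on a resource-varying device.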
Related papers
- Accelerating Deep Neural Networks via Semi-Structured Activation
Sparsity [0.0]
Exploiting sparsity in a network's feature maps is one way to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of $1.25\times$ with a minimal accuracy drop of $1.1\%$ for the ResNet18 model on the ImageNet dataset.
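The summary does not specify the exact sparsity pattern; a common semi-structured choice is 2:4 (two nonzeros kept per contiguous group of four), sketched below on a feature map. The function name and pattern are assumptions for illustration, not the paper's method.

```python
import numpy as np

def prune_activations_2_4(act):
    """Keep the 2 largest-magnitude values in every contiguous group of 4,
    zeroing the rest: a semi-structured (2:4) sparsity pattern that hardware
    can exploit with minor runtime modifications.
    Assumes the activation size is divisible by 4."""
    flat = act.reshape(-1, 4)                       # group into blocks of 4
    # Indices of the 2 smallest-magnitude entries per group.
    drop = np.argsort(np.abs(flat), axis=1)[:, :2]
    out = flat.copy()
    np.put_along_axis(out, drop, 0.0, axis=1)
    return out.reshape(act.shape)

feature_map = np.random.default_rng(1).standard_normal((2, 8))
sparse = prune_activations_2_4(feature_map)
# Exactly half of the activations are zeroed, in a predictable pattern.
```

The fixed zeros-per-group structure is what makes this "semi-structured": unlike unstructured sparsity, a kernel can rely on the pattern and skip work without per-element bookkeeping.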
arXiv Detail & Related papers (2023-09-12T22:28:53Z)
- Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"Smart ecosystems" are being formed in which sensing happens concurrently rather than in standalone devices.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling scheme that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
- Efficient Sparsely Activated Transformers [0.34410212782758054]
Transformer-based neural networks have achieved state-of-the-art task performance in a number of machine learning domains.
Recent work has explored the integration of dynamic behavior into these networks in the form of mixture-of-expert layers.
We introduce a novel system named PLANER that takes an existing Transformer-based network and a user-defined latency target.
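The mixture-of-expert mechanism referenced here can be sketched with top-1 routing: a router scores the experts for each token and only the best-scoring expert runs, so compute becomes dynamic per token. This is a generic illustration of MoE routing, not PLANER's actual system or optimization.

```python
import numpy as np

def moe_layer(x, experts, router_w):
    """Top-1 mixture-of-experts: route each token to the single expert
    the router scores highest, skipping all other experts."""
    scores = x @ router_w                   # (tokens, n_experts)
    out = np.empty_like(x)
    for t, e in enumerate(scores.argmax(axis=1)):
        out[t] = experts[e](x[t])           # only one expert runs per token
    return out

rng = np.random.default_rng(5)
# Three toy "experts", each a different linear map (default arg captures W).
experts = [lambda v, W=rng.standard_normal((4, 4)): v @ W for _ in range(3)]
x = rng.standard_normal((6, 4))
y = moe_layer(x, experts, rng.standard_normal((4, 3)))
```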
arXiv Detail & Related papers (2022-08-31T00:44:27Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- 2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency [26.920864182619844]
We propose a 2-in-1 Accelerator that aims to win both adversarial robustness and efficiency for DNN accelerators.
Specifically, we first propose a Random Precision Switch (RPS) algorithm that can effectively defend DNNs against adversarial attacks.
Furthermore, we propose a new precision-scalable accelerator featuring (1) a new precision-scalable unit architecture.
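A minimal sketch of the Random Precision Switch idea, assuming a simple uniform quantizer: picking the inference bit-width at random means an attacker cannot tailor adversarial perturbations to one fixed quantization. The names and bit-widths below are illustrative, not the paper's actual design.

```python
import random
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def rps_forward(weights, x, bit_widths=(4, 6, 8), rng=random.Random(0)):
    """Random Precision Switch: draw a fresh bit-width per inference, so
    adversarial examples crafted against one precision transfer poorly."""
    bits = rng.choice(bit_widths)
    return x @ quantize(weights, bits), bits

w = np.random.default_rng(2).standard_normal((8, 4))
x = np.ones((1, 8))
y, used_bits = rps_forward(w, x)
```

The defense costs nothing extra at run time if, as in the switchable-precision setting above, quantized weight copies are already available for every supported bit-width.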
arXiv Detail & Related papers (2021-09-11T08:51:01Z)
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
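Confidence-thresholded early exiting, the mechanism these networks build on, can be sketched as follows; the stages, heads, and threshold here are hypothetical stand-ins for the paper's trained components.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def multi_exit_inference(x, stages, heads, threshold=0.9):
    """Run backbone stages in order; after each, a lightweight head produces
    class probabilities, and inference stops early once the top probability
    clears `threshold` (easier samples exit sooner)."""
    for i, (stage, head) in enumerate(zip(stages, heads)):
        x = stage(x)
        probs = softmax(head(x))
        if probs.max() >= threshold:
            return probs.argmax(), i      # exit index = how deep we went
    return probs.argmax(), len(stages) - 1

rng = np.random.default_rng(3)
# Toy 3-stage backbone with a 4-class head at each depth.
stages = [lambda v, W=rng.standard_normal((8, 8)): np.tanh(v @ W) for _ in range(3)]
heads = [lambda v, W=rng.standard_normal((8, 4)): v @ W for _ in range(3)]
label, exit_idx = multi_exit_inference(rng.standard_normal(8), stages, heads)
```

Co-optimising the number and placement of heads, as the paper does, amounts to choosing where in this loop the exits sit and how heavy each head may be for a given device budget.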
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
- Dynamic Slimmable Network [105.74546828182834]
We develop a dynamic network slimming regime named Dynamic Slimmable Network (DS-Net).
Our DS-Net is empowered with the ability of dynamic inference by the proposed double-headed dynamic gate.
It consistently outperforms its static counterparts as well as state-of-the-art static and dynamic model compression methods.
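A toy illustration of the dynamic-gate idea, assuming a single linear gate that picks one of several channel widths per input; the paper's double-headed gate and its training scheme are not reproduced here, and all names are hypothetical.

```python
import numpy as np

class DynamicSlimmableLayer:
    """A layer whose active output width is chosen per input by a small
    gate, spending less compute on easy samples."""

    def __init__(self, weight, width_ratios=(0.25, 0.5, 1.0), rng=None):
        self.weight = weight                 # (in, out); out = maximum width
        self.width_ratios = width_ratios
        rng = rng or np.random.default_rng(0)
        self.gate_w = rng.standard_normal((weight.shape[0], len(width_ratios)))

    def forward(self, x):
        # The gate scores each candidate width from the input itself.
        choice = int(np.argmax(x @ self.gate_w))
        k = max(1, int(self.weight.shape[1] * self.width_ratios[choice]))
        # Slimmable trick: narrower widths reuse the first k output channels.
        return x @ self.weight[:, :k], k

rng = np.random.default_rng(4)
layer = DynamicSlimmableLayer(rng.standard_normal((8, 16)), rng=rng)
y, active = layer.forward(rng.standard_normal(8))
```

Reusing a prefix of the channels means one set of weights serves every width, so the gate's decision changes compute without any model switching.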
arXiv Detail & Related papers (2021-03-24T15:25:20Z)
- EmotionNet Nano: An Efficient Deep Convolutional Neural Network Design for Real-time Facial Expression Recognition [75.74756992992147]
This study proposes EmotionNet Nano, an efficient deep convolutional neural network created through a human-machine collaborative design strategy.
Two different variants of EmotionNet Nano are presented, each with a different trade-off between architectural and computational complexity and accuracy.
We demonstrate that the proposed EmotionNet Nano networks achieved real-time inference speeds (e.g. $>25$ FPS and $>70$ FPS at 15W and 30W, respectively) and high energy efficiency.
arXiv Detail & Related papers (2020-06-29T00:48:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.