A Progressive Sub-Network Searching Framework for Dynamic Inference
- URL: http://arxiv.org/abs/2009.05681v1
- Date: Fri, 11 Sep 2020 22:56:02 GMT
- Title: A Progressive Sub-Network Searching Framework for Dynamic Inference
- Authors: Li Yang, Zhezhi He, Yu Cao, Deliang Fan
- Abstract summary: We propose a progressive sub-net searching framework embedded with several effective techniques, including trainable noise ranking, channel grouping and fine-tuning threshold setting, and sub-net re-selection.
Our proposed method achieves much better dynamic inference accuracy than the popular prior Universally-Slimmable-Network, by up to 4.4% (2.3% on average) on the ImageNet dataset with the same model size.
- Score: 33.93841415140311
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many techniques, such as model compression, have been developed to make Deep
Neural Network (DNN) inference more efficient. Nevertheless, DNNs still lack excellent
run-time dynamic inference capability that would enable users to trade off accuracy and
computation complexity (i.e., latency on target hardware) after model deployment, based
on dynamic requirements and environments. This research direction has recently drawn
great attention; one realization is to train the target DNN through a multi-term
objective function consisting of cross-entropy terms from multiple sub-nets. Our
investigation in this work shows that the performance of dynamic inference relies
heavily on the quality of sub-net sampling. With the objective of constructing a dynamic
DNN and searching multiple high-quality sub-nets with minimal searching cost, we propose
a progressive sub-net searching framework embedded with several effective techniques,
including trainable noise ranking, channel grouping and fine-tuning threshold setting,
and sub-net re-selection. The proposed framework empowers the target DNN with better
dynamic inference capability, outperforming prior works on both the CIFAR-10 and
ImageNet datasets in comprehensive experiments on different network structures. Taking
ResNet18 as an example, our proposed method achieves much better dynamic inference
accuracy than the popular prior Universally-Slimmable-Network, by up to 4.4% (2.3% on
average) on the ImageNet dataset with the same model size.
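The multi-term objective mentioned in the abstract, which sums cross-entropy terms from multiple sub-nets, can be sketched as follows. This is a minimal, hypothetical illustration in plain Python with invented softmax outputs, not the paper's implementation:

```python
import math

def cross_entropy(probs, label):
    """Cross-entropy of a single softmax prediction against an integer label."""
    return -math.log(probs[label])

def multi_term_loss(subnet_probs, label):
    """Sum the cross-entropy terms produced by each sampled sub-net,
    mirroring the multi-term objective described in the abstract."""
    return sum(cross_entropy(p, label) for p in subnet_probs)

# Hypothetical softmax outputs from three sub-nets of different widths
# (e.g. 0.25x, 0.5x, 1.0x) on the same input; true label = 1.
outputs = [
    [0.30, 0.60, 0.10],  # narrow sub-net, least confident
    [0.20, 0.70, 0.10],
    [0.10, 0.85, 0.05],  # full network, most confident
]
loss = multi_term_loss(outputs, label=1)
```

During training, minimizing this summed loss pushes every sampled sub-net toward the correct label at once, which is what gives the deployed network its run-time accuracy/complexity trade-off.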
Related papers
- SimQ-NAS: Simultaneous Quantization Policy and Neural Architecture Search [6.121126813817338]
Recent one-shot Neural Architecture Search algorithms rely on training a hardware-agnostic super-network tailored to a specific task and then extracting efficient sub-networks for different hardware platforms.
We show that by using multi-objective search algorithms paired with lightly trained predictors, we can efficiently search for both the sub-network architecture and the corresponding quantization policy.
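A multi-objective selection step of this kind can be illustrated with a simple Pareto-front filter over predictor scores. The candidate scores below are invented, and the real predictor-guided search is considerably more involved:

```python
def pareto_front(candidates):
    """Keep candidates not dominated by any other candidate, where
    higher predicted accuracy AND lower predicted latency dominates."""
    front = []
    for i, (acc_i, lat_i) in enumerate(candidates):
        dominated = any(
            acc_j >= acc_i and lat_j <= lat_i and (acc_j > acc_i or lat_j < lat_i)
            for j, (acc_j, lat_j) in enumerate(candidates) if j != i
        )
        if not dominated:
            front.append((acc_i, lat_i))
    return front

# Hypothetical predictor outputs for (sub-network, quantization policy) pairs:
# (predicted accuracy, predicted latency in ms)
cands = [(0.76, 12.0), (0.74, 9.0), (0.75, 15.0), (0.71, 10.0)]
front = pareto_front(cands)
```

Only the non-dominated pairs survive, which is the set a multi-objective search algorithm would hand back as its accuracy/latency trade-off curve.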
arXiv Detail & Related papers (2023-12-19T22:08:49Z)
- Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation [29.082411035685773]
A dynamic architecture network for medical segmentation (i.e. Med-DANet) has achieved a favorable accuracy and efficiency trade-off.
This paper explores a unified formulation of the dynamic inference framework from the perspective of both the data itself and the model structure.
Our framework improves model efficiency by up to nearly 4.1x and 17.3x with comparable segmentation results on BraTS 2019.
arXiv Detail & Related papers (2023-10-28T09:57:28Z)
- Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload on edge devices.
We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for the sparse multi-DNN scheduling.
Our proposed approach outperforms the state-of-the-art methods with up to 10% decrease in latency constraint violation rate and nearly 4X reduction in average normalized turnaround time.
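The idea of combining static and dynamic sparsity in a scheduling decision can be sketched as below. This assumes a hypothetical shortest-predicted-job-first policy and invented job parameters; it is not Dysta's actual algorithm:

```python
def predicted_runtime(base_flops, static_sparsity, dynamic_sparsity, alpha=0.5):
    """Blend the offline (static) sparsity pattern with the sparsity
    observed at run time to predict a job's remaining cost."""
    sparsity = alpha * static_sparsity + (1 - alpha) * dynamic_sparsity
    return base_flops * (1.0 - sparsity)

def pick_next(jobs):
    """Run the job with the smallest predicted cost next -- a toy
    stand-in for a sparsity-aware scheduling policy."""
    return min(jobs, key=lambda j: predicted_runtime(j["flops"], j["static"], j["dynamic"]))

# Hypothetical multi-DNN workload: per-job FLOPs and sparsity estimates.
jobs = [
    {"name": "detector",   "flops": 100.0, "static": 0.5, "dynamic": 0.3},
    {"name": "classifier", "flops": 60.0,  "static": 0.2, "dynamic": 0.4},
]
next_job = pick_next(jobs)
```

The point of the blend is that static sparsity is known before deployment while dynamic sparsity only appears per input, so a scheduler that uses both can rank jobs more accurately than one using either alone.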
arXiv Detail & Related papers (2023-10-17T09:25:17Z)
- Towards Enabling Dynamic Convolution Neural Network Inference for Edge Intelligence [0.0]
Recent advances in edge intelligence require CNN inference on edge network to increase throughput and reduce latency.
To provide flexibility, dynamic parameter allocation to different mobile devices is required to implement either a predefined CNN architecture or one defined on the fly.
We propose a library-based approach to design scalable and dynamic distributed CNN inference on the fly.
arXiv Detail & Related papers (2022-02-18T22:33:42Z)
- Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
- Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms [3.3197851873862385]
This paper proposes Dynamic-OFA, a novel dynamic DNN approach for state-of-the-art platform-aware NAS models (i.e., the Once-for-All (OFA) network).
Compared to the state-of-the-art, our experimental results using ImageNet on a Jetson Xavier NX show that the approach is up to 3.5x faster for similar ImageNet Top-1 accuracy.
arXiv Detail & Related papers (2021-05-08T05:10:53Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Multi-path Neural Networks for On-device Multi-domain Visual Classification [55.281139434736254]
This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices.
The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space.
The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths.
arXiv Detail & Related papers (2020-10-10T05:13:49Z)
- Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped.
On the CIFAR-10 dataset, LC-Net results in up to 11.9x fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods.
On the ImageNet dataset, LC-Net achieves up to 1.4x fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
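The per-instance skipping idea can be illustrated with a toy gating loop. The layers and gate scores below are invented stand-ins for what L-Net/C-Net would predict per input:

```python
def gated_forward(x, layers, gates, threshold=0.5):
    """Apply a layer only when its gate scores the input as useful;
    otherwise skip it -- a toy version of per-instance layer skipping."""
    for layer, gate in zip(layers, gates):
        if gate(x) >= threshold:  # gate predicts this layer is not redundant
            x = layer(x)
    return x

# Toy "layers" and "gates" operating on a scalar feature.
layers = [lambda v: v * 2, lambda v: v + 3, lambda v: v * 10]
gates  = [lambda v: 0.9, lambda v: 0.2, lambda v: 0.8]  # middle layer deemed redundant
y = gated_forward(1.0, layers, gates)
```

Because the gates run on each input, easy instances can skip more layers than hard ones, which is where the FLOP savings come from.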
arXiv Detail & Related papers (2020-07-29T23:17:48Z)
- Policy-GNN: Aggregation Optimization for Graph Neural Networks [60.50932472042379]
Graph neural networks (GNNs) aim to model the local graph structures and capture the hierarchical patterns by aggregating the information from neighbors.
It is a challenging task to develop an effective aggregation strategy for each node, given complex graphs and sparse features.
We propose Policy-GNN, a meta-policy framework that models the sampling procedure and message passing of GNNs into a combined learning process.
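A toy illustration of per-node aggregation depth follows, with the learned meta-policy replaced by a hard-coded choice of hops per node (hypothetical, not the paper's algorithm):

```python
def mean_aggregate(adj, feats):
    """One round of mean aggregation over each node's neighborhood (self included)."""
    return {
        node: sum(feats[n] for n in [node] + nbrs) / (len(nbrs) + 1)
        for node, nbrs in adj.items()
    }

def per_node_depth_forward(adj, feats, hops_per_node):
    """Each node reads its feature after the number of aggregation rounds
    chosen for it -- here hard-coded, standing in for a learned policy."""
    layered = [feats]
    for _ in range(max(hops_per_node.values())):
        layered.append(mean_aggregate(adj, layered[-1]))
    return {n: layered[hops_per_node[n]][n] for n in adj}

# Toy path graph a - b - c with scalar features.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
feats = {"a": 1.0, "b": 2.0, "c": 3.0}
out = per_node_depth_forward(adj, feats, {"a": 1, "b": 2, "c": 0})
```

Letting each node choose its own aggregation depth is the core of the meta-policy idea: hub nodes may need deeper aggregation than peripheral ones.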
arXiv Detail & Related papers (2020-06-26T17:03:06Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.