A Progressive Sub-Network Searching Framework for Dynamic Inference
- URL: http://arxiv.org/abs/2009.05681v1
- Date: Fri, 11 Sep 2020 22:56:02 GMT
- Authors: Li Yang, Zhezhi He, Yu Cao, Deliang Fan
- Abstract summary: We propose a progressive sub-net searching framework that embeds several effective techniques, including trainable noise ranking, channel group and fine-tuning threshold setting, and sub-net re-selection.
Our proposed method achieves higher dynamic inference accuracy than the popular Universally-Slimmable-Network, by up to 4.4% and by 2.3% on average, on the ImageNet dataset with the same model size.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many techniques, such as model compression, have been developed to make Deep
Neural Network (DNN) inference more efficient. Nevertheless, DNNs still
lack the run-time dynamic inference capability that would let users trade off
accuracy against computation complexity (i.e., latency on target hardware) after
model deployment, based on dynamic requirements and environments. This research
direction has recently drawn great attention, and one realization is to train the
target DNN through a multi-term objective function that consists of
cross-entropy terms from multiple sub-nets. Our investigation in this work shows
that the performance of dynamic inference relies heavily on the quality of
sub-net sampling. With the objective of constructing a dynamic DNN and searching for
multiple high-quality sub-nets at minimal search cost, we propose a progressive
sub-net searching framework that embeds several effective
techniques, including trainable noise ranking, channel group and fine-tuning
threshold setting, and sub-net re-selection. The proposed framework gives the
target DNN better dynamic inference capability, outperforming prior
works on both the CIFAR-10 and ImageNet datasets in comprehensive experiments on
different network structures. Taking ResNet18 as an example, our proposed method
achieves higher dynamic inference accuracy than the popular
Universally-Slimmable-Network, by up to 4.4% and by 2.3% on average, on the ImageNet
dataset with the same model size.
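The multi-term objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy linear classifier whose sub-nets keep only a prefix of the input channels, in the style of slimmable networks; this is not the paper's search procedure, and the names `subnet_logits`, `multi_subnet_loss`, `WEIGHTS`, and `FEATURES` are hypothetical.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed with a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def subnet_logits(weights, features, width_ratio):
    """Logits of a sub-net that keeps only the first k input channels.

    Assumption: sub-nets are nested channel prefixes. The paper instead
    searches for which channels to keep; that search is not modeled here.
    """
    k = max(1, math.ceil(width_ratio * len(features)))
    return [sum(w * x for w, x in zip(row[:k], features[:k])) for row in weights]

def multi_subnet_loss(weights, features, label, width_ratios):
    """Sum of cross-entropy terms, one per sampled sub-net."""
    return sum(
        softmax_cross_entropy(subnet_logits(weights, features, r), label)
        for r in width_ratios
    )

# Toy data (hypothetical): 2 classes, 4 input channels, shared weights.
WEIGHTS = [[1.0, 0.0, 0.5, 0.0],
           [0.0, 1.0, 0.0, 0.5]]
FEATURES = [2.0, 0.0, 1.0, 0.0]

# One loss value that combines three sub-net widths of the same model.
LOSS = multi_subnet_loss(WEIGHTS, FEATURES, label=0, width_ratios=[0.25, 0.5, 1.0])
```

Because the terms are summed over shared weights, every sampled sub-net influences the same parameters, which is one way to see why the choice of sub-nets matters for the final dynamic model.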
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Resource-Efficient Sensor Fusion via System-Wide Dynamic Gated Neural Networks [16.0018681576301]
We propose a novel algorithmic strategy called Quantile-constrained Inference (QIC).
QIC makes joint, high-quality, swift decisions on all the above aspects of the system.
Our results confirm that QIC matches the optimum and outperforms its alternatives by over 80%.
arXiv Detail & Related papers (2024-10-22T06:12:04Z)
- CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads [4.556037016746581]
This article addresses the challenges inherent in optimising the execution of deep neural networks (DNNs) on mobile devices.
We introduce CARIn, a novel framework designed for the optimised deployment of both single- and multi-DNN applications.
We observe a substantial enhancement in the fair treatment of the problem's objectives, reaching 1.92x when compared to single-model designs and up to 10.69x in contrast to the state-of-the-art OODIn framework.
arXiv Detail & Related papers (2024-09-02T09:18:11Z)
- SimQ-NAS: Simultaneous Quantization Policy and Neural Architecture Search [6.121126813817338]
Recent one-shot Neural Architecture Search algorithms rely on training a hardware-agnostic super-network tailored to a specific task and then extracting efficient sub-networks for different hardware platforms.
We show that by using multi-objective search algorithms paired with lightly trained predictors, we can efficiently search for both the sub-network architecture and the corresponding quantization policy.
arXiv Detail & Related papers (2023-12-19T22:08:49Z)
- Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload on edge devices.
We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for the sparse multi-DNN scheduling.
Our proposed approach outperforms the state-of-the-art methods with up to 10% decrease in latency constraint violation rate and nearly 4X reduction in average normalized turnaround time.
arXiv Detail & Related papers (2023-10-17T09:25:17Z)
- Towards Enabling Dynamic Convolution Neural Network Inference for Edge Intelligence [0.0]
Recent advances in edge intelligence require CNN inference on the edge network to increase throughput and reduce latency.
To provide flexibility, dynamic parameter allocation to different mobile devices is required to implement either a predefined or defined on-the-fly CNN architecture.
We propose a library-based approach to design scalable and dynamic distributed CNN inference on the fly.
arXiv Detail & Related papers (2022-02-18T22:33:42Z)
- Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped.
On the CIFAR-10 dataset, LC-Net results in up to 11.9$\times$ fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods.
On the ImageNet dataset, LC-Net achieves up to 1.4$\times$ fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
arXiv Detail & Related papers (2020-07-29T23:17:48Z)
- Policy-GNN: Aggregation Optimization for Graph Neural Networks [60.50932472042379]
Graph neural networks (GNNs) aim to model the local graph structures and capture the hierarchical patterns by aggregating the information from neighbors.
It is a challenging task to develop an effective aggregation strategy for each node, given complex graphs and sparse features.
We propose Policy-GNN, a meta-policy framework that models the sampling procedure and message passing of GNNs into a combined learning process.
arXiv Detail & Related papers (2020-06-26T17:03:06Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)