Related papers: A Progressive Sub-Network Searching Framework for Dynamic Inference

A Progressive Sub-Network Searching Framework for Dynamic Inference

URL: http://arxiv.org/abs/2009.05681v1
Date: Fri, 11 Sep 2020 22:56:02 GMT
Title: A Progressive Sub-Network Searching Framework for Dynamic Inference
Authors: Li Yang, Zhezhi He, Yu Cao, Deliang Fan
Abstract summary: We propose a progressive sub-net searching framework, which is embedded with several effective techniques, including trainable noise ranking, channel group and fine-tuning threshold setting, sub-nets re-selection. Our proposed method achieves much better dynamic inference accuracy compared with prior popular Universally-Slimmable-Network by 4.4%-maximally and 2.3%-averagely in ImageNet dataset with the same model size.
Score: 33.93841415140311
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many techniques have been developed, such as model compression, to make Deep Neural Networks (DNNs) inference more efficiently. Nevertheless, DNNs still lack excellent run-time dynamic inference capability to enable users trade-off accuracy and computation complexity (i.e., latency on target hardware) after model deployment, based on dynamic requirements and environments. Such research direction recently draws great attention, where one realization is to train the target DNN through a multiple-term objective function, which consists of cross-entropy terms from multiple sub-nets. Our investigation in this work show that the performance of dynamic inference highly relies on the quality of sub-net sampling. With objective to construct a dynamic DNN and search multiple high quality sub-nets with minimal searching cost, we propose a progressive sub-net searching framework, which is embedded with several effective techniques, including trainable noise ranking, channel group and fine-tuning threshold setting, sub-nets re-selection. The proposed framework empowers the target DNN with better dynamic inference capability, which outperforms prior works on both CIFAR-10 and ImageNet dataset via comprehensive experiments on different network structures. Taken ResNet18 as an example, our proposed method achieves much better dynamic inference accuracy compared with prior popular Universally-Slimmable-Network by 4.4%-maximally and 2.3%-averagely in ImageNet dataset with the same model size.

Related papers

CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics [7.696109414724968]
Spiking neural networks (SNNs) are promising for event-based object recognition and detection. Existing SNN frameworks often fail to handle multi-scaletemporal features, leading to increased data redundancy and reduced accuracy. We propose CREST, a novel conjointly-trained spike-driven framework to exploit event-based object detection.
arXiv Detail & Related papers (2024-12-17T04:33:31Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Resource-Efficient Sensor Fusion via System-Wide Dynamic Gated Neural Networks [16.0018681576301]
We propose a novel algorithmic strategy called Quantile-constrained Inference (QIC) QIC makes joint, high-quality, swift decisions on all the above aspects of the system. Our results confirm that QIC matches the optimum and outperforms its alternatives by over 80%.
arXiv Detail & Related papers (2024-10-22T06:12:04Z)
CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads [4.556037016746581]
This article addresses the challenges inherent in optimising the execution of deep neural networks (DNNs) on mobile devices. We introduce CARIn, a novel framework designed for the optimised deployment of both single- and multi-DNN applications. We observe a substantial enhancement in the fair treatment of the problem's objectives, reaching 1.92x when compared to single-model designs and up to 10.69x in contrast to the state-of-the-art OODIn framework.
arXiv Detail & Related papers (2024-09-02T09:18:11Z)
SimQ-NAS: Simultaneous Quantization Policy and Neural Architecture Search [6.121126813817338]
Recent one-shot Neural Architecture Search algorithms rely on training a hardware-agnostic super-network tailored to a specific task and then extracting efficient sub-networks for different hardware platforms. We show that by using multi-objective search algorithms paired with lightly trained predictors, we can efficiently search for both the sub-network architecture and the corresponding quantization policy.
arXiv Detail & Related papers (2023-12-19T22:08:49Z)
Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload in both edge devices. We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for the sparse multi-DNN scheduling. Our proposed approach outperforms the state-of-the-art methods with up to 10% decrease in latency constraint violation rate and nearly 4X reduction in average normalized turnaround time.
arXiv Detail & Related papers (2023-10-17T09:25:17Z)
Towards Enabling Dynamic Convolution Neural Network Inference for Edge Intelligence [0.0]
Recent advances in edge intelligence require CNN inference on edge network to increase throughput and reduce latency. To provide flexibility, dynamic parameter allocation to different mobile devices is required to implement either a predefined or defined on-the-fly CNN architecture. We propose a library-based approach to design scalable and dynamic distributed CNN inference on the fly.
arXiv Detail & Related papers (2022-02-18T22:33:42Z)
Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames. Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks. We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment. We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes. Our design is based on a novel bilevel optimization formulation which ensures certain fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. On the CIFAR-10 dataset, LC-Net results in up to 11.9$times$ fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4$times$ fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
arXiv Detail & Related papers (2020-07-29T23:17:48Z)
Policy-GNN: Aggregation Optimization for Graph Neural Networks [60.50932472042379]
Graph neural networks (GNNs) aim to model the local graph structures and capture the hierarchical patterns by aggregating the information from neighbors. It is a challenging task to develop an effective aggregation strategy for each node, given complex graphs and sparse features. We propose Policy-GNN, a meta-policy framework that models the sampling procedure and message passing of GNNs into a combined learning process.
arXiv Detail & Related papers (2020-06-26T17:03:06Z)
Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability. Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network. Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.