DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search
- URL: http://arxiv.org/abs/2011.02166v2
- Date: Thu, 7 Apr 2022 07:29:46 GMT
- Title: DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search
- Authors: Yushuo Guan, Ning Liu, Pengyu Zhao, Zhengping Che, Kaigui Bian, Yanzhi Wang, Jian Tang
- Abstract summary: Convolutional neural networks have achieved great success on computer vision tasks, but at a large computation overhead.
Structured (channel) pruning is usually applied to reduce model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules, which may lead to a tremendous pruning space.
- Score: 55.164053971213576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks have achieved great success on
computer vision tasks, but their large computation overhead hinders efficient
deployment. Structured (channel) pruning is usually applied to reduce model
redundancy while preserving the network structure, so that the pruned network
can be easily deployed in practice. However, existing structured pruning
methods require hand-crafted rules, which may lead to a tremendous pruning
space. In this paper, we introduce Differentiable Annealing Indicator Search
(DAIS), which leverages the strength of neural architecture search in channel
pruning and automatically searches for an effective pruned model under given
constraints on computation overhead. Specifically, DAIS relaxes the binarized
channel indicators to be continuous and then jointly learns both the
indicators and the model parameters via bi-level optimization. To bridge the
non-negligible discrepancy between the continuous model and the target
binarized model, DAIS proposes an annealing-based procedure to steer the
indicator convergence towards binarized states. Moreover, DAIS designs various
regularizations based on a priori structural knowledge to control the pruning
sparsity and to improve model performance. Experimental results show that DAIS
outperforms state-of-the-art pruning methods on CIFAR-10, CIFAR-100, and
ImageNet.
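To make the relaxation-and-annealing idea concrete, the following is a minimal
PyTorch sketch, not the authors' implementation: the module name, the sigmoid
parameterization, and the decay schedule are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AnnealedChannelIndicator(nn.Module):
    """Continuous relaxation of per-channel binary indicators.

    Each channel gets a learnable logit; sigmoid(logit / T) gives a soft
    indicator in (0, 1). Annealing the temperature T toward zero pushes the
    indicators toward {0, 1}, shrinking the gap between the relaxed model
    and the final binarized (pruned) model.
    """

    def __init__(self, num_channels: int, init_temperature: float = 1.0):
        super().__init__()
        # Small random init breaks symmetry between channels.
        self.logits = nn.Parameter(0.01 * torch.randn(num_channels))
        self.temperature = init_temperature

    def anneal(self, decay: float = 0.97) -> None:
        # Called once per epoch: sigmoid(logits / T) -> step function as T -> 0.
        self.temperature = max(self.temperature * decay, 1e-3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W); scale each channel by its soft indicator.
        indicator = torch.sigmoid(self.logits / self.temperature)
        return x * indicator.view(1, -1, 1, 1)
```

In a bi-level setup, the indicator logits would typically be updated on
held-out data while the convolution weights are updated on training data;
after convergence, channels whose indicators land near zero are removed.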
Related papers
- Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems.
This work considers AD in network flows using incomplete measurements.
We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective.
Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
- Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z)
- Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity [0.0]
Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of $1.25\times$ with a minimal accuracy drop of $1.1\%$ for the ResNet18 model on the ImageNet dataset.
arXiv Detail & Related papers (2023-09-12T22:28:53Z)
- Graph-based Algorithm Unfolding for Energy-aware Power Allocation in Wireless Networks [27.600081147252155]
We develop a novel graph-based algorithm-unfolding framework to maximize energy efficiency in wireless communication networks.
We show the permutation equivariance of the model, a desirable property for models of wireless network data.
Results demonstrate its generalizability across different network topologies.
arXiv Detail & Related papers (2022-01-27T20:23:24Z)
- CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy at similar computation cost, or similar accuracy at lower cost, than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is suitable for adaptively pruning efficient networks for various classification subtasks, facilitating the deployment and use of deep networks in real-world applications.
arXiv Detail & Related papers (2021-10-21T06:26:31Z)
- Layer Pruning on Demand with Intermediate CTC [50.509073206630994]
We present a training and pruning method for ASR based on connectionist temporal classification (CTC).
We show that a Transformer-CTC model can be pruned to various depths on demand, improving the real-time factor from 0.005 to 0.002 on GPU.
arXiv Detail & Related papers (2021-06-17T02:40:18Z)
- TSAM: Temporal Link Prediction in Directed Networks based on Self-Attention Mechanism [2.5144068869465994]
We propose a deep learning model based on graph convolutional networks (GCN) and a self-attention mechanism, namely TSAM.
We run comparative experiments on four real-world networks to validate the effectiveness of TSAM.
arXiv Detail & Related papers (2020-08-23T11:56:40Z)
- Operation-Aware Soft Channel Pruning using Differentiable Masks [51.04085547997066]
We propose a data-driven algorithm that compresses deep neural networks in a differentiable way by exploiting the characteristics of operations; a minimal generic sketch of this style of differentiable channel mask appears after this list.
We perform extensive experiments and achieve outstanding accuracy with the pruned networks.
arXiv Detail & Related papers (2020-07-08T07:44:00Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
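For the differentiable-mask style of channel pruning referenced in the
Operation-Aware Soft Channel Pruning entry above, here is a minimal generic
PyTorch sketch under stated assumptions: SoftMaskedConv, sparsity_penalty, and
the L1 penalty weight are hypothetical names and choices, not the paper's
actual algorithm.

```python
import torch
import torch.nn as nn

class SoftMaskedConv(nn.Module):
    """Convolution whose output channels are scaled by learnable soft masks.

    Channels whose masks are driven toward zero by a sparsity penalty can be
    removed after training without disturbing the surviving channels.
    """

    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False)
        self.mask_logits = nn.Parameter(torch.zeros(out_ch))

    def mask(self) -> torch.Tensor:
        # Soft mask in (0, 1) for each output channel.
        return torch.sigmoid(self.mask_logits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x) * self.mask().view(1, -1, 1, 1)

def sparsity_penalty(layers, weight: float = 1e-3) -> torch.Tensor:
    # L1 penalty on the soft masks pushes channels toward zero.
    return weight * sum(layer.mask().abs().sum() for layer in layers)
```

During training, sparsity_penalty(layers) would be added to the task loss;
channels whose masks converge near zero are then removed, yielding a smaller
dense network.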
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.