GDP: Stabilized Neural Network Pruning via Gates with Differentiable
Polarization
- URL: http://arxiv.org/abs/2109.02220v2
- Date: Wed, 8 Sep 2021 07:51:17 GMT
- Title: GDP: Stabilized Neural Network Pruning via Gates with Differentiable
Polarization
- Authors: Yi Guo, Huan Yuan, Jianchao Tan, Zhangyang Wang, Sen Yang, Ji Liu
- Abstract summary: Gate-based or importance-based pruning methods aim to remove the channels with the smallest importance.
GDP can be plugged in before convolutional layers, without bells and whistles, to control the on-and-off state of each channel.
Experiments conducted on the CIFAR-10 and ImageNet datasets show that the proposed GDP achieves state-of-the-art performance.
- Score: 84.57695474130273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model compression techniques have recently attracted explosive
attention as a way to obtain efficient AI models for various real-time
applications. Channel pruning is one important compression strategy and is
widely used to slim various DNNs. Previous gate-based or importance-based
pruning methods aim to remove the channels whose importance is smallest.
However, it remains unclear by what criteria channel importance should be
measured, which has led to a variety of channel-selection heuristics. Other
sampling-based pruning methods deploy sampling strategies to train sub-nets,
which often causes training instability and degrades the compressed model's
performance. In view of these research gaps, we present a new module named
Gates with Differentiable Polarization (GDP), inspired by principled
optimization ideas. GDP can be plugged in before convolutional layers,
without bells and whistles, to control the on-and-off state of each channel
or whole layer block. During training, the polarization effect drives a
subset of gates to smoothly decrease to exactly zero, while the other gates
gradually move away from zero by a large margin. When training terminates,
the zero-gated channels can be painlessly removed, while the remaining
non-zero gates are absorbed into the succeeding convolution kernel, causing
no interruption to training and no damage to the trained model. Experiments
on the CIFAR-10 and ImageNet datasets show that the proposed GDP algorithm
achieves state-of-the-art performance on various benchmark DNNs over a broad
range of pruning ratios. We also apply GDP to DeepLabV3Plus-ResNet50 on the
challenging Pascal VOC segmentation task, where test performance sees no
drop (and even improves slightly) with over 60% FLOPs savings.
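To make the gating mechanism above concrete, the following PyTorch-style sketch shows a per-channel gate placed before a convolution, an illustrative polarization-style penalty, and the gate-absorption step. The ChannelGate class, the particular penalty form, and the hyperparameter t are assumptions made for illustration; they are not the paper's exact parameterization or regularizer.

# Minimal sketch, assuming PyTorch; not the authors' exact GDP formulation.
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Per-channel gate plugged in before a convolution."""
    def __init__(self, num_channels: int):
        super().__init__()
        self.g = nn.Parameter(torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W); scale each channel by its gate value.
        return x * self.g.view(1, -1, 1, 1)

    def polarization_penalty(self, t: float = 1.2) -> torch.Tensor:
        # Illustrative polarization-style term: shrink all gates (L1)
        # while rewarding spread around the mean, so gates are pushed
        # either to exactly zero or clearly away from zero.
        g = self.g
        return t * g.abs().sum() - (g - g.mean()).abs().sum()

def absorb_gate(gate: ChannelGate, conv: nn.Conv2d) -> None:
    """Fold the (non-zero) gates into the succeeding conv's input channels,
    so the gate module can be dropped without changing the network output."""
    with torch.no_grad():
        conv.weight.mul_(gate.g.view(1, -1, 1, 1))

# Example usage (shapes assumed): gate = ChannelGate(64);
# conv = nn.Conv2d(64, 128, 3, padding=1); y = conv(gate(x));
# loss = task_loss + lam * gate.polarization_penalty()

Channels whose gates reach exactly zero can then be removed together with the corresponding slices of the convolution weight, which is what makes the pruning step painless at the end of training.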
Related papers
- Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing [50.79602839359522]
We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
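As a rough, purely illustrative companion to this summary (not the HASTE module itself), the sketch below hashes the channels of a feature map with sign random projections, a simple form of LSH, and averages channels that collide in the same bucket; the function lsh_merge_channels and its parameters are hypothetical.

# Illustrative LSH-based channel merging; the actual HASTE scheme is in the paper.
import torch

def lsh_merge_channels(x: torch.Tensor, num_bits: int = 8, seed: int = 0):
    """x: (C, H, W) feature map. Returns (merged feature map, bucket ids)."""
    C = x.shape[0]
    flat = x.reshape(C, -1)                      # one vector per channel
    gen = torch.Generator().manual_seed(seed)
    planes = torch.randn(flat.shape[1], num_bits, generator=gen)
    codes = (flat @ planes > 0).int()            # sign-random-projection hash
    weights = 2 ** torch.arange(num_bits)        # pack bits into integer ids
    buckets = (codes * weights).sum(dim=1)
    merged = flat.clone()
    for b in buckets.unique():
        idx = (buckets == b).nonzero(as_tuple=True)[0]
        merged[idx] = flat[idx].mean(dim=0)      # share one representative
    return merged.reshape_as(x), buckets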
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
- Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices [3.591566487849146]
Binary neural networks (BNNs) tackle the resource constraints of such devices with extreme compression and speed-up gains compared to real-valued models.
We propose a simple but effective method to accelerate inference through unifying BNNs with an early-exiting strategy.
Our approach allows simple instances to exit early based on a decision threshold and utilizes output layers added to different intermediate layers to avoid executing the entire binary model.
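A hedged sketch of such threshold-based early exiting follows; the EarlyExitNet wrapper, the softmax-confidence rule, and the default threshold are illustrative assumptions rather than the paper's exact design, and backbone binarization is orthogonal to the exit logic, so it is omitted here.

# Minimal sketch, assuming PyTorch; exit placement and threshold are illustrative.
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    def __init__(self, stages: nn.ModuleList, exits: nn.ModuleList,
                 threshold: float = 0.9):
        super().__init__()
        # exits[i] is a classifier head attached after stages[i];
        # the last exit plays the role of the full model's output layer.
        self.stages, self.exits, self.threshold = stages, exits, threshold

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for stage, exit_head in zip(self.stages, self.exits):
            x = stage(x)
            logits = exit_head(x)
            conf = logits.softmax(dim=-1).max(dim=-1).values
            if bool((conf >= self.threshold).all()):
                return logits   # confident enough: skip the remaining stages
        return logits           # fell through: use the final exit's prediction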
arXiv Detail & Related papers (2022-06-17T22:11:11Z)
- CHEX: CHannel EXploration for CNN Model Compression [47.3520447163165]
We propose a novel Channel Exploration methodology, dubbed CHEX, to rectify these problems.
CHEX repeatedly prunes and regrows channels throughout the training process, which reduces the risk of prematurely pruning important channels.
Results demonstrate that CHEX can effectively reduce the FLOPs of diverse CNN architectures on a variety of computer vision tasks.
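The sketch below illustrates the prune-and-regrow pattern on a per-channel keep mask; the scoring and regrowth rules used by CHEX itself are more elaborate, and the function signature here is hypothetical.

# Schematic only; CHEX's actual criteria are defined in the cited paper.
import torch

def prune_and_regrow(channel_mask: torch.Tensor,
                     scores: torch.Tensor,
                     keep: int,
                     regrow: int) -> torch.Tensor:
    """channel_mask: (C,) float mask of 0/1, scores: (C,) importance scores.
    Keep the `keep` highest-scoring active channels, then randomly regrow
    `regrow` of the currently inactive ones."""
    new_mask = torch.zeros_like(channel_mask)
    active_scores = scores.masked_fill(channel_mask == 0, float("-inf"))
    new_mask[active_scores.topk(keep).indices] = 1.0
    inactive = (new_mask == 0).nonzero(as_tuple=True)[0]
    if regrow > 0 and inactive.numel() > 0:
        pick = inactive[torch.randperm(inactive.numel())[:regrow]]
        new_mask[pick] = 1.0   # give previously pruned channels another chance
    return new_mask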
arXiv Detail & Related papers (2022-03-29T17:52:41Z)
- CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is suitable for adaptively pruning efficient networks for various classification subtasks, facilitating the practical deployment and use of deep networks in real-world applications.
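To make the trace-ratio idea concrete, the following Fisher-style sketch scores each channel by its between-class scatter divided by its within-class scatter over pooled activations; CATRO's actual criterion and optimization procedure are defined in the paper, so this should be read only as an illustration.

# Fisher-style per-channel discriminativeness score (illustrative, not CATRO's).
import torch

def trace_ratio_scores(feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """feats: (N, C) pooled per-channel activations, labels: (N,) class ids."""
    overall_mean = feats.mean(dim=0)
    between = torch.zeros(feats.shape[1])
    within = torch.zeros(feats.shape[1])
    for c in labels.unique():
        cls = feats[labels == c]
        cls_mean = cls.mean(dim=0)
        between += cls.shape[0] * (cls_mean - overall_mean) ** 2
        within += ((cls - cls_mean) ** 2).sum(dim=0)
    return between / within.clamp_min(1e-8)   # higher = more class-discriminative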
arXiv Detail & Related papers (2021-10-21T06:26:31Z)
- Only Train Once: A One-Shot Neural Network Training And Pruning Framework [31.959625731943675]
Structured pruning is a commonly used technique in deploying deep neural networks (DNNs) onto resource-constrained devices.
We propose Only-Train-Once (OTO), a framework that produces slimmer DNNs with competitive performance and significant FLOPs reductions.
OTO contains two keys: (i) we partition the parameters of DNNs into zero-invariant groups, enabling us to prune zero groups without affecting the output; and (ii) to promote zero groups, we formulate a structured-sparsity optimization problem and solve it with a novel algorithm, Half-Space Projected Gradient (HSPG).
To demonstrate the effectiveness of OTO, we train and ...
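Below is a small sketch of the zero-invariant-group idea, assuming a convolution followed by batch normalization: each output filter is grouped with its bias and BatchNorm parameters, so a group whose entries are all zero can be removed without affecting the output. The helper names and the Conv-BN pairing are assumptions, and the HSPG optimizer itself is not reproduced here.

# Illustrative zero-invariant grouping for a Conv2d + BatchNorm2d pair.
import torch
import torch.nn as nn

def zero_invariant_groups(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """One parameter group per output channel of conv."""
    groups = []
    for k in range(conv.out_channels):
        tensors = [conv.weight[k].reshape(-1), bn.weight[k:k+1], bn.bias[k:k+1]]
        if conv.bias is not None:
            tensors.append(conv.bias[k:k+1])
        groups.append(torch.cat(tensors))
    return groups

def prunable_channels(conv: nn.Conv2d, bn: nn.BatchNorm2d, tol: float = 1e-12):
    """Indices of groups whose norm is (numerically) zero and can be removed."""
    return [k for k, g in enumerate(zero_invariant_groups(conv, bn))
            if g.norm() <= tol]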
arXiv Detail & Related papers (2021-07-15T17:15:20Z)
- BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening [63.081808698068365]
This work presents a probabilistic channel pruning method to accelerate Convolutional Neural Networks (CNNs).
Previous pruning methods often zero out unimportant channels during training in a deterministic manner, which reduces the CNN's learning capacity and results in suboptimal performance.
We develop a probability-based pruning algorithm, called batch whitening channel pruning (BWCP), which discards unimportant channels by modeling the probability of a channel being activated.
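As a purely illustrative companion: if a channel's normalized pre-activation is modeled as standard normal, the BatchNorm affine transform followed by ReLU activates the channel with probability Phi(beta / |gamma|), which the sketch below computes from BatchNorm parameters. BWCP's batch-whitening formulation is more involved; none of this is taken from the paper.

# Illustrative activation probability under a Gaussian assumption, not BWCP itself.
import torch
import torch.nn as nn

def activation_probability(bn: nn.BatchNorm2d) -> torch.Tensor:
    gamma, beta = bn.weight.detach(), bn.bias.detach()
    normal = torch.distributions.Normal(0.0, 1.0)
    # P(gamma * z + beta > 0) with z ~ N(0, 1) equals Phi(beta / |gamma|).
    return normal.cdf(beta / gamma.abs().clamp_min(1e-8))  # (C,) values in [0, 1]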
arXiv Detail & Related papers (2021-05-13T17:00:05Z)
- Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm [5.621336109915588]
We show for the first time that sparse pruning compresses a BERT model significantly more than reducing its number of channels and layers.
Our method outperforms the leading competitors with a 20-times weight/FLOPs compression and negligible loss in prediction accuracy.
arXiv Detail & Related papers (2021-04-18T02:20:37Z)
- DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search [55.164053971213576]
Convolutional neural networks have achieved great success in computer vision tasks, despite large computation overhead.
Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules, which may lead to a tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z)
- Discrimination-aware Network Pruning for Deep Model Compression [79.44318503847136]
Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones.
We propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.
Experiments on both image classification and face recognition demonstrate the effectiveness of our methods.
arXiv Detail & Related papers (2020-01-04T07:07:41Z)