Related papers: A Proximal Algorithm for Network Slimming

A Proximal Algorithm for Network Slimming

URL: http://arxiv.org/abs/2307.00684v2
Date: Tue, 30 Jan 2024 08:51:01 GMT
Title: A Proximal Algorithm for Network Slimming
Authors: Kevin Bui, Fanghui Xue, Fredrick Park, Yingyong Qi, Jack Xin
Abstract summary: A popular channel pruning method for convolutional neural networks (CNNs) uses subgradient descent to train CNNs. We develop an alternative algorithm called proximal NS to train CNNs towards sparse, accurate structures. Our experiments demonstrate that after one round of training, proximal NS yields a CNN with competitive accuracy and compression.
Score: 2.8148957592979427
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As a popular channel pruning method for convolutional neural networks (CNNs), network slimming (NS) has a three-stage process: (1) it trains a CNN with $\ell_1$ regularization applied to the scaling factors of the batch normalization layers; (2) it removes channels whose scaling factors are below a chosen threshold; and (3) it retrains the pruned model to recover the original accuracy. This time-consuming, three-step process is a result of using subgradient descent to train CNNs. Because subgradient descent does not exactly train CNNs towards sparse, accurate structures, the latter two steps are necessary. Moreover, subgradient descent does not have any convergence guarantee. Therefore, we develop an alternative algorithm called proximal NS. Our proposed algorithm trains CNNs towards sparse, accurate structures, so identifying a scaling factor threshold is unnecessary and fine tuning the pruned CNNs is optional. Using Kurdyka-{\L}ojasiewicz assumptions, we establish global convergence of proximal NS. Lastly, we validate the efficacy of the proposed algorithm on VGGNet, DenseNet and ResNet on CIFAR 10/100. Our experiments demonstrate that after one round of training, proximal NS yields a CNN with competitive accuracy and compression.

Related papers

On the rates of convergence for learning with convolutional neural networks [9.772773527230134]
We study approximation and learning capacities of convolutional neural networks (CNNs) with one-side zero-padding and multiple channels. We derive convergence rates for estimators based on CNNs in many learning problems. It is also shown that the obtained rates for classification are minimax optimal in some common settings.
arXiv Detail & Related papers (2024-03-25T06:42:02Z)
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve. We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap. This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
Improved techniques for deterministic l2 robustness [63.34032156196848]
Training convolutional neural networks (CNNs) with a strict 1-Lipschitz constraint under the $l_2$ norm is useful for adversarial robustness, interpretable gradients and stable training. We introduce a procedure to certify robustness of 1-Lipschitz CNNs by replacing the last linear layer with a 1-hidden layer. We significantly advance the state-of-the-art for standard and provable robust accuracies on CIFAR-10 and CIFAR-100.
arXiv Detail & Related papers (2022-11-15T19:10:12Z)
A Passive Similarity based CNN Filter Pruning for Efficient Acoustic Scene Classification [23.661189257759535]
We present a method to develop low-complexity convolutional neural networks (CNNs) for acoustic scene classification (ASC) We propose a passive filter pruning framework, where a few convolutional filters from the CNNs are eliminated to yield compressed CNNs. The proposed method is simple, reduces computations per inference by 27%, with 25% fewer parameters, with less than 1% drop in accuracy.
arXiv Detail & Related papers (2022-03-29T17:00:06Z)
Recursive Least Squares for Training and Pruning Convolutional Neural Networks [27.089496826735672]
Convolutional neural networks (CNNs) have succeeded in many practical applications. High computation and storage requirements make them difficult to deploy on resource-constrained devices. We propose a novel algorithm for training and pruning CNNs.
arXiv Detail & Related papers (2022-01-13T07:14:08Z)
Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval. Traditional kNN algorithm simply retrieves a same number of nearest neighbors for each target token. We propose Adaptive kNN-MT to dynamically determine the number of k for each target token.
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks. Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
ACP: Automatic Channel Pruning via Clustering and Swarm Intelligence Optimization for CNN [6.662639002101124]
convolutional neural network (CNN) gets deeper and wider in recent years. Existing magnitude-based pruning methods are efficient, but the performance of the compressed network is unpredictable. We propose a novel automatic channel pruning method (ACP) ACP is evaluated against several state-of-the-art CNNs on three different classification datasets.
arXiv Detail & Related papers (2021-01-16T08:56:38Z)
Deep learning for gravitational-wave data analysis: A resampling white-box approach [62.997667081978825]
We apply Convolutional Neural Networks (CNNs) to detect gravitational wave (GW) signals of compact binary coalescences, using single-interferometer data from LIGO detectors. CNNs were quite precise to detect noise but not sensitive enough to recall GW signals, meaning that CNNs are better for noise reduction than generation of GW triggers.
arXiv Detail & Related papers (2020-09-09T03:28:57Z)
Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes. We derive approximation and estimation error rates of the aformentioned type of CNNs for the Barron and H"older classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.