Recursive Least Squares for Training and Pruning Convolutional Neural
Networks
- URL: http://arxiv.org/abs/2201.04813v1
- Date: Thu, 13 Jan 2022 07:14:08 GMT
- Title: Recursive Least Squares for Training and Pruning Convolutional Neural
Networks
- Authors: Tianzong Yu, Chunyuan Zhang, Yuan Wang, Meng Ma and Qi Song
- Abstract summary: Convolutional neural networks (CNNs) have succeeded in many practical applications.
Their high computation and storage requirements make them difficult to deploy on resource-constrained devices.
We propose a novel algorithm for training and pruning CNNs.
- Score: 27.089496826735672
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks (CNNs) have succeeded in many practical
applications. However, their high computation and storage requirements often
make them difficult to deploy on resource-constrained devices. In order to
tackle this issue, many pruning algorithms have been proposed for CNNs, but
most of them can't prune CNNs to a reasonable level. In this paper, we propose
a novel algorithm for training and pruning CNNs based on the recursive least
squares (RLS) optimization. After training a CNN for some epochs, our algorithm
combines inverse input autocorrelation matrices and weight matrices to evaluate
and prune unimportant input channels or nodes layer by layer. Then, our
algorithm will continue to train the pruned network, and will not perform the
next pruning until the pruned network recovers the full performance of the
unpruned network. Besides CNNs, the proposed algorithm can also be used for
feedforward neural networks (FNNs). Three experiments on the MNIST, CIFAR-10
and SVHN datasets show that our algorithm achieves more reasonable pruning and
higher learning efficiency than four other popular pruning algorithms.
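As a rough illustration of the pruning criterion described in the abstract, the sketch below maintains a layer's inverse input autocorrelation matrix with a standard RLS (Sherman-Morrison) update and combines its diagonal with the layer's weight columns to rank input nodes or channels. The specific importance score, the forgetting factor `lam`, and the `keep_ratio` parameter are illustrative assumptions, not the paper's exact formulas.
```python
"""Minimal sketch of RLS-style statistics used to rank and prune input nodes.

Assumptions (not taken from the paper): the importance score (weight-column
norm divided by sqrt of the corresponding diagonal of the inverse
autocorrelation matrix P), the forgetting factor `lam`, and `keep_ratio`.
"""
import numpy as np

def rls_update(P, x, lam=0.99):
    """Sherman-Morrison update of the inverse input autocorrelation matrix P."""
    Px = P @ x
    return (P - np.outer(Px, Px) / (lam + x @ Px)) / lam

def importance_scores(W, P):
    """Combine weight columns with inverse-autocorrelation diagonals.

    W has shape (out_dim, in_dim); P has shape (in_dim, in_dim).
    A small P[j, j] means input j carried strong autocorrelation,
    so dividing by sqrt(P[j, j]) boosts its score.
    """
    return np.linalg.norm(W, axis=0) / np.sqrt(np.diag(P))

def prune_inputs(W, P, keep_ratio=0.5):
    """Keep the highest-scoring input nodes/channels of one layer."""
    scores = importance_scores(W, P)
    k = max(1, int(keep_ratio * W.shape[1]))
    keep = np.sort(np.argsort(scores)[-k:])
    return W[:, keep], P[np.ix_(keep, keep)], keep

# Toy usage: accumulate P over a few random inputs, then prune the layer.
rng = np.random.default_rng(0)
in_dim, out_dim = 8, 4
W = rng.normal(size=(out_dim, in_dim))
P = np.eye(in_dim) * 1e2            # common RLS initialisation: large diagonal
for _ in range(100):
    P = rls_update(P, rng.normal(size=in_dim))
W_pruned, P_pruned, kept = prune_inputs(W, P, keep_ratio=0.5)
print(kept, W_pruned.shape)
```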
Related papers
- Training Convolutional Neural Networks with the Forward-Forward
algorithm [1.74440662023704]
The Forward-Forward (FF) algorithm has so far only been used in fully connected networks.
We show how the FF paradigm can be extended to CNNs.
Our FF-trained CNN, featuring a novel spatially-extended labeling technique, achieves a classification accuracy of 99.16% on the MNIST hand-written digits dataset.
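For context, the Forward-Forward algorithm trains each layer locally by pushing a "goodness" measure (typically the sum of squared activations) above a threshold for positive data and below it for negative data, with no backpropagated global error. Below is a minimal single-layer sketch under those standard choices; the threshold `theta`, learning rate, and logistic objective are illustrative, and the paper's spatially-extended labeling for CNNs is not modelled here.
```python
"""Minimal sketch of one Forward-Forward (FF) layer update.

Assumptions: sum-of-squares goodness, logistic objective, threshold `theta`,
and plain SGD; the spatially-extended labeling from the paper is omitted.
"""
import numpy as np

def ff_layer_step(W, x_pos, x_neg, theta=2.0, lr=0.03):
    """One local update: raise goodness of positive data, lower it for negative."""
    def grad_neg_log_prob(x, sign):
        h = np.maximum(W @ x, 0.0)                       # ReLU activations
        g = np.sum(h * h)                                # goodness
        p = 1.0 / (1.0 + np.exp(-sign * (g - theta)))    # prob. of "correct" side
        # d(-log p)/dW with dg/dW = 2 * outer(h, x) (ReLU mask is implicit in h)
        return -(1.0 - p) * sign * 2.0 * np.outer(h, x)
    W -= lr * (grad_neg_log_prob(x_pos, +1.0) + grad_neg_log_prob(x_neg, -1.0))
    return W

# Toy usage on random "positive" and "negative" inputs.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(16, 32))
for _ in range(200):
    W = ff_layer_step(W, rng.normal(size=32) + 1.0, rng.normal(size=32) - 1.0)
```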
arXiv Detail & Related papers (2023-12-22T18:56:35Z)
- Class-Aware Pruning for Efficient Neural Networks [5.918784236241883]
Pruning has been introduced to reduce the computational cost of executing deep neural networks (DNNs).
In this paper, we propose a class-aware pruning technique to compress DNNs.
Experimental results confirm that this class-aware pruning technique can significantly reduce the number of weights and FLOPs.
arXiv Detail & Related papers (2023-12-10T13:07:54Z)
- Transferability of Convolutional Neural Networks in Stationary Learning Tasks [96.00428692404354]
We introduce a novel framework for efficient training of convolutional neural networks (CNNs) for large-scale spatial problems.
We show that a CNN trained on small windows of such signals achieves nearly the same performance on much larger windows without retraining.
Our results show that the CNN is able to tackle problems with many hundreds of agents after being trained with fewer than ten.
arXiv Detail & Related papers (2023-07-21T13:51:45Z)
- A Proximal Algorithm for Network Slimming [2.8148957592979427]
Network slimming, a popular channel pruning method for convolutional neural networks (CNNs), relies on subgradient descent during training.
We develop an alternative algorithm called proximal NS to train CNNs towards sparse, accurate structures.
Our experiments demonstrate that after one round of training, proximal NS yields a CNN with competitive accuracy and compression.
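Network slimming places an L1 penalty on the batch-normalization scaling factors so that unimportant channels shrink toward zero; a proximal algorithm handles that penalty with an exact soft-thresholding step rather than a subgradient. The sketch below shows only that proximal update; the penalty strength `lam`, the learning rate, and the surrounding training loop are illustrative assumptions rather than the paper's exact procedure.
```python
"""Minimal sketch of a proximal (soft-thresholding) step on BN scale factors.

Assumption: gradients of the data loss w.r.t. the scaling factors are
computed elsewhere; `lam` (L1 strength) and `lr` are illustrative values.
"""
import numpy as np

def soft_threshold(z, tau):
    """Prox of tau * ||.||_1: shrink every entry toward zero by tau."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def proximal_step(gamma, grad_gamma, lr=0.1, lam=0.05):
    """Gradient step on the smooth loss, then exact prox of the L1 penalty."""
    return soft_threshold(gamma - lr * grad_gamma, lr * lam)

# Toy usage: channels whose scale factors hit exactly zero become prunable.
gamma = np.array([0.9, 0.05, -0.4, 0.002])
gamma = proximal_step(gamma, grad_gamma=np.zeros_like(gamma))
print(gamma, "prunable channels:", np.where(gamma == 0.0)[0])
```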
arXiv Detail & Related papers (2023-07-02T23:34:12Z)
- You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets [105.24703398193843]
Untrained subnetworks in graph neural networks (GNNs) still remain mysterious.
We show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations.
arXiv Detail & Related papers (2022-11-28T14:17:36Z)
- Training Quantized Deep Neural Networks via Cooperative Coevolution [27.967480639403796]
We propose a new method for quantizing deep neural networks (DNNs).
Under the framework of cooperative coevolution, we use the estimation of distribution algorithm to search for the low-bits weights.
Experiments show that our method can train a 4-bit ResNet-20 on the CIFAR-10 dataset without sacrificing accuracy.
arXiv Detail & Related papers (2021-12-23T09:13:13Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Online Limited Memory Neural-Linear Bandits with Likelihood Matching [53.18698496031658]
We study neural-linear bandits for solving problems where both exploration and representation learning play an important role.
We propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online.
arXiv Detail & Related papers (2021-02-07T14:19:07Z)
- RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs [32.431100361351675]
We introduce a Resource Aware Neuron Pruning (RANP) algorithm that prunes 3D CNNs at high sparsity levels.
Specifically, the core idea is to obtain an importance score for each neuron based on its sensitivity to the loss function.
This neuron importance is then reweighted according to the neuron resource consumption related to FLOPs or memory.
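A minimal sketch of that two-step scoring is given below, with the sensitivity approximated by |weight x gradient| summed per neuron and the resource reweighting done by dividing by a per-neuron FLOP count; both choices are illustrative assumptions, not the paper's exact formulas.
```python
"""Minimal sketch of resource-aware neuron scoring (sensitivity / resource cost).

Assumptions: sensitivity approximated by |weight * grad| summed per neuron,
and resource reweighting done by dividing by that neuron's FLOP count.
"""
import numpy as np

def resource_aware_scores(W, grad_W, flops_per_neuron):
    """W, grad_W: (num_neurons, fan_in); flops_per_neuron: (num_neurons,)."""
    sensitivity = np.abs(W * grad_W).sum(axis=1)      # loss sensitivity per neuron
    return sensitivity / np.maximum(flops_per_neuron, 1.0)

def prune_mask(scores, sparsity=0.9):
    """Keep only the top (1 - sparsity) fraction of neurons."""
    k = max(1, int(round((1.0 - sparsity) * scores.size)))
    mask = np.zeros(scores.size, dtype=bool)
    mask[np.argsort(scores)[-k:]] = True
    return mask

# Toy usage with random weights, gradients, and FLOP counts.
rng = np.random.default_rng(0)
W, G = rng.normal(size=(64, 27)), rng.normal(size=(64, 27))
flops = rng.integers(1_000, 10_000, size=64).astype(float)
print(prune_mask(resource_aware_scores(W, G, flops), sparsity=0.9).sum(), "neurons kept")
```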
arXiv Detail & Related papers (2020-10-06T05:34:39Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the performance decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 of the memory cost and run 2x faster on modern GPUs.
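A Bounded ReLU clips activations to a fixed ceiling, which gives the activation quantizer a known range to map onto integers. The sketch below pairs such a clipped ReLU with a uniform 8-bit quantizer; the ceiling value and bit width are illustrative assumptions, not the paper's exact pipeline.
```python
"""Minimal sketch: Bounded ReLU plus uniform 8-bit activation quantization.

Assumptions: ceiling `cap` and unsigned 8-bit quantization are illustrative.
"""
import numpy as np

def bounded_relu(x, cap=8.0):
    """ReLU clipped to [0, cap], so the activation range is known in advance."""
    return np.clip(x, 0.0, cap)

def quantize_activations(x, cap=8.0, bits=8):
    """Map [0, cap] uniformly onto the integers {0, ..., 2**bits - 1}."""
    scale = (2 ** bits - 1) / cap
    return np.round(bounded_relu(x, cap) * scale).astype(np.uint8), scale

# Toy usage: quantize, then dequantize to see the approximation error.
q, scale = quantize_activations(np.array([-1.0, 0.5, 3.2, 12.0]))
print(q, q.astype(np.float32) / scale)
```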
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
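For reference, the benchmark such results are measured against is the classical minimax rate for estimating a β-Hölder function on $[0,1]^d$ from $n$ samples; the statement below is a textbook fact, not a quote of the paper's theorem.
```latex
% Classical minimax rate (up to logarithmic factors) for estimating a
% beta-Hoelder target function on the d-dimensional unit cube from n samples;
% the cited work shows ResNet-type CNN estimators can attain rates of this kind.
\[
  \inf_{\hat{f}} \; \sup_{f_0 \in \mathcal{H}^{\beta}([0,1]^d)}
  \mathbb{E}\, \lVert \hat{f} - f_0 \rVert_{L^2}^{2}
  \;\asymp\; n^{-\frac{2\beta}{2\beta + d}}.
\]
```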
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.