Tensor network compressibility of convolutional models
- URL: http://arxiv.org/abs/2403.14379v2
- Date: Mon, 19 Aug 2024 16:37:36 GMT
- Title: Tensor network compressibility of convolutional models
- Authors: Sukhbinder Singh, Saeed S. Jahromi, Roman Orus
- Abstract summary: We assess how truncating the convolution kernels of dense (untensorized) CNNs impacts their accuracy.
We found that kernels could often be truncated along several cuts, resulting in a significant loss in kernel norm but not in classification accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks (CNNs) are one of the most widely used neural network architectures, showcasing state-of-the-art performance in computer vision tasks. Although larger CNNs generally exhibit higher accuracy, their size can be effectively reduced by "tensorization" while maintaining accuracy, namely, replacing the convolution kernels with compact decompositions such as Tucker or Canonical Polyadic decompositions, or quantum-inspired decompositions such as matrix product states, and directly training the factors in the decompositions to bias the learning towards low-rank decompositions. But why doesn't tensorization seem to impact the accuracy adversely? We explore this by assessing how truncating the convolution kernels of dense (untensorized) CNNs impacts their accuracy. Specifically, we truncated the kernels of (i) a vanilla four-layer CNN and (ii) ResNet-50 pre-trained for image classification on the CIFAR-10 and CIFAR-100 datasets. We found that kernels (especially those inside deeper layers) could often be truncated along several cuts, resulting in a significant loss in kernel norm but not in classification accuracy. This suggests that such "correlation compression" (which underlies tensorization) is an intrinsic feature of how information is encoded in dense CNNs. We also found that aggressively truncated models could often recover the pre-truncation accuracy after only a few epochs of re-training, suggesting that compressing the internal correlations of convolution layers does not often transport the model to a worse minimum. Our results can be applied to tensorize and compress CNN models more effectively.
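To make the "cuts" concrete, here is a minimal NumPy sketch of the kind of truncation the abstract describes: matricize a dense kernel along one index (here the output-channel cut), keep only the top singular values, and fold back. This is not code from the paper; the kernel shape, the choice of cut, and the rank are illustrative assumptions.

```python
# Truncating a dense convolution kernel along one "cut" via SVD.
# Kernel shape (out_ch, in_ch, kh, kw) and rank are illustrative assumptions.
import numpy as np

def truncate_kernel(K: np.ndarray, rank: int):
    """Truncate K along the output-channel cut; return the truncated kernel
    and the fraction of the Frobenius norm retained after truncation."""
    M = K.reshape(K.shape[0], -1)                     # matricize along the cut
    U, S, Vt = np.linalg.svd(M, full_matrices=False)  # singular values of the cut
    M_trunc = (U[:, :rank] * S[:rank]) @ Vt[:rank, :]
    retained = np.linalg.norm(S[:rank]) / np.linalg.norm(S)
    return M_trunc.reshape(K.shape), retained

# Random stand-in for a trained 3x3 kernel mapping 64 -> 128 channels.
K = np.random.randn(128, 64, 3, 3)
K32, frac = truncate_kernel(K, rank=32)
print(f"norm retained at rank 32: {frac:.3f}")
```

A complementary sketch of the tensorized (factorized) convolutions that the abstract contrasts against this truncation appears after the related-papers list below.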
Related papers
- Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data [24.86314525762012]
We show that a ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy.
Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
arXiv Detail & Related papers (2023-10-03T11:31:37Z)
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and fixed during training.
Our experimental results show that our method achieves similar performance on image classification for balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Examining and Mitigating Kernel Saturation in Convolutional Neural Networks using Negative Images [0.8594140167290097]
We analyze the effect of convolutional kernel saturation in CNNs.
We propose a simple data augmentation technique that mitigates saturation and increases classification accuracy by supplementing the training dataset with negative images.
Our results show that CNNs are indeed susceptible to convolutional kernel saturation and that supplementing the training dataset with negative images offers a statistically significant increase in classification accuracy.
arXiv Detail & Related papers (2021-05-10T06:06:49Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study the robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Stable Low-rank Tensor Decomposition for Compression of Convolutional Neural Network [19.717842489217684]
This paper is the first study on degeneracy in the tensor decomposition of convolutional kernels.
We present a novel method that stabilizes the low-rank approximation of convolutional kernels and ensures efficient compression (a factorized convolution in this spirit is sketched after this list).
We evaluate our approach on popular CNN architectures for image classification and show that our method results in much lower accuracy degradation and provides consistent performance.
arXiv Detail & Related papers (2020-08-12T17:10:12Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
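As referenced above, here is a hedged PyTorch sketch of the kind of tensorized convolution the abstract and the low-rank-decomposition papers refer to: a dense KxK convolution replaced by a rank-R Canonical Polyadic (CP) chain of cheap convolutions (pointwise, then separable depthwise, then pointwise). The class name, layer sizes, and rank are illustrative assumptions, not settings from any of the papers.

```python
# A rank-R CP-factorized stand-in for a dense KxK convolution:
# pointwise -> depthwise Kx1 -> depthwise 1xK -> pointwise.
import torch
import torch.nn as nn

class CPConv2d(nn.Module):
    """Illustrative CP-factorized replacement for nn.Conv2d(in_ch, out_ch, k)."""
    def __init__(self, in_ch: int, out_ch: int, k: int, rank: int):
        super().__init__()
        self.net = nn.Sequential(
            # Mix input channels down to the CP rank.
            nn.Conv2d(in_ch, rank, kernel_size=1, bias=False),
            # One vertical filter per rank channel.
            nn.Conv2d(rank, rank, kernel_size=(k, 1), padding=(k // 2, 0),
                      groups=rank, bias=False),
            # One horizontal filter per rank channel.
            nn.Conv2d(rank, rank, kernel_size=(1, k), padding=(0, k // 2),
                      groups=rank, bias=False),
            # Mix rank channels into the output channels.
            nn.Conv2d(rank, out_ch, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Drop-in check: same output shape as a dense 3x3 convolution.
x = torch.randn(1, 64, 32, 32)
dense = nn.Conv2d(64, 128, 3, padding=1)
cp = CPConv2d(64, 128, k=3, rank=32)
assert dense(x).shape == cp(x).shape
```

With these illustrative sizes the CP chain uses roughly 6.5k parameters versus about 74k for the dense convolution, which is the kind of size reduction that tensorization trades against accuracy.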