Lite it fly: An All-Deformable-Butterfly Network
- URL: http://arxiv.org/abs/2311.08125v1
- Date: Tue, 14 Nov 2023 12:41:22 GMT
- Title: Lite it fly: An All-Deformable-Butterfly Network
- Authors: Rui Lin, Jason Chun Lok Li, Jiajun Zhou, Binxiao Huang, Jie Ran and
Ngai Wong
- Abstract summary: Most deep neural networks (DNNs) consist fundamentally of convolutional and/or fully connected layers.
The recently proposed deformable butterfly (DeBut) decomposes the filter matrix into generalized, butterfly-like factors.
This work reveals an intimate link between DeBut and a systematic hierarchy of depthwise and pointwise convolutions.
- Score: 7.8460795568982435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most deep neural networks (DNNs) consist fundamentally of convolutional
and/or fully connected layers, wherein the linear transform can be cast as the
product between a filter matrix and a data matrix obtained by arranging feature
tensors into columns. The recently proposed deformable butterfly (DeBut)
decomposes the filter matrix into generalized, butterfly-like factors, thus
achieving network compression orthogonal to the traditional approaches of pruning or
low-rank decomposition. This work reveals an intimate link between DeBut and a
systematic hierarchy of depthwise and pointwise convolutions, which explains
the empirically good performance of DeBut layers. By developing an automated
DeBut chain generator, we show for the first time the viability of homogenizing
a DNN into all DeBut layers, thus achieving an extreme sparsity and
compression. Various examples and hardware benchmarks verify the advantages of
All-DeBut networks. In particular, we show it is possible to compress a
PointNet to under 5% of its parameters with less than a 5% accuracy drop, a record
not achievable by other compression schemes.
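As a concrete illustration of the two ideas in the abstract, the NumPy sketch below first takes the im2col view (the convolution becomes a filter matrix times a matrix of unrolled patches) and then swaps the dense filter matrix for a short chain of sparse, block-structured factors with the same outer shape. The factor shapes and the plain block-diagonal structure are illustrative assumptions only; DeBut's generalized butterfly factors and its automated chain generator are more flexible than this sketch.

```python
import numpy as np

def block_diag_factor(num_blocks, block_rows, block_cols, rng):
    """A sparse, block-diagonal factor: only the diagonal blocks hold parameters."""
    factor = np.zeros((num_blocks * block_rows, num_blocks * block_cols))
    for b in range(num_blocks):
        factor[b * block_rows:(b + 1) * block_rows,
               b * block_cols:(b + 1) * block_cols] = rng.standard_normal((block_rows, block_cols))
    return factor

rng = np.random.default_rng(0)

# im2col view of a 3x3 convolution: 64 output channels, 32 input channels.
c_out, c_in, k = 64, 32, 3
dense_filter = rng.standard_normal((c_out, c_in * k * k))       # 64 x 288

# Illustrative butterfly-like chain whose product matches the 64 x 288 filter matrix.
# These shapes are assumptions for this sketch, not the output of DeBut's chain generator.
f1 = block_diag_factor(num_blocks=8,  block_rows=8, block_cols=16, rng=rng)  # 64 x 128
f2 = block_diag_factor(num_blocks=16, block_rows=8, block_cols=18, rng=rng)  # 128 x 288
chain_filter = f1 @ f2                                           # 64 x 288

dense_params = dense_filter.size
chain_params = sum(np.count_nonzero(f) for f in (f1, f2))
print(f"dense: {dense_params} params, chain: {chain_params} params "
      f"({chain_params / dense_params:.1%} of dense)")

# A data matrix of unrolled patches (10 spatial positions), as in the abstract's im2col view.
patches = rng.standard_normal((c_in * k * k, 10))
out_dense = dense_filter @ patches                               # what the original layer computes
out_chain = chain_filter @ patches                               # what the factorized layer computes
print(out_dense.shape, out_chain.shape)                          # both (64, 10)
```

Loosely, a block-structured factor mixes channels only within groups (a grouped/depthwise flavour), while a factor whose blocks span all channels acts like pointwise 1x1 mixing; this is, informally, the depthwise-pointwise hierarchy the abstract refers to.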
Related papers
- Layer-Specific Optimization: Sensitivity Based Convolution Layers Basis Search [0.0]
We propose a new way of applying matrix decomposition to the weights of convolutional layers.
The essence of the method is to train only a subset of the convolutions (the basis convolutions) and to represent the rest as linear combinations of the basis ones (a minimal sketch of this idea appears after this list).
Experiments on models from the ResNet family and the CIFAR-10 dataset demonstrate that basis convolutions can not only reduce the size of the model but also accelerate the forward and backward passes of the network.
arXiv Detail & Related papers (2024-08-12T09:24:48Z)
- ButterflyFlow: Building Invertible Layers with Butterfly Matrices [80.83142511616262]
We propose a new family of invertible linear layers based on butterfly layers.
Based on our invertible butterfly layers, we construct a new class of normalizing flow models called ButterflyFlow.
arXiv Detail & Related papers (2022-09-28T01:58:18Z)
- Dynamic Probabilistic Pruning: A general framework for hardware-constrained pruning at different granularities [80.06422693778141]
We propose a flexible new pruning mechanism that facilitates pruning at different granularities (weights, kernels, filters/feature maps).
We refer to this algorithm as Dynamic Probabilistic Pruning (DPP).
We show that DPP achieves competitive compression rates and classification accuracy when pruning common deep learning models trained on different benchmark datasets for image classification.
arXiv Detail & Related papers (2021-05-26T17:01:52Z)
- ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction [32.489371527159236]
This work attempts to provide a plausible theoretical framework for interpreting modern deep (convolutional) networks from the principles of data compression and discriminative representation.
We show that for high-dimensional multi-class data, the optimal linear discriminative representation maximizes the coding rate difference between the whole dataset and the average of all the subsets (this rate-reduction objective is written out after this list).
We show that the basic iterative gradient ascent scheme for optimizing the rate reduction objective naturally leads to a multi-layer deep network, named ReduNet, that shares common characteristics of modern deep networks.
arXiv Detail & Related papers (2021-05-21T16:29:57Z)
- A Deeper Look into Convolutions via Pruning [9.89901717499058]
Modern architectures contain a very small number of fully-connected layers, often at the end, after multiple layers of convolutions.
Although this strategy already reduces the number of parameters, most of the convolutions can be eliminated as well, without suffering any loss in recognition performance.
In this work, we use eigenvalue-based matrix characteristics, in addition to the classical weight-based importance criterion for pruning, to shed light on the internal mechanisms of a widely used family of CNNs.
arXiv Detail & Related papers (2021-02-04T18:55:03Z)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
Key to the success of vector quantization is deciding which parameter groups should be compressed together.
In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function (a small numerical check of this invariance appears after this list).
We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
- Unfolding Neural Networks for Compressive Multichannel Blind Deconvolution [71.29848468762789]
We propose a learned-structured unfolding neural network for the problem of compressive sparse multichannel blind-deconvolution.
In this problem, each channel's measurements are given as the convolution of a common source signal with a sparse filter.
We demonstrate that our method is superior to classical structured compressive sparse multichannel blind-deconvolution methods in terms of accuracy and speed of sparse filter recovery.
arXiv Detail & Related papers (2020-10-22T02:34:33Z)
- Sparse Linear Networks with a Fixed Butterfly Structure: Theory and Practice [4.3400407844814985]
We propose to replace a dense linear layer in any neural network by an architecture based on the butterfly network.
In a collection of experiments, including supervised prediction on both NLP and vision data, we show that this replacement produces results that match, and at times outperform, existing well-known architectures.
arXiv Detail & Related papers (2020-07-17T09:45:03Z)
- DHP: Differentiable Meta Pruning via HyperNetworks [158.69345612783198]
This paper introduces a differentiable pruning method via hypernetworks for automatic network pruning.
Latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers.
Experiments are conducted on various networks for image classification, single image super-resolution, and denoising.
arXiv Detail & Related papers (2020-03-30T17:59:18Z)
- Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression [145.04742985050808]
We analyze two popular network compression techniques, i.e. filter pruning and low-rank decomposition, in a unified sense.
By changing the way the sparsity regularization is enforced, filter pruning and low-rank decomposition can be derived accordingly.
Our approach proves its potential as it compares favorably to the state-of-the-art on several benchmarks.
arXiv Detail & Related papers (2020-03-19T17:57:26Z)
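To make a few of the entries above more concrete, here are some minimal, hedged sketches. First, the basis-convolution idea from the "Layer-Specific Optimization" entry: only a small set of basis filters (plus mixing coefficients) is trained, and every other filter is reconstructed as a linear combination of that basis. The shapes and the einsum bookkeeping below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Suppose a layer needs 64 filters of shape (in_channels, k, k) = (16, 3, 3).
num_filters, num_basis, filt_shape = 64, 8, (16, 3, 3)

# Trainable parameters: a small set of basis filters plus mixing coefficients.
basis = rng.standard_normal((num_basis, *filt_shape))       # 8 basis convolutions
coeffs = rng.standard_normal((num_filters, num_basis))      # one coefficient row per filter

# Every filter in the layer is a linear combination of the basis filters.
filters = np.einsum("fb,bchw->fchw", coeffs, basis)         # (64, 16, 3, 3)

full_params = filters.size                                  # 64 * 16 * 3 * 3 = 9216
kept_params = basis.size + coeffs.size                      # 8 * 16 * 3 * 3 + 64 * 8 = 1664
print(filters.shape, f"{kept_params}/{full_params} parameters trained")
```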
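Next, the coding-rate quantities behind the ReduNet entry, written out as they commonly appear in the MCR²/ReduNet line of work; the notation is reconstructed from that literature and may differ in minor details from the cited paper. With features $\mathbf{Z} \in \mathbb{R}^{d \times m}$ and diagonal membership matrices $\boldsymbol{\Pi}_1, \dots, \boldsymbol{\Pi}_k$ encoding the class partition:

```latex
% Coding rate of the whole feature set (up to distortion \epsilon)
R(\mathbf{Z}, \epsilon) \;=\; \tfrac{1}{2}\,\log\det\!\Big(\mathbf{I} + \tfrac{d}{m\,\epsilon^{2}}\,\mathbf{Z}\mathbf{Z}^{\top}\Big)

% Average coding rate of the class-wise subsets
R_c(\mathbf{Z}, \epsilon \mid \boldsymbol{\Pi}) \;=\; \sum_{j=1}^{k}
  \frac{\operatorname{tr}(\boldsymbol{\Pi}_j)}{2m}\,
  \log\det\!\Big(\mathbf{I} + \tfrac{d}{\operatorname{tr}(\boldsymbol{\Pi}_j)\,\epsilon^{2}}\,
  \mathbf{Z}\boldsymbol{\Pi}_j\mathbf{Z}^{\top}\Big)

% Rate reduction: the difference that ReduNet's layers ascend on
\Delta R(\mathbf{Z}, \boldsymbol{\Pi}, \epsilon) \;=\; R(\mathbf{Z}, \epsilon) - R_c(\mathbf{Z}, \epsilon \mid \boldsymbol{\Pi})
```

Unrolling gradient ascent on $\Delta R$ is the "iterative gradient ascent scheme" that the summary says yields the multi-layer ReduNet.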
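Finally, a small numerical check of the observation in the "Permute, Quantize, and Fine-tune" entry: permuting the hidden channels of one layer and the next layer's inputs with the same permutation leaves the composed function unchanged (an elementwise nonlinearity such as ReLU commutes with a channel permutation, so it does not break the argument). The layer sizes here are arbitrary and biases are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(2)
relu = lambda x: np.maximum(x, 0.0)

w1 = rng.standard_normal((128, 64))     # layer 1: 64 -> 128
w2 = rng.standard_normal((32, 128))     # layer 2: 128 -> 32
x = rng.standard_normal((64,))

perm = rng.permutation(128)             # reorder the 128 hidden channels
w1_p = w1[perm, :]                      # permute layer 1's output rows
w2_p = w2[:, perm]                      # permute layer 2's input columns the same way

y = w2 @ relu(w1 @ x)
y_p = w2_p @ relu(w1_p @ x)
print(np.allclose(y, y_p))              # True: both networks express the same function
```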