Hybrid Tensor Decomposition in Neural Network Compression
- URL: http://arxiv.org/abs/2006.15938v3
- Date: Mon, 21 Sep 2020 02:14:21 GMT
- Title: Hybrid Tensor Decomposition in Neural Network Compression
- Authors: Bijiao Wu, Dingheng Wang, Guangshe Zhao, Lei Deng and Guoqi Li
- Abstract summary: We introduce the hierarchical Tucker (HT) decomposition method to investigate its capability in neural network compression.
We experimentally discover that the HT format performs better at compressing weight matrices, while the TT format is more suited for compressing convolutional kernels.
- Score: 13.146051056642904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have recently enabled impressive
breakthroughs in various artificial intelligence (AI) applications due to their
capability of learning high-level features from big data. However, the demand
of DNNs for computational resources, especially storage, keeps growing as
increasingly large models are required for ever more complicated applications.
To address this problem, several tensor decomposition methods, including
tensor-train (TT) and tensor-ring (TR), have been applied to compress DNNs and
have shown considerable compression effectiveness. In this work, we introduce
the hierarchical Tucker (HT) decomposition, a classical but rarely used tensor
decomposition method, and investigate its capability in neural network
compression. We convert weight matrices and convolutional kernels to both HT
and TT formats for a comparative study, since the latter is the most widely
used decomposition method and a variant of HT. We further show, both
theoretically and experimentally, that the HT format performs better at
compressing weight matrices, while the TT format is more suited for compressing
convolutional kernels. Based on this observation, we propose a hybrid tensor
decomposition strategy that combines TT and HT to compress the convolutional
and fully connected parts separately, attaining better accuracy on
convolutional neural networks (CNNs) than using the TT or HT format alone. Our
work illuminates the prospects of hybrid tensor decomposition for neural
network compression.
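As a rough illustration of the tensor-train side of this hybrid scheme, the sketch below tensorizes a fully connected weight matrix and factorizes it with a plain TT-SVD. The layer size, mode shape, rank cap, and helper names are illustrative assumptions rather than the authors' implementation; the HT format would instead attach the modes to a binary dimension tree.

```python
# A minimal TT-SVD sketch (an assumption-laden illustration, not the authors'
# code): a 512x512 fully connected weight matrix is reshaped into an 8th-order
# tensor with modes (4, 4, 4, 4, 4, 4, 4, 4) and factorized into TT cores with
# a hard rank cap. The paper's hybrid strategy would instead put the fully
# connected part in HT format and keep TT for convolutional kernels.
import numpy as np

def tt_svd(tensor, max_rank):
    """Sequential truncated SVDs along the modes; returns the list of TT cores."""
    dims = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(r_prev, dims[k], r))         # core G_k
        mat = (np.diag(s[:r]) @ vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))                 # last core
    return cores

W = np.random.randn(512, 512)        # stand-in for a dense FC weight matrix
T = W.reshape([4] * 8)               # tensorize: 4**8 == 512 * 512
cores = tt_svd(T, max_rank=16)
print("dense params:", W.size, "TT params:", sum(c.size for c in cores))
```

With the TT ranks capped at 16, the 262,144 dense parameters shrink to a few thousand core parameters; raising the rank cap trades compression ratio for reconstruction accuracy.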
Related papers
- "Lossless" Compression of Deep Neural Networks: A High-dimensional
Neural Tangent Kernel Approach [49.744093838327615]
We provide a novel compression approach to wide and fully-connected deep neural nets.
Experiments on both synthetic and real-world data are conducted to support the advantages of the proposed compression scheme.
arXiv Detail & Related papers (2024-03-01T03:46:28Z) - Efficient Tensor Robust PCA under Hybrid Model of Tucker and Tensor
Train [33.33426557160802]
We propose an efficient tensor robust principal component analysis (TRPCA) under a hybrid model of Tucker and TT.
Specifically, we reveal in theory that the TT nuclear norm (TTNN) of the original big tensor can be equivalently converted to that of a much smaller tensor via a Tucker compression format.
Numerical experiments on both synthetic and real-world tensor data verify the superiority of the proposed model.
arXiv Detail & Related papers (2021-12-20T01:15:45Z) - Semi-tensor Product-based Tensor Decomposition for Neural Network
Compression [57.95644775091316]
This paper generalizes the classical matrix product-based mode product to a semi-tensor mode product.
Since it permits connecting two factors with different dimensionality, more flexible and compact tensor decompositions can be obtained (a minimal sketch of this product appears after the list below).
arXiv Detail & Related papers (2021-09-30T15:18:14Z) - Compact representations of convolutional neural networks via weight
pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z) - Towards Efficient Tensor Decomposition-Based DNN Model Compression with
Optimization Framework [14.27609385208807]
We propose a systematic framework for tensor decomposition-based model compression using the Alternating Direction Method of Multipliers (ADMM).
Our framework is very general, and it works for both CNNs and RNNs.
Experimental results show that our ADMM-based TT-format models demonstrate very high compression performance with high accuracy.
arXiv Detail & Related papers (2021-07-26T18:31:33Z) - An Efficient Statistical-based Gradient Compression Technique for
Distributed Training Systems [77.88178159830905]
Sparsity-Inducing Distribution-based Compression (SIDCo) is a threshold-based sparsification scheme that enjoys similar threshold estimation quality to deep gradient compression (DGC).
Our evaluation shows SIDCo speeds up training by up to 41.7%, 7.6%, and 1.9% compared to the no-compression baseline, Top-k, and DGC compressors, respectively.
arXiv Detail & Related papers (2021-01-26T13:06:00Z) - A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z) - Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs [11.01184134911405]
Recurrent neural networks (RNNs) are powerful in the tasks oriented to sequential data, such as natural language processing and video recognition.
In this paper, we consider compressing RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition.
arXiv Detail & Related papers (2020-08-21T07:29:45Z) - Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression through a theoretical analysis and practical experiments on NLP tasks.
arXiv Detail & Related papers (2020-06-09T18:25:39Z) - Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor
Decomposition [39.76939368675827]
Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling.
RNNs typically require very large model sizes when processing high-dimensional data.
We propose to develop compact RNN models using Hierarchical Tucker (HT) decomposition.
arXiv Detail & Related papers (2020-05-09T05:15:20Z)
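For the semi-tensor product entry above, the following sketch shows the left semi-tensor product that the decomposition builds on, i.e. how two factors with mismatched inner dimensions can still be multiplied; the shapes and the function name are illustrative assumptions, not that paper's code.

```python
# Left semi-tensor product sketch: pad each factor with a Kronecker identity so
# the inner dimensions match, then take an ordinary matrix product. The
# dimensions below are arbitrary examples.
import numpy as np
from math import lcm

def left_stp(A, B):
    """A (m x n) and B (p x q), even with n != p:
    returns (A kron I_{t/n}) @ (B kron I_{t/p}), where t = lcm(n, p)."""
    n, p = A.shape[1], B.shape[0]
    t = lcm(n, p)
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

A = np.random.randn(2, 3)    # inner dimension n = 3
B = np.random.randn(4, 5)    # inner dimension p = 4; ordinary A @ B is undefined
C = left_stp(A, B)           # t = 12, result shape (2*4, 5*3) = (8, 15)
print(C.shape)
```

Because the Kronecker identities absorb the dimensional mismatch, factor shapes in a decomposition no longer need to divide the original mode sizes exactly, which is where the extra flexibility and compactness claimed in that entry come from.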
This list is automatically generated from the titles and abstracts of the papers on this site.