Hybrid Tensor Decomposition in Neural Network Compression
- URL: http://arxiv.org/abs/2006.15938v3
- Date: Mon, 21 Sep 2020 02:14:21 GMT
- Title: Hybrid Tensor Decomposition in Neural Network Compression
- Authors: Bijiao Wu, Dingheng Wang, Guangshe Zhao, Lei Deng and Guoqi Li
- Abstract summary: We introduce the hierarchical Tucker (HT) decomposition method to investigate its capability in neural network compression.
We experimentally discover that the HT format performs better at compressing weight matrices, while the TT format is more suited for compressing convolutional kernels.
- Score: 13.146051056642904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have recently enabled impressive
breakthroughs in various artificial intelligence (AI) applications due to their
capability of learning high-level features from big data. However, the demand
of DNNs for computational resources, especially storage, keeps growing as
increasingly large models are required for ever more complicated applications.
To address this problem, several tensor decomposition methods, including
tensor-train (TT) and tensor-ring (TR), have been applied to compress DNNs and
have shown considerable compression effectiveness. In this work, we introduce
the hierarchical Tucker (HT) decomposition, a classical but rarely used tensor
decomposition method, and investigate its capability in neural network
compression. We convert weight matrices and convolutional kernels to both HT
and TT formats for a comparative study, since the latter is the most widely
used decomposition method and a variant of HT. We further show, both
theoretically and experimentally, that the HT format performs better at
compressing weight matrices, while the TT format is more suited for compressing
convolutional kernels. Based on this observation, we propose a hybrid tensor
decomposition strategy that combines TT and HT to compress the convolutional
and fully connected parts separately, attaining better accuracy on
convolutional neural networks (CNNs) than using the TT or HT format alone. Our
work illuminates the prospects of hybrid tensor decomposition for neural
network compression.
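As a rough illustration of the tensor-train side of this hybrid scheme, the sketch below tensorizes a fully connected weight matrix and factorizes it with a plain TT-SVD. The layer size, mode shape, rank cap, and helper names are illustrative assumptions rather than the authors' implementation; the HT format would instead attach the modes to a binary dimension tree.

```python
# A minimal TT-SVD sketch (an assumption-laden illustration, not the authors'
# code): a 512x512 fully connected weight matrix is reshaped into an 8th-order
# tensor with modes (4, 4, 4, 4, 4, 4, 4, 4) and factorized into TT cores with
# a hard rank cap. The paper's hybrid strategy would instead put the fully
# connected part in HT format and keep TT for convolutional kernels.
import numpy as np

def tt_svd(tensor, max_rank):
    """Sequential truncated SVDs along the modes; returns the list of TT cores."""
    dims = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(r_prev, dims[k], r))         # core G_k
        mat = (np.diag(s[:r]) @ vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))                 # last core
    return cores

W = np.random.randn(512, 512)        # stand-in for a dense FC weight matrix
T = W.reshape([4] * 8)               # tensorize: 4**8 == 512 * 512
cores = tt_svd(T, max_rank=16)
print("dense params:", W.size, "TT params:", sum(c.size for c in cores))
```

With the TT ranks capped at 16, the 262,144 dense parameters shrink to a few thousand core parameters; raising the rank cap trades compression ratio for reconstruction accuracy.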
Related papers
- "Lossless" Compression of Deep Neural Networks: A High-dimensional
Neural Tangent Kernel Approach [49.744093838327615]
We provide a novel compression approach to wide and fully-connected deep neural nets.
Experiments on both synthetic and real-world data are conducted to support the advantages of the proposed compression scheme.
arXiv Detail & Related papers (2024-03-01T03:46:28Z) - Efficient Tensor Robust PCA under Hybrid Model of Tucker and Tensor
Train [33.33426557160802]
We propose an efficient tensor robust principal component analysis (TRPCA) under a hybrid model of Tucker and TT.
Specifically, we reveal in theory that the TT nuclear norm (TTNN) of the original big tensor can be equivalently converted to that of a much smaller tensor via a Tucker compression format.
Numerical experiments on both synthetic and real-world tensor data verify the superiority of the proposed model.
arXiv Detail & Related papers (2021-12-20T01:15:45Z) - Semi-tensor Product-based Tensor Decomposition for Neural Network
Compression [57.95644775091316]
This paper generalizes the classical matrix product-based mode product to a semi-tensor mode product.
Since it permits connecting two factors with different dimensionality, more flexible and compact tensor decompositions can be obtained (a minimal sketch of this product appears after the list below).
arXiv Detail & Related papers (2021-09-30T15:18:14Z) - Compact representations of convolutional neural networks via weight
pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z) - Towards Efficient Tensor Decomposition-Based DNN Model Compression with
Optimization Framework [14.27609385208807]
We propose a systematic framework for tensor decomposition-based model compression using the Alternating Direction Method of Multipliers (ADMM).
Our framework is very general, and it works for both CNNs and RNNs.
Experimental results show that our ADMM-based TT-format models demonstrate very high compression performance with high accuracy.
arXiv Detail & Related papers (2021-07-26T18:31:33Z) - An Efficient Statistical-based Gradient Compression Technique for
Distributed Training Systems [77.88178159830905]
Sparsity-Inducing Distribution-based Compression (SIDCo) is a threshold-based sparsification scheme that enjoys similar threshold estimation quality to deep gradient compression (DGC).
Our evaluation shows SIDCo speeds up training by up to 41.7%, 7.6%, and 1.9% compared to the no-compression baseline, Top-k, and DGC compressors, respectively.
arXiv Detail & Related papers (2021-01-26T13:06:00Z) - A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z) - Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs [11.01184134911405]
Recurrent neural networks (RNNs) are powerful in the tasks oriented to sequential data, such as natural language processing and video recognition.
In this paper, we consider compressing RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition.
arXiv Detail & Related papers (2020-08-21T07:29:45Z) - Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression through a theoretical analysis and practical experiments on NLP tasks.
arXiv Detail & Related papers (2020-06-09T18:25:39Z) - Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor
Decomposition [39.76939368675827]
Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling.
RNNs typically require very large model sizes when processing high-dimensional data.
We propose to develop compact RNN models using Hierarchical Tucker (HT) decomposition.
arXiv Detail & Related papers (2020-05-09T05:15:20Z)
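For the semi-tensor product entry above, the following sketch shows the left semi-tensor product that the decomposition builds on, i.e. how two factors with mismatched inner dimensions can still be multiplied; the shapes and the function name are illustrative assumptions, not that paper's code.

```python
# Left semi-tensor product sketch: pad each factor with a Kronecker identity so
# the inner dimensions match, then take an ordinary matrix product. The
# dimensions below are arbitrary examples.
import numpy as np
from math import lcm

def left_stp(A, B):
    """A (m x n) and B (p x q), even with n != p:
    returns (A kron I_{t/n}) @ (B kron I_{t/p}), where t = lcm(n, p)."""
    n, p = A.shape[1], B.shape[0]
    t = lcm(n, p)
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

A = np.random.randn(2, 3)    # inner dimension n = 3
B = np.random.randn(4, 5)    # inner dimension p = 4; ordinary A @ B is undefined
C = left_stp(A, B)           # t = 12, result shape (2*4, 5*3) = (8, 15)
print(C.shape)
```

Because the Kronecker identities absorb the dimensional mismatch, factor shapes in a decomposition no longer need to divide the original mode sizes exactly, which is where the extra flexibility and compactness claimed in that entry come from.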
This list is automatically generated from the titles and abstracts of the papers on this site.