Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs
- URL: http://arxiv.org/abs/2008.09342v2
- Date: Fri, 24 Sep 2021 12:19:16 GMT
- Title: Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs
- Authors: Dingheng Wang and Bijiao Wu and Guangshe Zhao and Man Yao and Hengnu
Chen and Lei Deng and Tianyi Yan and Guoqi Li
- Abstract summary: Recurrent neural networks (RNNs) are powerful for tasks involving sequential data, such as natural language processing and video recognition.
In this paper, we consider compressing RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition.
- Score: 11.01184134911405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks (RNNs) are powerful for tasks involving
sequential data, such as natural language processing and video recognition.
However, since modern RNNs, including long short-term memory (LSTM) and
gated recurrent unit (GRU) networks, have complex topologies and expensive
space/computation complexity, compressing them has become a hot and promising
topic in recent years. Among the many compression methods, tensor
decomposition, e.g., tensor train (TT), block term (BT), tensor ring (TR) and
hierarchical Tucker (HT), appears to be the most promising approach, since it
can achieve very high compression ratios. Nevertheless, none of these tensor
decomposition formats provides both space and computation efficiency. In
this paper, we compress RNNs based on a novel Kronecker
CANDECOMP/PARAFAC (KCP) decomposition, which is derived from Kronecker tensor
(KT) decomposition, and propose two fast algorithms for multiplying the input
by the tensor-decomposed weight. Experiments on the UCF11, YouTube
Celebrities Face and UCF50 datasets verify that the proposed KCP-RNNs achieve
accuracy comparable to RNNs in other tensor-decomposed formats, while
compression ratios of up to 278,219x can be obtained with low-rank KCP. More
importantly, under similar ranks, KCP-RNNs are efficient in both space and
computation complexity compared with their tensor-decomposed counterparts.
In addition, we find that KCP has the best potential for parallel computing
to accelerate the calculations in neural networks.
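
As a rough illustration of the idea (not the authors' implementation), the following NumPy sketch multiplies a vector by a weight stored as a sum of Kronecker products without ever materializing the full matrix; all shapes, names, and the rank R = 3 below are illustrative assumptions:

import numpy as np

def kcp_matvec(factors, x):
    # factors[r][k] is the k-th Kronecker factor of the r-th term,
    # with shape (m_k, n_k); the represented weight is
    #   W = sum_r A_1^(r) kron A_2^(r) kron ... kron A_K^(r)
    # of shape (prod_k m_k, prod_k n_k). Illustrative sketch only.
    n_dims = [A.shape[1] for A in factors[0]]
    m_dims = [A.shape[0] for A in factors[0]]
    y = np.zeros(int(np.prod(m_dims)))
    for term in factors:                 # one rank-1 Kronecker term
        T = x.reshape(n_dims)            # view x as an order-K tensor
        for k, A in enumerate(term):     # mode-k product with factor A_k
            T = np.moveaxis(np.tensordot(A, T, axes=(1, k)), 0, k)
        y += T.reshape(-1)               # accumulate this term's output
    return y

# sanity check against the explicitly built matrix (small sizes only)
rng = np.random.default_rng(0)
factors = [[rng.standard_normal((2, 3)), rng.standard_normal((4, 5))]
           for _ in range(3)]            # R = 3 terms, K = 2 factors each
W = sum(np.kron(a, b) for a, b in factors)   # dense 8 x 15 weight
x = rng.standard_normal(15)
assert np.allclose(kcp_matvec(factors, x), W @ x)

Storing the factors costs only R * sum_k m_k*n_k numbers instead of prod_k m_k * prod_k n_k for the dense weight, which is how compression ratios as large as the reported 278,219x become possible.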
Related papers
- "Lossless" Compression of Deep Neural Networks: A High-dimensional
Neural Tangent Kernel Approach [49.744093838327615]
We provide a novel compression approach for wide and fully-connected deep neural nets.
Experiments on both synthetic and real-world data are conducted to support the advantages of the proposed compression scheme.
arXiv Detail & Related papers (2024-03-01T03:46:28Z)
- Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores [47.87810316745786]
We propose a compression-based tensor decomposition framework, namely the exascale-tensor, to support exascale tensor decomposition.
Compared to the baselines, the exascale-tensor supports up to 8,000x larger tensors and achieves speedups of up to 6.95x.
We also apply our method to two real-world applications, including gene analysis and tensor layer neural networks.
arXiv Detail & Related papers (2023-11-22T21:04:59Z)
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy across a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
- Latent Matrices for Tensor Network Decomposition and to Tensor Completion [8.301418317685906]
We propose a novel higher-order tensor decomposition model that decomposes the tensor into smaller ones and speeds up the computation of the algorithm.
Three optimization algorithms, LMTN-PAM, LMTN-SVD and LMTN-AR, have been developed and applied to the tensor-completion task.
Experimental results show that our LMTN-SVD algorithm is 3-6 times faster than the FCTN-PAM algorithm, with only a 1.8-point drop in accuracy.
arXiv Detail & Related papers (2022-10-07T08:19:50Z)
- How to Train Unstable Looped Tensor Network [21.882898731132443]
A growing problem in the compression of deep neural networks is how to reduce the number of parameters in convolutional kernels.
We propose novel methods that stabilize the decomposition results, keep the network robust, and attain better approximation.
arXiv Detail & Related papers (2022-03-05T00:17:04Z)
- Semi-tensor Product-based Tensor Decomposition for Neural Network Compression [57.95644775091316]
This paper generalizes the classical matrix-product-based mode product to a semi-tensor mode product.
Since it permits connecting two factors of different dimensionality, more flexible and compact tensor decompositions can be obtained (see the semi-tensor product sketch after this list).
arXiv Detail & Related papers (2021-09-30T15:18:14Z)
- Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in deep neural networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments.
In this paper, we are the first to study training an N:M fine-grained structured sparse network from scratch.
arXiv Detail & Related papers (2021-02-08T05:55:47Z)
- Hybrid Tensor Decomposition in Neural Network Compression [13.146051056642904]
We introduce the hierarchical Tucker (HT) decomposition method to investigate its capability in neural network compression.
We experimentally find that the HT format performs better at compressing weight matrices, while the TT format is better suited to compressing convolutional kernels.
arXiv Detail & Related papers (2020-06-29T11:16:22Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains offer more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
Through theoretical analysis and practical experiments on NLP tasks, we show that MPS tensor trains should be at the forefront of LSTM network compression.
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
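
For the last entry above, the storage argument for MPS tensor trains can be seen from the format itself: an order-K tensor (e.g., an LSTM weight matrix reshaped to K dimensions) is stored as K small cores chained by low-rank bond indices. A minimal NumPy sketch with illustrative shapes and ranks, not the cited paper's implementation:

import numpy as np

def tt_reconstruct(cores):
    # cores[k] has shape (r_{k-1}, d_k, r_k) with r_0 = r_K = 1;
    # only sum_k r_{k-1}*d_k*r_k numbers are stored instead of prod_k d_k.
    t = cores[0]                                 # shape (1, d_1, r_1)
    for core in cores[1:]:
        t = np.tensordot(t, core, axes=(-1, 0))  # contract the bond index
    return t.squeeze(axis=(0, -1))               # drop the boundary ranks

rng = np.random.default_rng(0)
cores = [rng.standard_normal(s)
         for s in [(1, 8, 4), (4, 8, 4), (4, 16, 1)]]   # d = (8, 8, 16)
full = tt_reconstruct(cores)           # 8*8*16 = 1024 entries reconstructed
stored = sum(c.size for c in cores)    # 32 + 128 + 64 = 224 stored numbers

An MPO (TT-matrix) variant would give each core two mode indices; the entry's claim is that the plain MPS form is cheaper in both storage and inference time.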
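
And for the semi-tensor product entry earlier in the list (the sketch promised there): the standard left semi-tensor product multiplies matrices whose inner dimensions need not match by padding each factor with a Kronecker identity. This is a sketch of that product alone, not the cited paper's decomposition:

import numpy as np
from math import lcm

def stp(A, B):
    # Left semi-tensor product of A (m x n) and B (p x q):
    # with t = lcm(n, p), it is (A kron I_{t/n}) @ (B kron I_{t/p}),
    # and it reduces to the ordinary A @ B when n == p.
    n, p = A.shape[1], B.shape[0]
    t = lcm(n, p)
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

A = np.arange(6.0).reshape(2, 3)    # inner dimensions 3 and 6 differ
B = np.arange(24.0).reshape(6, 4)
C = stp(A, B)                       # t = 6; result has shape (4, 4)

Because the identity blocks absorb the dimension mismatch, factors of different dimensionality can be chained, which is what makes the resulting decompositions more flexible and compact than those built on the classical mode product.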
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.