Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor
Decomposition
- URL: http://arxiv.org/abs/2005.04366v1
- Date: Sat, 9 May 2020 05:15:20 GMT
- Title: Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor
Decomposition
- Authors: Miao Yin, Siyu Liao, Xiao-Yang Liu, Xiaodong Wang, Bo Yuan
- Abstract summary: Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling.
RNNs typically require very large model sizes when processing high-dimensional data.
We propose to develop compact RNN models using Hierarchical Tucker (HT) decomposition.
- Score: 39.76939368675827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent Neural Networks (RNNs) have been widely used in sequence analysis
and modeling. However, when processing high-dimensional data, RNNs typically
require very large model sizes, thereby bringing a series of deployment
challenges. Although the state-of-the-art tensor decomposition approaches can
provide good model compression performance, these existing methods still suffer
from inherent limitations, such as restricted representation
capability and insufficient model complexity reduction. To overcome these
limitations, in this paper we propose to develop compact RNN models using
Hierarchical Tucker (HT) decomposition. HT decomposition brings strong
hierarchical structure to the decomposed RNN models, which is important for
enhancing their representation capability. Meanwhile, HT
decomposition provides higher storage and computational cost reduction than the
existing tensor decomposition approaches for RNN compression. Our experimental
results show that, compared with the state-of-the-art compressed RNN models,
such as TT-LSTM, TR-LSTM and BT-LSTM, our proposed HT-based LSTM (HT-LSTM)
consistently achieves simultaneous and significant increases in both
compression ratio and test accuracy on different datasets.
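To make the parameter savings concrete, below is a minimal numpy sketch (not the authors' implementation) of a tensorized weight stored in Hierarchical Tucker format with the simplest dimension tree {{1,2},{3,4}}; all shapes and ranks are illustrative assumptions rather than values from the paper.

```python
import numpy as np

# Minimal sketch of a weight tensor stored in Hierarchical Tucker (HT) format
# with the dimension tree {{1,2},{3,4}}. All shapes/ranks below are assumptions
# chosen only to illustrate the parameter count, not values from the paper.

n = (4, 8, 8, 4)         # tensorized weight dimensions; 4*8*8*4 = 1024 dense entries
r_leaf = (3, 3, 3, 3)    # ranks attached to the leaf nodes
r12, r34 = 4, 4          # ranks attached to the internal nodes {1,2} and {3,4}

rng = np.random.default_rng(0)
U = [rng.standard_normal((n[i], r_leaf[i])) for i in range(4)]  # leaf frames
B12 = rng.standard_normal((r_leaf[0], r_leaf[1], r12))          # transfer tensor, node {1,2}
B34 = rng.standard_normal((r_leaf[2], r_leaf[3], r34))          # transfer tensor, node {3,4}
B_root = rng.standard_normal((r12, r34))                        # root transfer matrix

def ht_to_full(U, B12, B34, B_root):
    """Contract the HT factors back into the full 4-way weight tensor."""
    U12 = np.einsum('ia,jb,abc->ijc', U[0], U[1], B12)  # node {1,2}: (n1, n2, r12)
    U34 = np.einsum('ka,lb,abd->kld', U[2], U[3], B34)  # node {3,4}: (n3, n4, r34)
    return np.einsum('ijc,kld,cd->ijkl', U12, U34, B_root)

W_full = ht_to_full(U, B12, B34, B_root)        # shape (4, 8, 8, 4)
dense_params = int(np.prod(n))                  # 1024
ht_params = (sum(ni * ri for ni, ri in zip(n, r_leaf))
             + B12.size + B34.size + B_root.size)
print(dense_params, ht_params)                  # 1024 dense entries vs 160 stored parameters
```

In an HT-compressed LSTM layer the full tensor would not be materialized as it is here; the factors are contracted directly with the tensorized input, which is where the computational-cost reduction claimed in the abstract comes from.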
Related papers
- STN: Scalable Tensorizing Networks via Structure-Aware Training and
Adaptive Compression [10.067082377396586]
We propose Scalable Tensorizing Networks (STN), which adaptively adjust model size and decomposition structure without retraining.
STN is compatible with arbitrary network architectures and achieves higher compression performance and flexibility than other tensorizing approaches.
arXiv Detail & Related papers (2022-05-30T15:50:48Z)
- Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing [74.31472195046099]
We exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN.
A hybrid model combining LR-TT-DNN with a convolutional neural network (CNN) is set up to boost the performance.
Our empirical evidence demonstrates that the LR-TT-DNN and CNN+(LR-TT-DNN) models with fewer model parameters can outperform the TT-DNN and CNN+(TT-DNN) counterparts.
arXiv Detail & Related papers (2022-03-11T15:55:34Z)
- Towards Extremely Compact RNNs for Video Recognition with Fully Decomposed Hierarchical Tucker Structure [41.41516453160845]
We propose to develop extremely compact RNN models with fully decomposed hierarchical Tucker (FDHT) structure.
Our experimental results on several popular video recognition datasets show that our proposed fully decomposed Hierarchical Tucker-based LSTM is extremely compact and highly efficient.
arXiv Detail & Related papers (2021-04-12T18:40:44Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations while the lower-frequency part is assigned cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Recurrent Graph Tensor Networks: A Low-Complexity Framework for Modelling High-Dimensional Multi-Way Sequence [24.594587557319837]
We develop a graph filter framework for approximating the modelling of hidden states in Recurrent Neural Networks (RNNs).
The proposed framework is validated through several multi-way sequence modelling tasks and benchmarked against traditional RNNs.
We show that the proposed RGTN is capable of not only out-performing standard RNNs, but also mitigating the Curse of Dimensionality associated with traditional RNNs.
arXiv Detail & Related papers (2020-09-18T10:13:36Z)
- Accurate and Lightweight Image Super-Resolution with Model-Guided Deep Unfolding Network [63.69237156340457]
We present and advocate an explainable approach toward SISR named model-guided deep unfolding network (MoG-DUN).
MoG-DUN is accurate (producing fewer aliasing artifacts), computationally efficient (with reduced model parameters), and versatile (capable of handling multiple degradations).
The superiority of the proposed MoG-DUN method over existing state-of-the-art image methods including RCAN, SRDNF, and SRFBN is substantiated by extensive experiments on several popular datasets and various degradation scenarios.
arXiv Detail & Related papers (2020-09-14T08:23:37Z)
- Lightweight image super-resolution with enhanced CNN [82.36883027158308]
Deep convolutional neural networks (CNNs) with strong expressive ability have achieved impressive performance on single image super-resolution (SISR).
We propose a lightweight enhanced SR CNN (LESRCNN) with three successive sub-blocks: an information extraction and enhancement block (IEEB), a reconstruction block (RB), and an information refinement block (IRB).
IEEB extracts hierarchical low-resolution (LR) features and aggregates them step by step to preserve shallow-layer information in the deeper layers for SISR.
RB converts low-frequency features into high-frequency features by fusing global and local features.
arXiv Detail & Related papers (2020-07-08T18:03:40Z)
- Hybrid Tensor Decomposition in Neural Network Compression [13.146051056642904]
We introduce the hierarchical Tucker (HT) decomposition method to investigate its capability in neural network compression.
We experimentally discover that the HT format has better performance on compressing weight matrices, while the TT format is more suited for compressing convolutional kernels.
arXiv Detail & Related papers (2020-06-29T11:16:22Z)
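Since the entry above weighs the HT format against the Tensor-Train (TT) format used by TT-LSTM and TT-DNN, here is an equally minimal TT-format sketch for comparison with the HT sketch earlier; the shapes and ranks are again illustrative assumptions, not values from any of the listed papers.

```python
import numpy as np

# Minimal sketch of the same 1024-entry tensorized weight stored in
# Tensor-Train (TT) format. Shapes/ranks are illustrative assumptions.

n = (4, 8, 8, 4)          # tensorized weight dimensions, 1024 dense entries
r = (1, 3, 3, 3, 1)       # TT ranks; the boundary ranks are fixed to 1

rng = np.random.default_rng(0)
cores = [rng.standard_normal((r[k], n[k], r[k + 1])) for k in range(4)]

def tt_to_full(cores):
    """Contract the TT cores left to right into the full tensor."""
    full = cores[0]                                    # (1, n1, r1)
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))                  # drop the size-1 boundary ranks

W_full = tt_to_full(cores)                             # shape (4, 8, 8, 4)
dense_params = int(np.prod(n))                         # 1024
tt_params = sum(core.size for core in cores)           # 12 + 72 + 72 + 12 = 168
print(dense_params, tt_params)
```

With comparable ranks the two formats store similar parameter counts on a toy example like this; the entries above differ on when each pays off, with the hybrid-decomposition paper reporting the HT format as better suited to weight matrices and the TT format to convolutional kernels.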