Block-term Tensor Neural Networks
- URL: http://arxiv.org/abs/2010.04963v2
- Date: Fri, 18 Dec 2020 06:47:53 GMT
- Title: Block-term Tensor Neural Networks
- Authors: Jinmian Ye, Guangxi Li, Di Chen, Haiqin Yang, Shandian Zhe, and Zenglin Xu
- Abstract summary: We show that block-term tensor layers (BT-layers) can be easily adapted to neural network models, such as CNNs and RNNs.
BT-layers in CNNs and RNNs can achieve a very large compression ratio on the number of parameters while preserving or improving the representation power of the original DNNs.
- Score: 29.442026567710435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have achieved outstanding performance in a wide
range of applications, e.g., image classification and natural language processing.
Despite this performance, the huge number of parameters in DNNs makes it
challenging to train them efficiently and to deploy them on low-end
devices with limited computing resources. In this paper, we explore the
correlations in the weight matrices, and approximate the weight matrices with
low-rank block-term tensors. We call the resulting structure
block-term tensor layers (BT-layers), which can be easily adapted to neural
network models, such as CNNs and RNNs. In particular, the inputs and the
outputs in BT-layers are reshaped into low-dimensional high-order tensors with
a similar or improved representation power. Extensive experiments
demonstrate that BT-layers in CNNs and RNNs can achieve a very large
compression ratio on the number of parameters while preserving or improving the
representation power of the original DNNs.
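As a rough illustration of the idea in the abstract, the sketch below builds a weight matrix as a sum of Tucker terms (a block-term form) over reshaped input and output dimensions, and compares its parameter count with the dense matrix. It is not the authors' implementation: the function names, the tensor shapes, the fixed order of three, and the parameter-count formula are assumptions chosen for illustration.

```python
import numpy as np


def bt_weight(cores, factors):
    """Rebuild a dense (I, J) weight matrix from block-term parts.

    cores:   list of N arrays, each of shape (R1, R2, R3)
    factors: list of N triples; the k-th factor has shape (I_k, J_k, R_k)
    Assumed layout: I = I1*I2*I3 input units, J = J1*J2*J3 output units.
    """
    total = None
    for core, (a1, a2, a3) in zip(cores, factors):
        # One Tucker term: contract the core with one factor per mode.
        block = np.einsum("abc,ipa,jqb,krc->ijkpqr", core, a1, a2, a3)
        total = block if total is None else total + block
    i1, i2, i3, j1, j2, j3 = total.shape
    return total.reshape(i1 * i2 * i3, j1 * j2 * j3)


def bt_forward(x, cores, factors, in_dims):
    """Apply the layer without materializing the dense weight matrix."""
    batch = x.shape[0]
    xt = x.reshape(batch, *in_dims)  # input as a high-order tensor
    out = None
    for core, (a1, a2, a3) in zip(cores, factors):
        y = np.einsum("zijk,abc,ipa,jqb,krc->zpqr", xt, core, a1, a2, a3)
        out = y if out is None else out + y
    return out.reshape(batch, -1)


def bt_param_count(in_dims, out_dims, ranks, num_blocks):
    """Parameters stored by the block-term form (factors plus cores)."""
    factor_params = sum(i * j * r for i, j, r in zip(in_dims, out_dims, ranks))
    core_params = int(np.prod(ranks))
    return num_blocks * (factor_params + core_params)


# Hypothetical sizes: 512 inputs as 8x8x8, 64 outputs as 4x4x4.
in_dims, out_dims, ranks, num_blocks = (8, 8, 8), (4, 4, 4), (2, 2, 2), 2
rng = np.random.default_rng(0)
cores = [rng.standard_normal(ranks) for _ in range(num_blocks)]
factors = [
    tuple(rng.standard_normal((i, j, r))
          for i, j, r in zip(in_dims, out_dims, ranks))
    for _ in range(num_blocks)
]

W = bt_weight(cores, factors)
x = rng.standard_normal((5, int(np.prod(in_dims))))
dense_params = W.size
bt_params = bt_param_count(in_dims, out_dims, ranks, num_blocks)
print("dense parameters:", dense_params)
print("block-term parameters:", bt_params)
print("compression ratio:", dense_params / bt_params)
```

The `bt_forward` function shows why the reshaping matters: the input is treated as a high-order tensor and contracted with the small factors directly, so the full weight matrix never needs to be stored. The compression ratio obtained here depends entirely on the hypothetical sizes and ranks chosen above.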
Related papers
- Variational Tensor Neural Networks for Deep Learning [0.0]
We propose an integration of tensor networks (TN) into deep neural networks (NNs).
This, in turn, results in a scalable tensor neural network (TNN) architecture capable of efficient training over a large parameter space.
We validate the accuracy and efficiency of our method by designing TNN models and providing benchmark results for linear and non-linear regressions, data classification and image recognition on MNIST handwritten digits.
arXiv Detail & Related papers (2022-11-26T20:24:36Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Sparsifying Binary Networks [3.8350038566047426]
Binary neural networks (BNNs) have demonstrated their ability to solve complex tasks with accuracy comparable to that of full-precision deep neural networks (DNNs).
Despite the recent improvements, they suffer from a fixed and limited compression factor that may prove insufficient for certain devices with very limited resources.
We propose sparse binary neural networks (SBNNs), a novel model and training scheme which introduces sparsity in BNNs and a new quantization function for binarizing the network's weights.
arXiv Detail & Related papers (2022-07-11T15:54:41Z)
- Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing [74.31472195046099]
We exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN.
A hybrid model combining LR-TT-DNN with a convolutional neural network (CNN) is set up to boost the performance.
Our empirical evidence demonstrates that the LR-TT-DNN and CNN+(LR-TT-DNN) models with fewer model parameters can outperform the TT-DNN and CNN+(TT-DNN) counterparts.
arXiv Detail & Related papers (2022-03-11T15:55:34Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- An Alternative Practice of Tropical Convolution to Traditional Convolutional Neural Networks [0.5837881923712392]
We propose a new type of CNN called Tropical Convolutional Neural Networks (TCNNs).
TCNNs are built on tropical convolutions in which the multiplications and additions in conventional convolutional layers are replaced by additions and min/max operations respectively.
We show that TCNNs can achieve higher expressive power than ordinary convolutional layers on the MNIST and CIFAR10 image datasets.
arXiv Detail & Related papers (2021-03-03T00:13:30Z)
- Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than MPOs, in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression through a theoretical analysis and practical experiments on an NLP task.
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.