How to Train Unstable Looped Tensor Network
- URL: http://arxiv.org/abs/2203.02617v1
- Date: Sat, 5 Mar 2022 00:17:04 GMT
- Title: How to Train Unstable Looped Tensor Network
- Authors: Anh-Huy Phan, Konstantin Sobolev, Dmitry Ermilov, Igor Vorona, Nikolay
Kozyrskiy, Petr Tichavsky and Andrzej Cichocki
- Abstract summary: A rising problem in the compression of Deep Neural Networks is how to reduce the number of parameters in convolutional kernels.
We propose novel methods to stabilize the decomposition results, keep the network robust, and attain a better approximation.
- Score: 21.882898731132443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A rising problem in the compression of Deep Neural Networks is how to reduce
the number of parameters in convolutional kernels and the complexity of these
layers by low-rank tensor approximation. Canonical polyadic tensor
decomposition (CPD) and Tucker tensor decomposition (TKD) are two solutions to
this problem and provide promising results. However, CPD often fails due to
degeneracy, making the networks unstable and hard to fine-tune. TKD does not
provide much compression if the core tensor is big. This motivates using a
hybrid model of CPD and TKD, a decomposition with multiple Tucker models with
small core tensors, known as block term decomposition (BTD). This paper proposes
a more compact model that further compresses the BTD by constraining its core
tensors to be identical. We establish a link between the BTD with shared
parameters and a looped chain tensor network (TC). Unfortunately, such strongly
constrained tensor networks (with loops) suffer from severe numerical
instability, as proved by (Landsberg, 2012) and (Handschuh, 2015a). We study
the perturbation of chain tensor networks, provide an interpretation of the
instability in TC, and demonstrate the problem. We propose novel methods to
stabilize the decomposition results, keep the network robust, and attain a
better approximation. Experimental results confirm the superiority of the
proposed methods for the compression of well-known CNNs and for TC
decomposition under challenging scenarios.
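To make the looped chain (TC) structure referenced above concrete, below is a minimal NumPy sketch, not the authors' implementation, of reconstructing a 4-way convolutional kernel from four 3-way cores connected in a closed loop; the kernel dimensions, the TC ranks, and the random cores are illustrative assumptions only.

```python
import numpy as np

# Illustrative sizes (assumptions): C_out = 64, C_in = 64, a 3x3 spatial kernel,
# and TC ranks r1..r4 = (8, 8, 4, 4). None of these values come from the paper.
dims = (64, 64, 3, 3)
ranks = (8, 8, 4, 4)
rng = np.random.default_rng(0)

# One 3-way core per kernel mode, with shape (r_k, n_k, r_{k+1}) and r_5 = r_1,
# so the chain of cores closes into a loop.
G1, G2, G3, G4 = (rng.standard_normal((ranks[k], dims[k], ranks[(k + 1) % 4]))
                  for k in range(4))

# Contract the looped chain: the shared rank indices a, b, c, d are summed, and
# the last core feeds its output rank back into the first core (the loop).
kernel = np.einsum('aib,bjc,ckd,dla->ijkl', G1, G2, G3, G4)
print(kernel.shape)  # (64, 64, 3, 3)

# Compare the parameter count of the TC format with that of the dense kernel.
tc_params = G1.size + G2.size + G3.size + G4.size
dense_params = int(np.prod(dims))
print(tc_params, dense_params)  # 6288 vs. 36864 for these assumed sizes
```

The einsum closes the loop through the shared index a; it is this cycle that distinguishes a looped tensor chain from an open tensor train and, per the abstract, is also the source of the numerical instability the paper addresses.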
Related papers
- "Lossless" Compression of Deep Neural Networks: A High-dimensional
Neural Tangent Kernel Approach [49.744093838327615]
We provide a novel compression approach to wide and fully-connected deep neural nets.
Experiments on both synthetic and real-world data are conducted to support the advantages of the proposed compression scheme.
arXiv Detail & Related papers (2024-03-01T03:46:28Z) - Activations and Gradients Compression for Model-Parallel Training [85.99744701008802]
We study how simultaneous compression of activations and gradients in model-parallel distributed training setup affects convergence.
We find that gradients require milder compression rates than activations.
Experiments also show that models trained with TopK perform well only when compression is also applied during inference.
arXiv Detail & Related papers (2024-01-15T15:54:54Z) - Error Analysis of Tensor-Train Cross Approximation [88.83467216606778]
We provide accuracy guarantees in terms of the entire tensor for both exact and noisy measurements.
Results are verified by numerical experiments, and may have important implications for the usefulness of cross approximations for high-order tensors.
arXiv Detail & Related papers (2022-07-09T19:33:59Z) - Truncated tensor Schatten p-norm based approach for spatiotemporal
traffic data imputation with complicated missing patterns [77.34726150561087]
We introduce four complicated missing patterns, including random missing and three fiber-like missing cases according to the mode-driven fibers.
Despite the nonconvexity of the objective function in our model, we derive the optimal solutions by integrating the alternating direction method of multipliers (ADMM).
arXiv Detail & Related papers (2022-05-19T08:37:56Z) - Tensor-Train Networks for Learning Predictive Modeling of
Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks with the aim of obtaining a powerful, compact representation.
An algorithm based on alternating least squares has been proposed for approximating the weights in TT-format with reduced computational cost.
arXiv Detail & Related papers (2021-01-22T16:14:38Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that, as a result, CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs [11.01184134911405]
Recurrent neural networks (RNNs) are powerful in the tasks oriented to sequential data, such as natural language processing and video recognition.
In this paper, we consider compressing RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition.
arXiv Detail & Related papers (2020-08-21T07:29:45Z) - Stable Low-rank Tensor Decomposition for Compression of Convolutional
Neural Network [19.717842489217684]
This paper is the first study on degeneracy in the tensor decomposition of convolutional kernels.
We present a novel method, which can stabilize the low-rank approximation of convolutional kernels and ensure efficient compression.
We evaluate our approach on popular CNN architectures for image classification and show that our method results in much lower accuracy degradation and provides consistent performance.
arXiv Detail & Related papers (2020-08-12T17:10:12Z) - T-Basis: a Compact Representation for Neural Networks [89.86997385827055]
We introduce T-Basis, a concept for a compact representation of a set of tensors, each of an arbitrary shape, which is often seen in Neural Networks.
We evaluate the proposed approach on the task of neural network compression and demonstrate that it reaches high compression rates at acceptable performance drops.
arXiv Detail & Related papers (2020-07-13T19:03:22Z) - Hybrid Tensor Decomposition in Neural Network Compression [13.146051056642904]
We introduce the hierarchical Tucker (HT) decomposition method to investigate its capability in neural network compression.
We experimentally discover that the HT format has better performance on compressing weight matrices, while the TT format is more suited for compressing convolutional kernels.
arXiv Detail & Related papers (2020-06-29T11:16:22Z) - On Recoverability of Randomly Compressed Tensors with Low CP Rank [29.00634848772122]
We show that if the number of measurements is on the same order of magnitude as that of the model parameters, then the tensor is recoverable.
Our proof is based on deriving a restricted isometry property (R.I.P.) under the CPD model via set covering techniques.
arXiv Detail & Related papers (2020-01-08T04:44:13Z)