Machine learning with tree tensor networks, CP rank constraints, and
tensor dropout
- URL: http://arxiv.org/abs/2305.19440v1
- Date: Tue, 30 May 2023 22:22:24 GMT
- Title: Machine learning with tree tensor networks, CP rank constraints, and
tensor dropout
- Authors: Hao Chen and Thomas Barthel
- Abstract summary: We show how tree tensor networks (TTN) with CP rank constraints and tensor dropout can be used in machine learning.
A low-rank TTN classifier with branching ratio $b=4$ reaches a test set accuracy of 90.3% at low computation costs.
- Score: 6.385624548310884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tensor networks approximate order-$N$ tensors with a reduced number of
degrees of freedom that is only polynomial in $N$ and arranged as a network of
partially contracted smaller tensors. As suggested in [arXiv:2205.15296] in the
context of quantum many-body physics, computation costs can be further
substantially reduced by imposing constraints on the canonical polyadic (CP)
rank of the tensors in such networks. Here we demonstrate how tree tensor
networks (TTN) with CP rank constraints and tensor dropout can be used in
machine learning. The approach is found to outperform other tensor-network
based methods in Fashion-MNIST image classification. A low-rank TTN classifier
with branching ratio $b=4$ reaches test set accuracy 90.3\% with low
computation costs. Consisting of mostly linear elements, tensor network
classifiers avoid the vanishing gradient problem of deep neural networks. The
CP rank constraints have additional advantages: The number of parameters can be
decreased and tuned more freely to control overfitting, improve generalization
properties, and reduce computation costs. They allow us to employ trees with
large branching ratios, which substantially improves the representation power.
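To make the CP-rank-constrained node concrete, here is a minimal NumPy sketch of how a single TTN node with branching ratio $b$, bond dimension $m$, and CP rank $r$ could contract its child feature vectors, with tensor dropout applied to the rank-1 components. This is an illustrative sketch under assumed conventions, not the authors' implementation; the function name, shapes, and dropout scaling are hypothetical.

```python
import numpy as np

def cp_node_forward(child_vecs, factors, out_factor, drop_prob=0.0, rng=None):
    """Contract one CP-rank-constrained TTN node with its b child feature vectors.

    The full node tensor T[i1, ..., ib, o] is never formed explicitly; it is
    represented by CP factors:
        T = sum_k factors[0][:, k] x ... x factors[b-1][:, k] x out_factor[:, k].
    Contraction then costs O(b * m * r) instead of O(m**b * m_out).
    (Hypothetical sketch; names and conventions are assumptions.)
    """
    rng = rng or np.random.default_rng()
    r = out_factor.shape[1]
    # elementwise product over children of (child_vec . factor) per rank-1 component
    comp = np.ones(r)
    for v, A in zip(child_vecs, factors):
        comp *= v @ A                      # shape (r,)
    if drop_prob > 0.0:                    # tensor dropout: drop rank-1 components
        mask = rng.random(r) >= drop_prob
        comp = comp * mask / (1.0 - drop_prob)
    return out_factor @ comp               # output feature vector, shape (m_out,)

# toy usage: branching ratio b=4, bond dimension m=8, CP rank r=6, output dim 8
b, m, r, m_out = 4, 8, 6, 8
rng = np.random.default_rng(0)
factors = [rng.normal(size=(m, r)) for _ in range(b)]
out_factor = rng.normal(size=(m_out, r))
children = [rng.normal(size=m) for _ in range(b)]
y = cp_node_forward(children, factors, out_factor, drop_prob=0.2, rng=rng)
print(y.shape)  # (8,)
```

Because each node only stores $b+1$ factor matrices of size $m \times r$, the parameter count per node grows linearly in the branching ratio, which is what makes large $b$ affordable in this scheme.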
Related papers
- Tensor cumulants for statistical inference on invariant distributions [49.80012009682584]
We show that PCA becomes computationally hard at a critical value of the signal's magnitude.
We define a new set of objects, which provide an explicit, near-orthogonal basis for invariants of a given degree.
It also lets us analyze a new problem of distinguishing between different ensembles.
arXiv Detail & Related papers (2024-04-29T14:33:24Z) - The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich
Regimes [75.59720049837459]
We study the transition from infinite-width behavior to this variance-limited regime as a function of sample size $P$ and network width $N$.
We find that finite-size effects can become relevant for very small datasets on the order of $P^* \sim \sqrt{N}$ for regression with ReLU networks.
arXiv Detail & Related papers (2022-12-23T04:48:04Z) - Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions [51.19236668224547]
We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions.
For tensor train decomposition, we give a bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \mathrm{nnz}(A))$ running time.
In addition, we extend our algorithm to tensor networks with arbitrary graphs.
arXiv Detail & Related papers (2022-07-15T11:55:09Z) - Tensor Network States with Low-Rank Tensors [6.385624548310884]
We introduce the idea of imposing low-rank constraints on the tensors that compose the tensor network.
With this modification, the time and memory complexities for the network optimization can be substantially reduced.
We find that choosing the tensor rank $r$ to be on the order of the bond dimension $m$ is sufficient to obtain high-accuracy ground-state approximations.
arXiv Detail & Related papers (2022-05-30T17:58:16Z) - Stack operation of tensor networks [10.86105335102537]
We propose a mathematically rigorous definition for the tensor network stack approach.
We illustrate the main ideas with matrix product state based machine learning as an example.
arXiv Detail & Related papers (2022-03-28T12:45:13Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on over-parametrized objective could go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Towards Compact Neural Networks via End-to-End Training: A Bayesian
Tensor Approach with Automatic Rank Determination [11.173092834726528]
It is desirable to directly train a compact neural network from scratch with low memory and low computational cost.
Low-rank tensor decomposition is one of the most effective approaches to reduce the memory and computing requirements of large-size neural networks.
This paper presents a novel end-to-end framework for low-rank tensorized training of neural networks.
arXiv Detail & Related papers (2020-10-17T01:23:26Z) - T-Basis: a Compact Representation for Neural Networks [89.86997385827055]
We introduce T-Basis, a concept for a compact representation of a set of tensors, each of arbitrary shape, as often seen in neural networks.
We evaluate the proposed approach on the task of neural network compression and demonstrate that it reaches high compression rates at acceptable performance drops.
arXiv Detail & Related papers (2020-07-13T19:03:22Z) - Entanglement and Tensor Networks for Supervised Image Classification [0.0]
We revisit the use of tensor networks for supervised image classification using the MNIST data set of handwritten digits.
We propose a plausible candidate state $|\Sigma_\ell\rangle$ and investigate its entanglement properties.
We conclude that $|\Sigma_\ell\rangle$ is so robustly entangled that it cannot be approximated by the tensor network used in that work.
arXiv Detail & Related papers (2020-07-12T20:09:26Z) - Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$Nets are a new class of function approximators based on polynomial expansions.
$\Pi$Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification, and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.