Machine learning with tree tensor networks, CP rank constraints, and
tensor dropout
- URL: http://arxiv.org/abs/2305.19440v1
- Date: Tue, 30 May 2023 22:22:24 GMT
- Title: Machine learning with tree tensor networks, CP rank constraints, and
tensor dropout
- Authors: Hao Chen and Thomas Barthel
- Abstract summary: We show how tree tensor networks (TTN) with CP rank constraints and tensor dropout can be used in machine learning.
A low-rank TTN classifier with branching ratio $b=4$ reaches test set accuracy 90.3% with low computation costs.
- Score: 6.385624548310884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tensor networks approximate order-$N$ tensors with a reduced number of
degrees of freedom that is only polynomial in $N$ and arranged as a network of
partially contracted smaller tensors. As suggested in [arXiv:2205.15296] in the
context of quantum many-body physics, computation costs can be further
substantially reduced by imposing constraints on the canonical polyadic (CP)
rank of the tensors in such networks. Here we demonstrate how tree tensor
networks (TTN) with CP rank constraints and tensor dropout can be used in
machine learning. The approach is found to outperform other tensor-network
based methods in Fashion-MNIST image classification. A low-rank TTN classifier
with branching ratio $b=4$ reaches test set accuracy 90.3\% with low
computation costs. Consisting of mostly linear elements, tensor network
classifiers avoid the vanishing gradient problem of deep neural networks. The
CP rank constraints have additional advantages: The number of parameters can be
decreased and tuned more freely to control overfitting, improve generalization
properties, and reduce computation costs. They allow us to employ trees with
large branching ratios which substantially improves the representation power.
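To make the construction concrete, below is a minimal NumPy sketch of a CP-rank-constrained tree tensor node with tensor dropout. It is an illustration based only on the abstract, not the authors' implementation: the local feature map, initialization, the specific ranks, and the choice to drop whole rank-1 CP terms during training are assumptions, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_map(x):
    """Map pixel values in [0, 1] to 2-dimensional local feature vectors
    (a common choice for tensor-network classifiers; the paper may use a
    different local encoding)."""
    return np.stack([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)], axis=-1)

def cp_node(b, d_in, d_out, rank):
    """A tree node with b children, stored in CP form as `rank` rank-1 terms:
    one (d_out,) output factor and b (d_in,) input factors per term."""
    return {
        "out": rng.normal(scale=0.5, size=(rank, d_out)),
        "in": rng.normal(scale=0.5, size=(rank, b, d_in)),
    }

def contract_node(node, child_vecs, drop_prob=0.0, training=False):
    """Contract b child vectors through a CP-constrained node.
    Tensor dropout (as assumed here) removes whole rank-1 terms during
    training.  Cost is roughly O(rank * b * d_in) instead of the
    O(d_out * d_in**b) of a dense node tensor."""
    rank = node["out"].shape[0]
    coeffs = np.ones(rank)                   # coeffs[k] = prod_j <in[k, j], child_j>
    for j, v in enumerate(child_vecs):
        coeffs *= node["in"][:, j, :] @ v
    if training and drop_prob > 0.0:
        keep = rng.random(rank) > drop_prob  # drop rank-1 CP terms at random
        coeffs = coeffs * keep / (1.0 - drop_prob)
    return coeffs @ node["out"]

# Toy forward pass: 16 "pixels", branching ratio b = 4, two tree layers,
# bond dimension m = 6, CP rank r = 6, 10 output classes.
b, m, r, n_classes = 4, 6, 6, 10
x = rng.random(16)
leaves = list(feature_map(x))                # 16 local vectors of dimension 2

layer1 = [cp_node(b, 2, m, r) for _ in range(4)]
hidden = [contract_node(n, leaves[4 * i:4 * i + 4], drop_prob=0.3, training=True)
          for i, n in enumerate(layer1)]

top = cp_node(b, m, n_classes, r)
logits = contract_node(top, hidden)          # unnormalized class scores, shape (10,)
```

The parameter saving in this sketch illustrates the abstract's point: a dense node with $b=4$ child bonds of dimension $m$ and output dimension $m$ would hold $m^5$ entries, whereas the CP form above holds only $r(4m + m)$, which is what makes trees with large branching ratios affordable.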
Related papers
- Tensor cumulants for statistical inference on invariant distributions [49.80012009682584]
We show that PCA becomes computationally hard at a critical value of the signal's magnitude.
We define a new set of objects, which provide an explicit, near-orthogonal basis for invariants of a given degree.
It also lets us analyze a new problem of distinguishing between different ensembles.
arXiv Detail & Related papers (2024-04-29T14:33:24Z) - The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich
Regimes [75.59720049837459]
We study the transition from infinite-width behavior to this variance-limited regime as a function of sample size $P$ and network width $N$.
We find that finite-size effects can become relevant for very small datasets on the order of $P^* \sim \sqrt{N}$ for regression with ReLU networks.
arXiv Detail & Related papers (2022-12-23T04:48:04Z) - Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions [51.19236668224547]
We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions.
For tensor train decomposition, we give a bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \mathrm{nnz}(A))$ running time.
In addition, we extend our algorithm to tensor networks with arbitrary graphs.
arXiv Detail & Related papers (2022-07-15T11:55:09Z) - Tensor Network States with Low-Rank Tensors [6.385624548310884]
We introduce the idea of imposing low-rank constraints on the tensors that compose the tensor network.
With this modification, the time and memory complexities for the network optimization can be substantially reduced.
We find that choosing the tensor rank $r$ to be on the order of the bond dimension $m$ is sufficient to obtain high-accuracy ground-state approximations.
arXiv Detail & Related papers (2022-05-30T17:58:16Z) - Low-Rank+Sparse Tensor Compression for Neural Networks [11.632913694957868]
We propose to combine low-rank tensor decomposition with sparse pruning in order to take advantage of both coarse and fine structure for compression.
We compress weights in SOTA architectures (MobileNetv3, EfficientNet, Vision Transformer) and compare this approach to sparse pruning and tensor decomposition alone.
arXiv Detail & Related papers (2021-11-02T15:55:07Z) - Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments.
In this paper, we are the first to study training an N:M fine-grained structured sparse network from scratch.
arXiv Detail & Related papers (2021-02-08T05:55:47Z) - Tensor-Train Networks for Learning Predictive Modeling of
Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks with the aim of obtaining a powerful, compact representation.
An algorithm based on alternating least squares has been proposed for approximating the weights in TT-format with reduced computational cost.
arXiv Detail & Related papers (2021-01-22T16:14:38Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on an over-parametrized objective can go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Towards Compact Neural Networks via End-to-End Training: A Bayesian
Tensor Approach with Automatic Rank Determination [11.173092834726528]
It is desirable to directly train a compact neural network from scratch with low memory and low computational cost.
Low-rank tensor decomposition is one of the most effective approaches to reduce the memory and computing requirements of large-size neural networks.
This paper presents a novel end-to-end framework for low-rank tensorized training of neural networks.
arXiv Detail & Related papers (2020-10-17T01:23:26Z) - T-Basis: a Compact Representation for Neural Networks [89.86997385827055]
We introduce T-Basis, a concept for a compact representation of a set of tensors, each of an arbitrary shape, which is often seen in Neural Networks.
We evaluate the proposed approach on the task of neural network compression and demonstrate that it reaches high compression rates at acceptable performance drops.
arXiv Detail & Related papers (2020-07-13T19:03:22Z) - Entanglement and Tensor Networks for Supervised Image Classification [0.0]
We revisit the use of tensor networks for supervised image classification using the MNIST data set of handwritten digits.
We propose a plausible candidate state $|\Sigma_\ell\rangle$ and investigate its entanglement properties.
We conclude that $|\Sigma_\ell\rangle$ is so robustly entangled that it cannot be approximated by the tensor network used in that work.
arXiv Detail & Related papers (2020-07-12T20:09:26Z)