Machine learning with tree tensor networks, CP rank constraints, and
tensor dropout
- URL: http://arxiv.org/abs/2305.19440v1
- Date: Tue, 30 May 2023 22:22:24 GMT
- Title: Machine learning with tree tensor networks, CP rank constraints, and
tensor dropout
- Authors: Hao Chen and Thomas Barthel
- Abstract summary: We show how tree tensor networks (TTN) with CP rank constraints and tensor dropout can be used in machine learning.
A low-rank TTN classifier with branching ratio $b=4$ reaches a test set accuracy of 90.3% at low computation costs.
- Score: 6.385624548310884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tensor networks approximate order-$N$ tensors with a reduced number of
degrees of freedom that is only polynomial in $N$ and arranged as a network of
partially contracted smaller tensors. As suggested in [arXiv:2205.15296] in the
context of quantum many-body physics, computation costs can be further
substantially reduced by imposing constraints on the canonical polyadic (CP)
rank of the tensors in such networks. Here we demonstrate how tree tensor
networks (TTN) with CP rank constraints and tensor dropout can be used in
machine learning. The approach is found to outperform other tensor-network
based methods in Fashion-MNIST image classification. A low-rank TTN classifier
with branching ratio $b=4$ reaches test set accuracy 90.3\% with low
computation costs. Consisting of mostly linear elements, tensor network
classifiers avoid the vanishing gradient problem of deep neural networks. The
CP rank constraints have additional advantages: The number of parameters can be
decreased and tuned more freely to control overfitting, improve generalization
properties, and reduce computation costs. They allow us to employ trees with
large branching ratios, which substantially improves the representation power.
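To make the CP-rank-constrained node concrete, here is a minimal NumPy sketch of how a single TTN node with branching ratio $b$, bond dimension $m$, and CP rank $r$ could contract its child feature vectors, with tensor dropout applied to the rank-1 components. This is an illustrative sketch under assumed conventions, not the authors' implementation; the function name, shapes, and dropout scaling are hypothetical.

```python
import numpy as np

def cp_node_forward(child_vecs, factors, out_factor, drop_prob=0.0, rng=None):
    """Contract one CP-rank-constrained TTN node with its b child feature vectors.

    The full node tensor T[i1, ..., ib, o] is never formed explicitly; it is
    represented by CP factors:
        T = sum_k factors[0][:, k] x ... x factors[b-1][:, k] x out_factor[:, k].
    Contraction then costs O(b * m * r) instead of O(m**b * m_out).
    (Hypothetical sketch; names and conventions are assumptions.)
    """
    rng = rng or np.random.default_rng()
    r = out_factor.shape[1]
    # elementwise product over children of (child_vec . factor) per rank-1 component
    comp = np.ones(r)
    for v, A in zip(child_vecs, factors):
        comp *= v @ A                      # shape (r,)
    if drop_prob > 0.0:                    # tensor dropout: drop rank-1 components
        mask = rng.random(r) >= drop_prob
        comp = comp * mask / (1.0 - drop_prob)
    return out_factor @ comp               # output feature vector, shape (m_out,)

# toy usage: branching ratio b=4, bond dimension m=8, CP rank r=6, output dim 8
b, m, r, m_out = 4, 8, 6, 8
rng = np.random.default_rng(0)
factors = [rng.normal(size=(m, r)) for _ in range(b)]
out_factor = rng.normal(size=(m_out, r))
children = [rng.normal(size=m) for _ in range(b)]
y = cp_node_forward(children, factors, out_factor, drop_prob=0.2, rng=rng)
print(y.shape)  # (8,)
```

Because each node only stores $b+1$ factor matrices of size $m \times r$, the parameter count per node grows linearly in the branching ratio, which is what makes large $b$ affordable in this scheme.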
Related papers
- Tensor cumulants for statistical inference on invariant distributions [49.80012009682584]
We show that PCA becomes computationally hard at a critical value of the signal's magnitude.
We define a new set of objects, which provide an explicit, near-orthogonal basis for invariants of a given degree.
It also lets us analyze a new problem of distinguishing between different ensembles.
arXiv Detail & Related papers (2024-04-29T14:33:24Z) - The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich
Regimes [75.59720049837459]
We study the transition from infinite-width behavior to this variance-limited regime as a function of sample size $P$ and network width $N$.
We find that finite-size effects can become relevant for very small datasets on the order of $P^* \sim \sqrt{N}$ for regression with ReLU networks.
arXiv Detail & Related papers (2022-12-23T04:48:04Z) - Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions [51.19236668224547]
We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions.
For tensor train decomposition, we give a bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \mathrm{nnz}(A))$ running time.
In addition, we extend our algorithm to tensor networks with arbitrary graphs.
arXiv Detail & Related papers (2022-07-15T11:55:09Z) - Tensor Network States with Low-Rank Tensors [6.385624548310884]
We introduce the idea of imposing low-rank constraints on the tensors that compose the tensor network.
With this modification, the time and memory complexities for the network optimization can be substantially reduced.
We find that choosing the tensor rank $r$ to be on the order of the bond dimension $m$ is sufficient to obtain high-accuracy ground-state approximations.
arXiv Detail & Related papers (2022-05-30T17:58:16Z) - Stack operation of tensor networks [10.86105335102537]
We propose a mathematically rigorous definition for the tensor network stack approach.
We illustrate the main ideas with matrix product state based machine learning as an example.
arXiv Detail & Related papers (2022-03-28T12:45:13Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on over-parametrized objective could go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Towards Compact Neural Networks via End-to-End Training: A Bayesian
Tensor Approach with Automatic Rank Determination [11.173092834726528]
It is desirable to directly train a compact neural network from scratch with low memory and low computational cost.
Low-rank tensor decomposition is one of the most effective approaches to reduce the memory and computing requirements of large-size neural networks.
This paper presents a novel end-to-end framework for low-rank tensorized training of neural networks.
arXiv Detail & Related papers (2020-10-17T01:23:26Z) - T-Basis: a Compact Representation for Neural Networks [89.86997385827055]
We introduce T-Basis, a concept for a compact representation of a set of tensors, each of arbitrary shape, as often seen in neural networks.
We evaluate the proposed approach on the task of neural network compression and demonstrate that it reaches high compression rates at acceptable performance drops.
arXiv Detail & Related papers (2020-07-13T19:03:22Z) - Entanglement and Tensor Networks for Supervised Image Classification [0.0]
We revisit the use of tensor networks for supervised image classification using the MNIST data set of handwritten digits.
We propose a plausible candidate state $|\Sigma_\ell\rangle$ and investigate its entanglement properties.
We conclude that $|\Sigma_\ell\rangle$ is so robustly entangled that it cannot be approximated by the tensor network used in that work.
arXiv Detail & Related papers (2020-07-12T20:09:26Z) - Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$Nets are a new class of function approximators based on polynomial expansions.
$\Pi$Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification, and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.