Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions
- URL: http://arxiv.org/abs/2207.07417v3
- Date: Sun, 26 Nov 2023 06:10:27 GMT
- Title: Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions
- Authors: Arvind V. Mahankali, David P. Woodruff, Ziyu Zhang
- Abstract summary: We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions.
For tensor train decomposition, we give a bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \mathrm{nnz}(A))$ running time.
In addition, we extend our algorithm to tensor networks with arbitrary graphs.
- Score: 51.19236668224547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study low rank approximation of tensors, focusing on the tensor train and
Tucker decompositions, as well as approximations with tree tensor networks and
more general tensor networks. For tensor train decomposition, we give a
bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria
rank and $O(q \cdot \mathrm{nnz}(A))$ running time, up to lower order terms,
which improves
over the additive error algorithm of \cite{huber2017randomized}. We also show
how to convert the algorithm of \cite{huber2017randomized} into a relative
error algorithm, but their algorithm necessarily has a running time of
$O(qr^2 \cdot \mathrm{nnz}(A)) + n \cdot \mathrm{poly}(qk/\epsilon)$ when
converted to a $(1 + \epsilon)$-approximation algorithm with bicriteria rank
$r$. To the best of our
knowledge, our work is the first to achieve polynomial time relative error
approximation for tensor train decomposition. Our key technique is a method for
obtaining subspace embeddings with a number of rows polynomial in $q$ for a
matrix which is the flattening of a tensor train of $q$ tensors. We extend our
algorithm to tree tensor networks. In addition, we extend our algorithm to
tensor networks with arbitrary graphs (which we refer to as general tensor
networks), by using a result of
\cite{ms08_simulating_quantum_tensor_contraction} and showing that a general
tensor network of rank $k$ can be contracted to a binary tree network of rank
$k^{O(\deg(G)\,\mathrm{tw}(G))}$, allowing us to reduce to the case of tree tensor
networks. Finally, we give new fixed-parameter tractable algorithms for the
tensor train, Tucker, and CP decompositions, which are simpler than those of
\cite{swz19_tensor_low_rank} since they do not make use of polynomial system
solvers. Our technique of Gaussian subspace embeddings with exactly $k$ rows
(and thus exponentially small success probability) may be of independent
interest.
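The key tool described above is randomized sketching: compress a flattening of the tensor with a subspace embedding that has few rows, then solve the much smaller sketched problem. As a rough illustration only, the NumPy snippet below applies a generic Gaussian sketch-and-solve (randomized SVD) to a single flattening of an order-3 tensor; it is not the paper's bicriteria algorithm or its tensor-train subspace embeddings, and the helper names (`flatten`, `sketched_rank_k`) and the `oversample` parameter are illustrative assumptions.

```python
# Minimal, generic sketch-and-solve illustration (assumed setup, not the
# paper's algorithm): apply a Gaussian sketch to a flattening of a tensor,
# then recover a rank-k approximation from the much smaller sketched matrix.
import numpy as np

def flatten(tensor, mode):
    """Mode-`mode` matricization: rows indexed by `mode`, columns by the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def sketched_rank_k(A, k, oversample=5, rng=None):
    """Rank-k approximation of A via a Gaussian sketch (randomized SVD)."""
    rng = np.random.default_rng() if rng is None else rng
    n, m = A.shape
    S = rng.standard_normal((m, k + oversample))   # Gaussian sketching matrix
    Y = A @ S                                      # n x (k + oversample) sketch
    Q, _ = np.linalg.qr(Y)                         # orthonormal basis for range(Y)
    B = Q.T @ A                                    # small (k + oversample) x m problem
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub[:, :k]
    return U, s[:k], Vt[:k]                        # A ~= U @ diag(s) @ Vt

rng = np.random.default_rng(0)
T = rng.standard_normal((30, 30, 30))              # dense order-3 tensor
A = flatten(T, 0)                                  # 30 x 900 flattening
U, s, Vt = sketched_rank_k(A, k=5, rng=rng)
A_hat = U @ np.diag(s) @ Vt
print("relative error:", np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```

The paper's contribution, by contrast, is to design subspace embeddings whose number of rows stays polynomial in $q$ when the matrix being sketched is the flattening of a tensor train of $q$ tensors, which is what enables the near-linear $O(q \cdot \mathrm{nnz}(A))$ running time.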
Related papers
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Solving Tensor Low Cycle Rank Approximation [15.090593955414137]
We formulate a particular tensor low rank approximation problem, which we call tensor cycle rank.
For the classical tensor rank, Tucker rank, and train rank, this problem has been well studied in [Song, Woodruff, Zhong SODA 2019].
In this paper, we generalize the previous "rotation and sketch" technique of [Song, Woodruff, Zhong SODA 2019] and show an input sparsity time algorithm for cycle rank.
arXiv Detail & Related papers (2023-04-13T15:00:50Z) - Average-Case Complexity of Tensor Decomposition for Low-Degree
Polynomials [93.59919600451487]
"Statistical-computational gaps" occur in many statistical inference tasks.
We consider a model for random order-3 tensor decomposition where one component is slightly larger in norm than the rest.
We show that low-degree polynomials in the tensor entries can accurately estimate the largest component when $r \ll n^{3/2}$, but fail to do so when $r \gg n^{3/2}$.
arXiv Detail & Related papers (2022-11-10T00:40:37Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations $\sigma$.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) \leq C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Fast algorithm for overcomplete order-3 tensor decomposition [7.638467133689933]
We develop the first fast spectral algorithm to decompose a random third-order tensor over $\mathbb{R}^d$ of rank up to $O(d^{3/2}/\mathrm{polylog}(d))$.
Our algorithm can recover all components in time $O(d^{6.05})$ under the current matrix multiplication time.
arXiv Detail & Related papers (2022-02-14T00:31:34Z) - The Complexity of Sparse Tensor PCA [1.90365714903665]
For any $1 \leq t \leq k$, our algorithms recover the sparse vector for signal-to-noise ratio $\lambda \geq \tilde{\mathcal{O}}(\sqrt{t} \cdot (k/t)^{p/2})$ in time.
Even in the restricted case of sparse PCA, known algorithms only recover the sparse vectors for $\lambda \geq \tilde{\mathcal{O}}(k \cdot r)$, while our algorithms require $\lambda \geq$
arXiv Detail & Related papers (2021-06-11T10:57:00Z) - Exact nuclear norm, completion and decomposition for random overcomplete
tensors via degree-4 SOS [0.7233897166339269]
We show that simple semidefinite programs inspired by degree $4$ SOS can exactly solve the tensor nuclear norm, tensor decomposition, and tensor completion problems on tensors with random asymmetric components.
We show that w.h.p. these semidefinite programs can exactly find the nuclear norm and components of an $(n \times n \times n)$-tensor $\mathcal{T}$ with $m \leq n^{3/2}/\mathrm{polylog}(n)$ random asymmetric components.
arXiv Detail & Related papers (2020-11-18T17:27:36Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on an over-parameterized objective can go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK [58.5766737343951]
We consider the dynamics of gradient descent for learning a two-layer neural network.
We show that an over-parameterized two-layer neural network trained by gradient descent can provably achieve small loss with respect to the ground truth, using fewer samples than the Neural Tangent Kernel analysis requires.
arXiv Detail & Related papers (2020-07-09T07:09:28Z)