Overcomplete Tensor Decomposition via Koszul-Young Flattenings
- URL: http://arxiv.org/abs/2411.14344v1
- Date: Thu, 21 Nov 2024 17:41:09 GMT
- Title: Overcomplete Tensor Decomposition via Koszul-Young Flattenings
- Authors: Pravesh K. Kothari, Ankur Moitra, Alexander S. Wein
- Abstract summary: We give a new algorithm for decomposing an $n_1 \times n_2 \times n_3$ tensor as the sum of a minimal number of rank-1 terms.
We show that an even more general class of degree-$d$ polynomial flattenings cannot surpass rank $Cn$ for a constant $C = C(d)$.
- Score: 63.01248796170617
- License:
- Abstract: Motivated by connections between algebraic complexity lower bounds and tensor decompositions, we investigate Koszul-Young flattenings, which are the main ingredient in recent lower bounds for matrix multiplication. Based on this tool we give a new algorithm for decomposing an $n_1 \times n_2 \times n_3$ tensor as the sum of a minimal number of rank-1 terms, and certifying uniqueness of this decomposition. For $n_1 \le n_2 \le n_3$ with $n_1 \to \infty$ and $n_3/n_2 = O(1)$, our algorithm is guaranteed to succeed when the tensor rank is bounded by $r \le (1-\epsilon)(n_2 + n_3)$ for an arbitrary $\epsilon > 0$, provided the tensor components are generically chosen. For any fixed $\epsilon$, the runtime is polynomial in $n_3$. When $n_2 = n_3 = n$, our condition on the rank gives a factor-of-2 improvement over the classical simultaneous diagonalization algorithm, which requires $r \le n$, and also improves on the recent algorithm of Koiran (2024) which requires $r \le 4n/3$. It also improves on the PhD thesis of Persu (2018) which solves rank detection for $r \leq 3n/2$. We complement our upper bounds by showing limitations, in particular that no flattening of the style we consider can surpass rank $n_2 + n_3$. Furthermore, for $n \times n \times n$ tensors, we show that an even more general class of degree-$d$ polynomial flattenings cannot surpass rank $Cn$ for a constant $C = C(d)$. This suggests that for tensor decompositions, the case of generic components may be fundamentally harder than that of random components, where efficient decomposition is possible even in highly overcomplete settings.
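For concreteness, below is a minimal NumPy sketch of the classical Koszul flattening, the simplest member of the flattening family the paper builds on (the Koszul-Young variant itself is more elaborate; all names and the small test at the end are illustrative). Since each rank-1 term contributes at most $\binom{n_1-1}{p}$ to the flattening's rank, the observed rank of the flattening certifies a lower bound on tensor rank:

```python
from itertools import combinations
import numpy as np

def koszul_flattening(T, p):
    # p-th Koszul flattening of T along mode 1, as a matrix for the map
    #   Lambda^p C^{n1} (x) (C^{n2})^*  ->  Lambda^{p+1} C^{n1} (x) C^{n3},
    #   w (x) f_j^*  |->  sum_{i,k} T[i,j,k] (e_i ^ w) (x) g_k.
    n1, n2, n3 = T.shape
    rows = {S: idx for idx, S in enumerate(combinations(range(n1), p + 1))}
    cols = list(combinations(range(n1), p))
    M = np.zeros((len(rows) * n3, len(cols) * n2))
    for c, S in enumerate(cols):
        for i in range(n1):
            if i in S:
                continue  # e_i ^ e_S = 0 when i already appears in S
            sign = (-1) ** sum(s < i for s in S)  # sign from sorting e_i in
            r = rows[tuple(sorted(S + (i,)))]
            M[r * n3:(r + 1) * n3, c * n2:(c + 1) * n2] = sign * T[i].T
    return M

# Any r-term decomposition forces r >= rank(M) / binom(n1-1, p).
n, r, p = 6, 8, 1
A, B, C = (np.random.randn(n, r) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', A, B, C)    # generic rank-r tensor
M = koszul_flattening(T, p)
print(np.linalg.matrix_rank(M) / (n - 1))  # certified rank lower bound
```

With $p = 1$ this already certifies ranks above $n$ for generic $n \times n \times n$ tensors, which is exactly the style of bound the paper shows cannot be pushed past $n_2 + n_3$ by any flattening of this kind.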
Related papers
- How to Capture Higher-order Correlations? Generalizing Matrix Softmax
Attention to Kronecker Computation [12.853829771559916]
We study a generalization of attention which captures triple-wise correlations.
This generalization is able to solve problems about detecting triple-wise connections that were shown to be impossible for transformers.
We show that our construction, algorithms, and lower bounds naturally generalize to higher-order tensors and correlations.
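A hedged sketch of what such a triple-wise mechanism could look like, assuming queries score ordered pairs of keys through a Kronecker product (the paper's exact parameterization may differ; shapes and the pair-value aggregation below are assumptions):

```python
import numpy as np

def triple_attention(Q, K, V):
    # Each query q_i attends to ordered *pairs* of positions (j, l) through
    # the Kronecker product of their key vectors, so the attention scores
    # encode 3-way (query, key, key) correlations rather than pairwise ones.
    n, d = K.shape
    pair_keys = np.einsum('ja,lb->jlab', K, K).reshape(n * n, d * d)
    scores = (Q @ pair_keys.T) / d                 # (n, n*n)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)              # row-wise softmax
    # aggregate one value per pair; summing v_j + v_l is one simple choice
    pair_vals = (V[:, None, :] + V[None, :, :]).reshape(n * n, -1)
    return w @ pair_vals                           # (n, d_v)

rng = np.random.default_rng(0)
n, d, dv = 5, 4, 3
out = triple_attention(rng.standard_normal((n, d * d)),  # queries in R^{d^2}
                       rng.standard_normal((n, d)),
                       rng.standard_normal((n, dv)))
```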
arXiv Detail & Related papers (2023-10-06T07:42:39Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Solving Tensor Low Cycle Rank Approximation [15.090593955414137]
We formulate a particular tensor low-rank approximation problem, which we call tensor cycle rank.
The classical tensor rank, Tucker rank, and train rank have been well studied in [Song, Woodruff, Zhong SODA 2019].
In this paper, we generalize the previous "rotation and sketch" technique of [Song, Woodruff, Zhong SODA 2019] and show an input-sparsity-time algorithm for cycle rank.
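For intuition, a tensor of cycle rank at most $k$ factors as a tensor network whose three cores are contracted around a triangle; a minimal sketch under that assumed formulation (core names are illustrative):

```python
import numpy as np

# Cycle rank <= k: T[i,j,l] = sum_{a,b,c} U[a,i,b] V[b,j,c] W[c,l,a],
# i.e. three cores contracted around a cycle of bond dimension k.
n, k = 8, 3
rng = np.random.default_rng(0)
U, V, W = (rng.standard_normal((k, n, k)) for _ in range(3))
T = np.einsum('aib,bjc,cla->ijl', U, V, W)
```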
arXiv Detail & Related papers (2023-04-13T15:00:50Z) - Average-Case Complexity of Tensor Decomposition for Low-Degree
Polynomials [93.59919600451487]
"Statistical-computational gaps" occur in many statistical inference tasks.
We consider a model for random order-3 decomposition where one component is slightly larger in norm than the rest.
We show that low-degree polynomials in the tensor entries can accurately estimate the largest component when $r \ll n^{3/2}$ but fail to do so when $r \gg n^{3/2}$.
arXiv Detail & Related papers (2022-11-10T00:40:37Z) - Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions [51.19236668224547]
We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions.
For tensor train decomposition, we give a bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \mathrm{nnz}(A))$ running time.
In addition, we extend our algorithm to tensor networks with arbitrary graphs.
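For reference, a minimal sketch of the order-3 tensor train format that the bicriteria guarantee targets (core names and sizes are illustrative):

```python
import numpy as np

# Order-3 tensor train with TT-ranks (r1, r2):
# T[i,j,l] = sum_{a,b} G1[i,a] G2[a,j,b] G3[b,l].
n, r1, r2 = 10, 4, 4
rng = np.random.default_rng(0)
G1 = rng.standard_normal((n, r1))
G2 = rng.standard_normal((r1, n, r2))
G3 = rng.standard_normal((r2, n))
T = np.einsum('ia,ajb,bl->ijl', G1, G2, G3)
```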
arXiv Detail & Related papers (2022-07-15T11:55:09Z) - Monogamy of entanglement between cones [68.8204255655161]
We show that monogamy is not only a feature of quantum theory, but that it characterizes the minimal tensor product of general pairs of convex cones.
Our proof makes use of a new characterization of products of simplices up to affine equivalence.
arXiv Detail & Related papers (2022-06-23T16:23:59Z) - Exact nuclear norm, completion and decomposition for random overcomplete
tensors via degree-4 SOS [0.7233897166339269]
We show that simple semidefinite programs inspired by degree-$4$ SOS can exactly solve the tensor nuclear norm, tensor decomposition, and tensor completion problems on tensors with random asymmetric components.
We show that w.h.p. these semidefinite programs can exactly find the nuclear norm and components of an $(n \times n \times n)$-tensor $\mathcal{T}$ with $m \leq n^{3/2}/\mathrm{polylog}(n)$ random asymmetric components.
arXiv Detail & Related papers (2020-11-18T17:27:36Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on an over-parameterized objective can go beyond the lazy training regime and utilize certain low-rank structure in the data.
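A minimal sketch of the over-parameterized setup, assuming a symmetric order-3 target fit by plain gradient descent on the squared residual (sizes, initialization scale, and step size below are illustrative choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m = 8, 3, 64                    # true rank r, m >> r learned components
gt = rng.standard_normal((r, n))
T = np.einsum('pa,pb,pc->abc', gt, gt, gt)     # symmetric rank-r target

U = 0.1 * rng.standard_normal((m, n))          # small over-parameterized init
lr = 0.005
for _ in range(3000):
    R = np.einsum('pa,pb,pc->abc', U, U, U) - T        # residual tensor
    # gradient of ||R||_F^2 with respect to each component row u_p
    G = 2 * (np.einsum('abc,pb,pc->pa', R, U, U)
             + np.einsum('abc,pa,pc->pb', R, U, U)
             + np.einsum('abc,pa,pb->pc', R, U, U))
    U -= lr * G
# fitting error typically drops far below the scale of ||T||_F
print(np.linalg.norm(R), np.linalg.norm(T))
```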
arXiv Detail & Related papers (2020-10-22T00:32:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.