Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery
- URL: http://arxiv.org/abs/2401.02592v1
- Date: Fri, 5 Jan 2024 01:17:16 GMT
- Title: Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery
- Authors: Zhen Qin, Michael B. Wakin, and Zhihui Zhu
- Abstract summary: We provide the first convergence guarantee for the factorization approach.
We optimize over the so-called left-orthogonal TT format.
We prove that RGD can reliably recover the ground truth at a linear rate.
- Score: 30.876260188209105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we provide the first convergence guarantee for the
factorization approach to tensor train (TT) recovery. Specifically, to avoid the
scaling ambiguity and to
facilitate theoretical analysis, we optimize over the so-called left-orthogonal
TT format which enforces orthonormality among most of the factors. To ensure
the orthonormal structure, we utilize the Riemannian gradient descent (RGD) for
optimizing those factors over the Stiefel manifold. We first delve into the TT
factorization problem and establish the local linear convergence of RGD.
Notably, the rate of convergence only experiences a linear decline as the
tensor order increases. We then study the sensing problem that aims to recover
a TT format tensor from linear measurements. Assuming the sensing operator
satisfies the restricted isometry property (RIP), we show that with a proper
initialization, which could be obtained through spectral initialization, RGD
also converges to the ground-truth tensor at a linear rate. Furthermore, we
expand our analysis to encompass scenarios involving Gaussian noise in the
measurements. We prove that RGD can reliably recover the ground truth at a
linear rate, with the recovery error exhibiting only polynomial growth in
relation to the tensor order. We conduct various experiments to validate our
theoretical findings.
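As a rough illustration of the update the abstract describes, the sketch below performs Riemannian gradient descent on the Stiefel manifold (matrices with orthonormal columns), which is the constraint imposed on the left-orthogonal TT factors: project the Euclidean gradient onto the tangent space, take a step, and retract back via QR. The toy objective, dimensions, and function names are illustrative assumptions, not the paper's implementation.
```python
# A minimal, hypothetical sketch (not the authors' code): Riemannian gradient
# descent on the Stiefel manifold St(n, p) = {X : X^T X = I}, the constraint
# placed on the left-orthogonal TT factors. The toy objective
# f(X) = 0.5 * ||X - A||_F^2 is only for illustration.
import numpy as np

def stiefel_rgd_step(X, euclid_grad, step_size):
    """One RGD step: project the Euclidean gradient onto the tangent space
    at X, move along it, and retract back to the manifold via QR."""
    # Tangent-space projection (Euclidean metric): G - X * sym(X^T G).
    XtG = X.T @ euclid_grad
    riem_grad = euclid_grad - X @ (XtG + XtG.T) / 2
    # QR retraction keeps the iterate orthonormal: X^T X = I exactly.
    Q, R = np.linalg.qr(X - step_size * riem_grad)
    # Resolve the QR sign ambiguity so the retraction varies continuously.
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

# Toy usage with a hypothetical target A.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
X, _ = np.linalg.qr(rng.standard_normal((8, 3)))
for _ in range(200):
    X = stiefel_rgd_step(X, X - A, step_size=0.5)  # grad of 0.5*||X - A||_F^2
print(np.allclose(X.T @ X, np.eye(3)))             # orthonormality preserved
```
QR is just one common retraction choice here; the point of the sketch is only that each iterate stays exactly orthonormal, which is how the orthonormal structure of the left-orthogonal TT format can be maintained during optimization.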
Related papers
- Computational and Statistical Guarantees for Tensor-on-Tensor Regression with Tensor Train Decomposition [27.29463801531576]
We study the theoretical and algorithmic aspects of the TT-based ToT regression model.
We propose two algorithms to efficiently find solutions to constrained error bounds.
We establish the linear convergence rate of both IHT and RGD.
arXiv Detail & Related papers (2024-06-10T03:51:38Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - The Inductive Bias of Flatness Regularization for Deep Matrix
Factorization [58.851514333119255]
This work takes the first step toward understanding the inductive bias of the minimum trace of the Hessian solutions in deep linear networks.
We show that for all depths greater than one, with the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of the Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix parameters.
arXiv Detail & Related papers (2023-06-22T23:14:57Z) - Tensor-on-Tensor Regression: Riemannian Optimization,
Over-parameterization, Statistical-computational Gap, and Their Interplay [9.427635404752936]
We study the tensor-on-tensor regression, where the goal is to connect tensor responses to tensor covariates with a low Tucker rank parameter tensor/matrix.
We propose two methods to cope with the challenge of unknown rank.
We provide the first convergence guarantee for the general tensor-on-tensor regression.
arXiv Detail & Related papers (2022-06-17T13:15:27Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation
from Incomplete Measurements [30.395874385570007]
A fundamental task is to faithfully recover tensors from highly incomplete measurements.
We develop an algorithm to directly recover the tensor factors in the Tucker decomposition.
We show that it provably converges at a linear rate independent of the condition number of the ground truth tensor for two canonical problems.
arXiv Detail & Related papers (2021-04-29T17:44:49Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that of ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.