Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the
Tensor Nuclear Norm Under Strict Complementarity
- URL: http://arxiv.org/abs/2308.01677v1
- Date: Thu, 3 Aug 2023 10:31:22 GMT
- Title: Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the
Tensor Nuclear Norm Under Strict Complementarity
- Authors: Dan Garber, Atara Kaplan
- Abstract summary: We consider convex relaxations for recovering low-rank tensors based on constrained minimization over a ball induced by the tensor nuclear norm.
Under a strict complementarity condition, both the convergence rate and per-iteration runtime of standard gradient methods may improve dramatically.
- Score: 19.930021245647705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider convex relaxations for recovering low-rank tensors based on
constrained minimization over a ball induced by the tensor nuclear norm,
recently introduced in \cite{tensor_tSVD}. We build on a recent line of results
that considered convex relaxations for the recovery of low-rank matrices and
established that under a strict complementarity condition (SC), both the
convergence rate and per-iteration runtime of standard gradient methods may
improve dramatically. We develop the appropriate strict complementarity
condition for the tensor nuclear norm ball and obtain the following main
results under this condition: 1. When the objective to minimize is of the form
$f(\mathbf{X})=g(\mathbf{A}\mathbf{X})+\langle\mathbf{C},\mathbf{X}\rangle$, where $g$ is strongly convex and
$\mathbf{A}$ is a linear map (e.g., least squares), a quadratic growth bound holds,
which implies linear convergence rates for standard projected gradient methods,
despite the fact that $f$ need not be strongly convex. 2. For a smooth
objective function, when initialized in certain proximity of an optimal
solution which satisfies SC, standard projected gradient methods only require
SVD computations (for projecting onto the tensor nuclear norm ball) of rank
that matches the tubal rank of the optimal solution. In particular, when the
tubal rank is constant, this implies nearly linear (in the size of the tensor)
runtime per iteration, as opposed to super-linear runtime without further assumptions.
3. For a nonsmooth objective function which admits a popular smooth
saddle-point formulation, we derive similar results to the latter for the
well-known extragradient method. An additional contribution, which may be of
independent interest, is the rigorous extension of many basic results regarding
tensors of arbitrary order, which were previously obtained only for third-order
tensors.
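To make the setting concrete, below is a minimal NumPy sketch (illustrative, not taken from the paper) of projected gradient descent over the tensor nuclear norm ball of a third-order tensor, using the t-SVD-based tensor nuclear norm with the $1/n_3$ normalization; all function and variable names are hypothetical. The optional rank argument reflects result 2 above: near an optimal solution satisfying SC, a thin SVD whose rank matches the tubal rank suffices for the projection.

    import numpy as np

    def project_l1_ball(v, radius):
        # Euclidean projection of a nonnegative vector onto {u >= 0 : sum(u) <= radius}.
        if v.sum() <= radius:
            return v
        u = np.sort(v)[::-1]
        cssv = np.cumsum(u) - radius
        rho = np.nonzero(u * np.arange(1, len(u) + 1) > cssv)[0][-1]
        theta = cssv[rho] / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def project_tnn_ball(Y, tau, rank=None):
        # Euclidean projection of an n1 x n2 x n3 tensor onto the ball
        # {X : (1/n3) * sum_k ||fft(X, axis=2)[:, :, k]||_* <= tau}.
        n1, n2, n3 = Y.shape
        Yf = np.fft.fft(Y, axis=2)
        U, S, Vh = [], [], []
        for k in range(n3):
            u, s, vh = np.linalg.svd(Yf[:, :, k], full_matrices=False)
            if rank is not None:  # thin SVD suffices near a solution satisfying SC
                u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]
            U.append(u); S.append(s); Vh.append(vh)
        # Project the stacked singular values onto the l1 ball of radius tau * n3
        # (the 1/n3 factors in the norm and in Parseval's identity cancel).
        s_proj = project_l1_ball(np.concatenate(S), tau * n3)
        Xf = np.zeros_like(Yf)
        idx = 0
        for k in range(n3):
            r = len(S[k])
            Xf[:, :, k] = (U[k] * s_proj[idx:idx + r]) @ Vh[k]
            idx += r
        return np.real(np.fft.ifft(Xf, axis=2))

    def projected_gradient(grad_f, X0, tau, step, iters, rank=None):
        # Standard projected gradient descent over the tensor nuclear norm ball.
        X = X0.copy()
        for _ in range(iters):
            X = project_tnn_ball(X - step * grad_f(X), tau, rank=rank)
        return X

For the least-squares case in result 1, grad_f(X) would be the adjoint of the measurement operator applied to the residual; with constant tubal rank, the per-iteration cost of this sketch is dominated by the FFTs and the thin SVDs of the frontal slices in the Fourier domain.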
Related papers
- Low-Rank Tensor Learning by Generalized Nonconvex Regularization [25.115066273660478]
We study the problem of low-rank tensor learning, where only a few samples of the underlying tensor are observed.
A family of nonconvex regularization functions is employed to characterize the low-rankness of the underlying tensor.
A majorization-minimization algorithm is proposed to solve the resulting model (a generic majorization step is sketched after this entry).
arXiv Detail & Related papers (2024-10-24T03:33:20Z)
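As a generic illustration of the majorization-minimization idea (the specific surrogate used in that paper may differ): for a concave, increasing penalty $\phi$ applied to the singular values, concavity gives the linear majorizer
$$\phi\big(\sigma_i(\mathcal{X})\big) \le \phi\big(\sigma_i(\mathcal{X}_t)\big) + \phi'\big(\sigma_i(\mathcal{X}_t)\big)\,\big(\sigma_i(\mathcal{X}) - \sigma_i(\mathcal{X}_t)\big),$$
so each MM step minimizes the data-fit term plus the reweighted spectral penalty $\sum_i w_i^t\,\sigma_i(\mathcal{X})$ with weights $w_i^t = \phi'\big(\sigma_i(\mathcal{X}_t)\big)$, and the majorization property guarantees monotone descent of the original nonconvex objective.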
- Large Deviations and Improved Mean-squared Error Rates of Nonlinear SGD: Heavy-tailed Noise and Power of Symmetry [47.653744900375855]
We study large deviations and mean-squared error (MSE) guarantees for a general framework of nonlinear stochastic gradient methods in the online setting.
We provide strong results for a broad range of step-sizes in the presence of heavy-tailed noise with a symmetric density function (a generic nonlinear-SGD update is sketched after this entry).
arXiv Detail & Related papers (2024-10-21T04:50:57Z)
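The nonlinearity in such frameworks is typically a sign, clipping, or normalization map applied to the stochastic gradient. A minimal sketch under that assumption (not the specific estimators or step-size schedules analyzed in that paper):

    import numpy as np

    def nonlinear_sgd(grad_oracle, x0, step_sizes, nonlinearity):
        # Nonlinear SGD: the stochastic gradient is passed through a nonlinearity
        # before the update, which tames heavy-tailed gradient noise.
        x = x0.copy()
        for t, eta in enumerate(step_sizes):
            x = x - eta * nonlinearity(grad_oracle(x, t))
        return x

    # Illustrative choices of nonlinearity:
    def clip(g, c=1.0):
        n = np.linalg.norm(g)
        return g if n <= c else g * (c / n)

    def normalize(g):
        return g / (np.linalg.norm(g) + 1e-12)

    sign = np.sign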
- Low-Rank Tensor Completion via Novel Sparsity-Inducing Regularizers [30.920908325825668]
To alleviate the bias of the l1-norm in the low-rank tensor completion problem, nonconvex surrogates/regularizers have been suggested.
These regularizers are applied to nuclear-norm-based low-rank restoration, and efficient algorithms based on the method of multipliers are proposed.
arXiv Detail & Related papers (2023-10-10T01:00:13Z)
- A Novel Tensor Factorization-Based Method with Robustness to Inaccurate Rank Estimation [9.058215418134209]
We propose a new tensor norm with a dual low-rank constraint, which utilizes the low-rank prior and rank information at the same time.
It is proven theoretically that the resulting tensor completion model can effectively avoid performance degradation caused by inaccurate rank estimation.
Based on this, the total cost at each iteration of the optimization algorithm is reduced to $\mathcal{O}(n^3\log n + kn^3)$ from the $\mathcal{O}(n^4)$ achieved with standard methods.
arXiv Detail & Related papers (2023-05-19T06:26:18Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with acceleration (a plain extragradient step is sketched after this entry).
arXiv Detail & Related papers (2022-06-17T06:10:20Z)
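The plain (non-accelerated) extragradient iteration that such methods build on, written for a bilinear coupling $H(\mathbf{x},\mathbf{y}) = \mathbf{x}^\top \mathbf{B}\mathbf{y}$, can be sketched as follows (illustrative code, not the AG-EG algorithm itself):

    import numpy as np

    def extragradient(grad_F, grad_G, B, x, y, eta, iters):
        # Saddle-point solver for min_x max_y F(x) + x^T B y - G(y):
        # step to a midpoint, then update using the midpoint's gradients.
        for _ in range(iters):
            gx, gy = grad_F(x) + B @ y, B.T @ x - grad_G(y)
            x_mid, y_mid = x - eta * gx, y + eta * gy      # extrapolation step
            gx, gy = grad_F(x_mid) + B @ y_mid, B.T @ x_mid - grad_G(y_mid)
            x, y = x - eta * gx, y + eta * gy              # update step
        return x, y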
- High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails [55.561406656549686]
We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails.
We show that a combination of gradient clipping, momentum, and normalized gradient descent converges to critical points with high probability, with the best-known iteration complexity for smooth losses (a generic update of this form is sketched after this entry).
arXiv Detail & Related papers (2021-06-28T00:17:01Z)
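A generic update combining clipping, momentum, and normalization (illustrative only; the precise variant and parameter choices in that paper may differ):

    import numpy as np

    def clipped_normalized_momentum_sgd(grad_oracle, x0, eta, beta, clip_level, iters):
        # Clip the stochastic gradient, average it with momentum, and step in the
        # normalized direction of the average; clipping controls heavy-tailed noise.
        x, m = x0.copy(), np.zeros_like(x0)
        for t in range(iters):
            g = grad_oracle(x, t)
            g_norm = np.linalg.norm(g)
            if g_norm > clip_level:
                g = g * (clip_level / g_norm)
            m = beta * m + (1.0 - beta) * g
            x = x - eta * m / (np.linalg.norm(m) + 1e-12)
        return x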
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with an unfavorable dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
Non-uniform refinement of objective functions leads to Non-uniform Smoothness (NS) and a Non-uniform Lojasiewicz inequality (NL).
The new definitions inspire new geometry-aware first-order methods that converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bounds.
arXiv Detail & Related papers (2021-05-13T04:23:07Z)
- A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration [23.308822706415867]
We establish blockwise tensor perturbation bounds for the higher-order orthogonal iteration (HOOI).
We show that the upper bounds for mode-$k$ singular subspace estimation are unilateral and converge linearly to a quantity characterized by the blockwise errors of the perturbation and the signal strength.
We prove that one-step HOOI (i.e., HOOI with only a single iteration) is also optimal in terms of tensor reconstruction and can be used to lower the computational cost (a basic HOOI loop is sketched after this entry).
arXiv Detail & Related papers (2020-08-06T03:01:28Z)
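For reference, a basic HOOI implementation for a third-order tensor (standard textbook form, not that paper's estimator or analysis); setting iters=1 corresponds to the one-step variant:

    import numpy as np

    def unfold(T, mode):
        # Mode-k unfolding: move mode k to the front and flatten the rest.
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def mode_multiply(T, M, mode):
        # Multiply tensor T by matrix M along the given mode.
        return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

    def hooi(T, ranks, iters=10):
        # Higher-order orthogonal iteration for a Tucker approximation.
        # Initialize factors with HOSVD (leading singular vectors of each unfolding).
        U = [np.linalg.svd(unfold(T, k), full_matrices=False)[0][:, :ranks[k]]
             for k in range(3)]
        for _ in range(iters):
            for k in range(3):
                Y = T
                for j in range(3):
                    if j != k:
                        Y = mode_multiply(Y, U[j].T, j)   # project the other modes
                U[k] = np.linalg.svd(unfold(Y, k), full_matrices=False)[0][:, :ranks[k]]
        core = T
        for j in range(3):
            core = mode_multiply(core, U[j].T, j)
        return core, U   # reconstruction: core x_1 U[0] x_2 U[1] x_3 U[2]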
- Tensor completion via nonconvex tensor ring rank minimization with guaranteed convergence [16.11872681638052]
In recent studies, the tensor ring (TR) rank has shown high effectiveness in tensor completion.
A recently proposed TR rank surrogate is based on a weighted sum of nuclear norms, which penalizes all singular values equally.
In this paper, we propose to use the logdet-based function as a nonconvex smooth relaxation of the TR rank (a common logdet-type surrogate is shown after this entry).
arXiv Detail & Related papers (2020-05-14T03:13:17Z)
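One common logdet-type surrogate for the rank of a matrix (factor or unfolding) $\mathbf{X}$, shown here only as a generic example since the exact function used in that paper may differ, is
$$\tfrac{1}{2}\log\det\big(\mathbf{X}\mathbf{X}^\top + \varepsilon\mathbf{I}\big) = \tfrac{1}{2}\sum_i \log\big(\sigma_i(\mathbf{X})^2 + \varepsilon\big), \qquad \varepsilon > 0,$$
which is smooth, is concave in $\mathbf{X}\mathbf{X}^\top$, and penalizes large singular values far less aggressively than the nuclear norm $\sum_i \sigma_i(\mathbf{X})$.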