Langevin dynamics for high-dimensional optimization: the case of multi-spiked tensor PCA
- URL: http://arxiv.org/abs/2408.06401v2
- Date: Thu, 19 Dec 2024 09:30:05 GMT
- Title: Langevin dynamics for high-dimensional optimization: the case of multi-spiked tensor PCA
- Authors: Gérard Ben Arous, Cédric Gerbelot, Vanessa Piccolo
- Abstract summary: We show that the sample complexity required for recovering the spike associated with the largest SNR matches the well-known algorithmic threshold for the single-spike case.
As a key step, we provide a detailed characterization of the trajectory and interactions of low-dimensional projections that capture the high-dimensional dynamics.
- Score: 8.435118770300999
- Abstract: We study nonconvex optimization in high dimensions through Langevin dynamics, focusing on the multi-spiked tensor PCA problem. This tensor estimation problem involves recovering $r$ hidden signal vectors (spikes) from noisy Gaussian tensor observations using maximum likelihood estimation. We study the number of samples required for Langevin dynamics to efficiently recover the spikes and determine the necessary separation condition on the signal-to-noise ratios (SNRs) for exact recovery, distinguishing the cases $p \ge 3$ and $p=2$, where $p$ denotes the order of the tensor. In particular, we show that the sample complexity required for recovering the spike associated with the largest SNR matches the well-known algorithmic threshold for the single-spike case, while this threshold degrades when recovering all $r$ spikes. As a key step, we provide a detailed characterization of the trajectory and interactions of low-dimensional projections that capture the high-dimensional dynamics.
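To make the setup concrete, below is a minimal numpy sketch of Langevin dynamics for this model, assuming $p=3$, the normalization $Y = \sum_i \lambda_i v_i^{\otimes 3} + W/\sqrt{N}$ with orthonormal spikes, and a QR retraction in place of the paper's manifold dynamics; all names and parameter values are illustrative.

```python
import numpy as np

def multi_spike_tensor(N, r, snrs, rng):
    """Sample Y = sum_i snrs[i] * v_i^{(x)3} + W / sqrt(N), spikes orthonormal."""
    V = np.linalg.qr(rng.standard_normal((N, r)))[0]
    Y = sum(l * np.einsum('i,j,k->ijk', V[:, a], V[:, a], V[:, a])
            for a, l in enumerate(snrs))
    return Y + rng.standard_normal((N, N, N)) / np.sqrt(N), V

def langevin_tensor_pca(Y, r, beta=50.0, eta=1e-3, steps=5000, seed=0):
    """Noisy gradient ascent on sum_a <Y, x_a^{(x)3}> with a QR retraction."""
    rng = np.random.default_rng(seed)
    X = np.linalg.qr(rng.standard_normal((Y.shape[0], r)))[0]  # random start
    for _ in range(steps):
        G = 3.0 * np.einsum('ijk,ia,ja->ka', Y, X, X)  # ascent direction (gradient for symmetric Y)
        X = X + eta * G + np.sqrt(2.0 * eta / beta) * rng.standard_normal(X.shape)
        X = np.linalg.qr(X)[0]                         # retraction to an orthonormal frame
    return X

rng = np.random.default_rng(1)
Y, V = multi_spike_tensor(N=40, r=2, snrs=[8.0, 4.0], rng=rng)
X = langevin_tensor_pca(Y, r=2)
print(np.abs(V.T @ X))  # overlaps; close to a permutation matrix means recovery
```

The printed overlap matrix being close to a permutation matrix corresponds to the exact-recovery events studied in the paper.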
Related papers
- Guaranteed Nonconvex Low-Rank Tensor Estimation via Scaled Gradient Descent [4.123899820318987]
This paper develops a scaled gradient descent (ScaledGD) algorithm to directly estimate the tensor factors.
In theory, ScaledGD achieves linear convergence at a constant rate that is independent of the condition number of the ground truth low-rank tensor.
Numerical examples are provided to demonstrate the efficacy of ScaledGD in accelerating the convergence rate of ill-conditioned low-rank tensor estimation.
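As a quick illustration of the preconditioning idea (shown here in the matrix case, not the paper's tensor-factor algorithm), each factor's gradient is scaled by the inverse Gram matrix of the other factor, which is what removes the condition-number dependence:

```python
import numpy as np

def scaled_gd(Y, rank, eta=0.5, steps=200):
    """ScaledGD sketch in the matrix case, Y ~ L @ R.T: precondition each
    factor's gradient by the inverse Gram matrix of the other factor."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)   # spectral initialization
    L = U[:, :rank] * np.sqrt(s[:rank])
    R = Vt[:rank].T * np.sqrt(s[:rank])
    for _ in range(steps):
        E = L @ R.T - Y                                # residual
        L_new = L - eta * E @ R @ np.linalg.inv(R.T @ R)
        R = R - eta * E.T @ L @ np.linalg.inv(L.T @ L)
        L = L_new
    return L, R

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3)) @ np.diag([100.0, 10.0, 1.0]) @ rng.standard_normal((3, 40))
L, R = scaled_gd(A, rank=3)
print(np.linalg.norm(L @ R.T - A) / np.linalg.norm(A))  # tiny, despite condition number 100
```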
arXiv Detail & Related papers (2025-01-03T08:26:01Z)
- Permutation recovery of spikes in noisy high-dimensional tensor estimation [8.435118770300999]
We study the dynamics of gradient flow in high dimensions for the multi-spiked tensor problem.
Our work builds on our companion paper [Ben Arous, Gerbelot, Piccolo], which determines the sample complexity and the separation conditions on the SNRs necessary for exact recovery.
arXiv Detail & Related papers (2024-12-19T08:59:49Z)
- Stochastic gradient descent in high dimensions for multi-spiked tensor PCA [8.435118770300999]
We study the dynamics in high dimensions of online gradient descent for the multi-spiked tensor model.
We find that the spikes are recovered sequentially in a process we term "sequential elimination".
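A minimal sketch of this effect, assuming $p=3$, a fresh tensor sample at every step, and a QR retraction keeping the estimates orthonormal (an assumption of this sketch, not the paper's exact dynamics):

```python
import numpy as np

def online_sgd_multispike(N=30, r=2, snrs=(10.0, 5.0), eta=0.02, steps=4000, seed=0):
    """Online SGD for the multi-spiked order-3 tensor model: every step uses a
    fresh sample Y_t = sum_i snrs[i] * v_i^{(x)3} + W_t / sqrt(N)."""
    rng = np.random.default_rng(seed)
    V = np.linalg.qr(rng.standard_normal((N, r)))[0]              # hidden spikes
    signal = sum(l * np.einsum('i,j,k->ijk', V[:, a], V[:, a], V[:, a])
                 for a, l in enumerate(snrs))
    X = np.linalg.qr(rng.standard_normal((N, r)))[0]              # random start
    for _ in range(steps):
        Y = signal + rng.standard_normal((N, N, N)) / np.sqrt(N)  # fresh sample
        G = 3.0 * np.einsum('ijk,ia,ja->ka', Y, X, X)
        X = np.linalg.qr(X + eta * G)[0]                          # step + retraction
    return np.abs(V.T @ X)

print(online_sgd_multispike())  # overlap matrix; the larger-SNR spike aligns first
```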
arXiv Detail & Related papers (2024-10-23T17:20:41Z)
- Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider stochastic optimization problems with heavy-tailed noise with structured density.
We show that it is possible to get faster rates of convergence than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$, when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
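The mechanism behind such results can be sketched in a few lines: clipping bounds the update variance even when the raw noise has no second moment. The quadratic objective and symmetric Student-t noise (finite moments only of order below 1.5) are illustrative choices:

```python
import numpy as np

def clip(g, c):
    """Rescale g to norm at most c."""
    n = np.linalg.norm(g)
    return g if n <= c else g * (c / n)

def clipped_sgd(grad_oracle, x0, eta=0.01, c=1.0, steps=20_000, seed=0):
    """Clipped SGD: bounded update variance under heavy-tailed gradient noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x -= eta * clip(grad_oracle(x, rng), c)
    return x

# quadratic objective; Student-t noise with df=1.5 has infinite variance
oracle = lambda x, rng: x + rng.standard_t(df=1.5, size=x.shape)
print(clipped_sgd(oracle, x0=np.full(5, 10.0)))  # settles near the minimizer at 0
```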
arXiv Detail & Related papers (2023-11-07T17:39:17Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent algorithm and provide an improved analysis under a more nuanced condition on the noise of the stochastic gradients.
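A one-pass sketch of clipped streaming estimation for the simplest case, mean estimation; the $1/t$ step size and fixed clipping level are illustrative assumptions, not the paper's schedule:

```python
import numpy as np

def streaming_clipped_mean(samples, c=2.0):
    """One pass: treat each sample as a stochastic gradient of
    0.5 * ||x - mu||^2 and apply a clipped Polyak-style update."""
    x = np.zeros_like(samples[0])
    for t, s in enumerate(samples, start=1):
        g = x - s                            # stochastic gradient at x
        n = np.linalg.norm(g)
        if n > c:
            g = g * (c / n)                  # clip heavy-tailed updates
        x = x - g / t                        # 1/t step size
    return x

rng = np.random.default_rng(0)
data = 3.0 + rng.standard_t(df=2.1, size=(20_000, 4))  # barely finite variance
print(streaming_clipped_mean(data))  # close to the true mean (3, 3, 3, 3)
```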
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors [59.33977545294148]
We show that $O(k \log L)$ samples suffice to guarantee that the signal is close to any vector that minimizes an amplitude-based empirical loss function.
We adapt this result to sparse phase retrieval, and show that $O(s \log n)$ samples are sufficient for a similar guarantee when the underlying signal is $s$-sparse and $n$-dimensional.
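A sketch of (sub)gradient descent on the amplitude-based empirical loss in the unstructured case; the warm start below is an assumption standing in for the careful initialization such guarantees are typically paired with:

```python
import numpy as np

def amplitude_loss_descent(A, y, x0, eta=0.5, steps=500):
    """(Sub)gradient descent on f(x) = mean_i (|a_i^T x| - y_i)^2."""
    x = x0.copy()
    m = len(y)
    for _ in range(steps):
        z = A @ x
        x = x - eta * (A.T @ ((np.abs(z) - y) * np.sign(z))) / m
    return x

rng = np.random.default_rng(0)
n, m = 50, 400
x_star = rng.standard_normal(n)
x_star /= np.linalg.norm(x_star)
A = rng.standard_normal((m, n))
y = np.abs(A @ x_star)                          # phaseless measurements
d = rng.standard_normal(n)
x0 = x_star + 0.3 * d / np.linalg.norm(d)       # warm start (assumed)
x = amplitude_loss_descent(A, y, x0)
print(min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star)))  # ~0 up to global sign
```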
arXiv Detail & Related papers (2021-06-29T12:49:54Z)
- Lattice partition recovery with dyadic CART [79.96359947166592]
We study piece-wise constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice.
Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, de-noising and estimation have been studied extensively in the statistical and signal processing literature.
In this paper we consider instead the problem of partition recovery, i.e., of estimating the partition of the lattice induced by the constancy regions of the unknown signal.
We prove that a DCART-based procedure consistently estimates the underlying partition at a rate of order $\sigma^2 k^*$
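A minimal 1-D illustration of the DCART recursion, assuming the penalized cost RSS + $\lambda \cdot$(number of pieces) and dyadic splits; the paper works over $d$-dimensional lattices, but the recursion has the same shape:

```python
import numpy as np

def dyadic_cart_1d(y, lam):
    """Recursively split dyadic intervals, keeping a split only when it
    lowers the penalized cost RSS + lam * (number of pieces)."""
    def fit(lo, hi):
        seg = y[lo:hi]
        cost_leaf = np.sum((seg - seg.mean()) ** 2) + lam
        if hi - lo == 1:
            return cost_leaf, [(lo, hi)]
        mid = lo + (hi - lo) // 2                    # dyadic midpoint
        cl, pl = fit(lo, mid)
        cr, pr = fit(mid, hi)
        if cl + cr < cost_leaf:
            return cl + cr, pl + pr
        return cost_leaf, [(lo, hi)]
    return fit(0, len(y))[1]

rng = np.random.default_rng(0)
signal = np.repeat([0.0, 3.0, -2.0, 1.0], 64)        # piecewise constant, N = 256
y = signal + 0.5 * rng.standard_normal(signal.size)
print(dyadic_cart_1d(y, lam=3.0))                    # the four constancy intervals
```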
arXiv Detail & Related papers (2021-05-27T23:41:01Z)
- Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
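A toy sketch of the sampling setup (not of the conductance analysis): SGLD on a 1-D double well, i.e. a non-log-concave target, with an additive Gaussian perturbation of $\nabla U$ standing in for minibatch gradient noise:

```python
import numpy as np

def sgld_double_well(eta=5e-3, steps=200_000, noise=0.5, seed=0):
    """SGLD targeting pi(x) ~ exp(-U(x)) with the double well
    U(x) = (x^2 - 1)^2; the extra Gaussian term added to U'(x) stands
    in for minibatch gradient noise."""
    rng = np.random.default_rng(seed)
    x, xs = 0.0, np.empty(steps)
    for t in range(steps):
        g = 4.0 * x * (x * x - 1.0) + noise * rng.standard_normal()  # noisy U'(x)
        x += -eta * g + np.sqrt(2.0 * eta) * rng.standard_normal()
        xs[t] = x
    return xs

xs = sgld_double_well()
print((xs > 0).mean())  # roughly 0.5: the chain keeps crossing between modes
```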
arXiv Detail & Related papers (2020-10-19T15:23:18Z)
- Spatio-Temporal Tensor Sketching via Adaptive Sampling [15.576219771198389]
We propose SkeTenSmooth, a novel tensor factorization framework that uses adaptive sampling to compress tensor slices in a temporally streaming fashion.
Experiments on the New York City Yellow Taxi data show that SkeTenSmooth greatly reduces the memory cost and random sampling rate.
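SkeTenSmooth's actual sampling rule is not reproduced here; the sketch below only illustrates the generic adaptive-sampling idea it builds on, keeping a streaming slice when it deviates enough from the last stored one so that smooth stretches of the stream compress away:

```python
import numpy as np

def adaptive_slice_sampler(slices, tol=0.1):
    """Keep a streaming tensor slice only when it differs enough (relative
    norm) from the last kept slice; smooth stretches compress away."""
    kept, last = [], None
    for t, s in enumerate(slices):
        if last is None or np.linalg.norm(s - last) > tol * np.linalg.norm(last):
            kept.append((t, s))          # store the slice (or a sketch of it)
            last = s
    return kept

rng = np.random.default_rng(0)
base = rng.standard_normal((20, 20))
stream = [base + 0.01 * rng.standard_normal((20, 20)) for _ in range(90)]
stream += [-base + 0.01 * rng.standard_normal((20, 20)) for _ in range(10)]  # regime change
print([t for t, _ in adaptive_slice_sampler(stream)])  # only slices 0 and 90 survive
```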
arXiv Detail & Related papers (2020-06-21T23:55:10Z)
- The Heavy-Tail Phenomenon in SGD [7.366405857677226]
We show that depending on the structure of the Hessian of the loss at the minimum, the SGD iterates will converge to a heavy-tailed stationary distribution.
We translate our results into insights about the behavior of SGD in deep learning.
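The phenomenon is easy to reproduce: single-sample SGD on 1-D Gaussian linear regression is a Kesten-type recursion whose stationary law is heavy-tailed at moderately large step sizes, even though all inputs are Gaussian. The parameters below are illustrative:

```python
import numpy as np

def sgd_tail_demo(eta=0.5, steps=200_000, seed=0):
    """Single-sample SGD on 1-D linear regression: the iterates follow the
    Kesten-type recursion x <- (1 - eta * a^2) x + eta * a * eps, whose
    stationary law is heavy-tailed for large enough eta."""
    rng = np.random.default_rng(seed)
    x, xs = 0.0, np.empty(steps)
    for t in range(steps):
        a, eps = rng.standard_normal(2)
        x = (1.0 - eta * a * a) * x + eta * a * eps
        xs[t] = x
    return xs

xs = sgd_tail_demo()
kurt = ((xs - xs.mean()) ** 4).mean() / xs.var() ** 2 - 3.0
print(kurt)  # large excess kurtosis: heavy-tailed iterates from Gaussian data
```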
arXiv Detail & Related papers (2020-06-08T16:43:56Z)
- Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping [69.9674326582747]
We propose a new accelerated first-order method called clipped-SSTM for smooth convex optimization with heavy-tailed distributed noise in gradients.
We prove new complexity bounds that outperform state-of-the-art results in this case.
We derive the first non-trivial high-probability complexity bounds for SGD with clipping without light-tails assumption on the noise.
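A sketch in the similar-triangles template with clipped stochastic gradients, following the shape of clipped-SSTM; the fixed clipping level and the small noise scale are simplifications of the paper's iteration-dependent choices:

```python
import numpy as np

def clipped_sstm(grad, x0, L, steps=300, a=1.0, clip_level=1.0, seed=0):
    """Similar-triangles (Nesterov-type) iterations with clipped stochastic
    gradients, using alpha_k = (k + 2) / (2 a L) as in clipped-SSTM."""
    rng = np.random.default_rng(seed)
    y = np.array(x0, dtype=float)
    z = y.copy()
    A = 0.0
    for k in range(steps):
        alpha = (k + 2) / (2.0 * a * L)
        A_next = A + alpha
        x = (A * y + alpha * z) / A_next              # extrapolation point
        g = grad(x, rng)
        n = np.linalg.norm(g)
        if n > clip_level:
            g = g * (clip_level / n)                  # clip the gradient
        z = z - alpha * g
        y = (A * y + alpha * z) / A_next
        A = A_next
    return y

Q = np.diag(np.linspace(1.0, 5.0, 10))               # smooth convex quadratic
grad = lambda x, rng: Q @ x + 0.01 * rng.standard_t(df=2.1, size=x.shape)
y = clipped_sstm(grad, x0=np.full(10, 5.0), L=5.0)
print(np.linalg.norm(y))  # far below the starting distance of ~15.8
```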
arXiv Detail & Related papers (2020-05-21T17:05:27Z)