Related papers: Global Convergence of Adaptive Sensing for Principal Eigenvector Estimation

Global Convergence of Adaptive Sensing for Principal Eigenvector Estimation

URL: http://arxiv.org/abs/2505.10882v1
Date: Fri, 16 May 2025 05:41:11 GMT
Title: Global Convergence of Adaptive Sensing for Principal Eigenvector Estimation
Authors: Alex Saad-Falcon, Brighton Ancelin, Justin Romberg,
Abstract summary: Subspace tracking algorithms like Oja's offer more efficient alternatives but typically require full-dimensional observations.<n>We prove this adaptive sensing approach achieves global convergence in the presence of noise.<n>Results have important implications for applications where acquiring full-dimensional samples is challenging or costly.
Score: 9.170594803531866
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper addresses the challenge of efficient principal component analysis (PCA) in high-dimensional spaces by analyzing a compressively sampled variant of Oja's algorithm with adaptive sensing. Traditional PCA methods incur substantial computational costs that scale poorly with data dimensionality, whereas subspace tracking algorithms like Oja's offer more efficient alternatives but typically require full-dimensional observations. We analyze a variant where, at each iteration, only two compressed measurements are taken: one in the direction of the current estimate and one in a random orthogonal direction. We prove that this adaptive sensing approach achieves global convergence in the presence of noise when tracking the leading eigenvector of a datastream with eigengap $\Delta=\lambda_1-\lambda_2$. Our theoretical analysis demonstrates that the algorithm experiences two phases: (1) a warmup phase requiring $O(\lambda_1\lambda_2d^2/\Delta^2)$ iterations to achieve a constant-level alignment with the true eigenvector, followed by (2) a local convergence phase where the sine alignment error decays at a rate of $O(\lambda_1\lambda_2d^2/\Delta^2 t)$ for iterations $t$. The guarantee aligns with existing minimax lower bounds with an added factor of $d$ due to the compressive sampling. This work provides the first convergence guarantees in adaptive sensing for subspace tracking with noise. Our proof technique is also considerably simpler than those in prior works. The results have important implications for applications where acquiring full-dimensional samples is challenging or costly.

Related papers

MGDA Converges under Generalized Smoothness, Provably [27.87166415148172]
Multi-objective optimization (MOO) is receiving more attention in various fields such as multi-task learning.<n>Recent works provide some effective algorithms with theoretical analysis but they are limited by the standard $L$-smooth or bounded-gradient assumptions.<n>We study a more general and realistic class of generalized $ell$-smooth loss functions, where $ell$ is a general non-decreasing function of gradient norm.
arXiv Detail & Related papers (2024-05-29T18:36:59Z)
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient [40.22217106270146]
Variance reduction techniques are designed to decrease the sampling variance, thereby accelerating convergence rates of first-order (FO) and zeroth-order (ZO) optimization methods. In composite optimization problems, ZO methods encounter an additional variance called the coordinate-wise variance, which stems from the random estimation. This paper proposes the Zeroth-order Proximal Double Variance Reduction (ZPDVR) method, which utilizes the averaging trick to reduce both sampling and coordinate-wise variances.
arXiv Detail & Related papers (2024-05-28T02:27:53Z)
Transformers as Support Vector Machines [54.642793677472724]
We establish a formal equivalence between the optimization geometry of self-attention and a hard-margin SVM problem. We characterize the implicit bias of 1-layer transformers optimized with gradient descent. We believe these findings inspire the interpretation of transformers as a hierarchy of SVMs that separates and selects optimal tokens.
arXiv Detail & Related papers (2023-08-31T17:57:50Z)
Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of emphcon unconstrained min-max optimization problems. We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $epsilon$-saddle point within $O(epsilon-2/3)$ in terms of a restricted function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z)
Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data. We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data [7.513100214864646]
We show the worst-case complexity of convergence $tildeO(t-1/4)$ and MoreautildeO(vareps-4)$ for smooth non- optimization. We obtain first online nonnegative matrix factorization algorithms for dependent data based on projected gradient methods with adaptive step sizes and optimal convergence.
arXiv Detail & Related papers (2022-03-29T17:59:10Z)
DoCoM: Compressed Decentralized Optimization with Near-Optimal Sample Complexity [25.775517797956237]
This paper proposes the Doubly Compressed Momentum-assisted tracking algorithm $ttDoCoM$ for communication. We show that our algorithm outperforms several state-of-the-art algorithms in practice.
arXiv Detail & Related papers (2022-02-01T07:27:34Z)
Stochastic regularized majorization-minimization with weakly convex and multi-convex surrogates [0.0]
We show that the first optimality gap of the proposed algorithm decays at the rate of the expected loss for various methods under nontens dependent data setting. We obtain first convergence point for various methods under nontens dependent data setting.
arXiv Detail & Related papers (2022-01-05T15:17:35Z)
Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors [59.33977545294148]
We show that $O(k log L)$ samples suffice to guarantee that the signal is close to any vector that minimizes an amplitude-based empirical loss function. We adapt this result to sparse phase retrieval, and show that $O(s log n)$ samples are sufficient for a similar guarantee when the underlying signal is $s$-sparse and $n$-dimensional.
arXiv Detail & Related papers (2021-06-29T12:49:54Z)
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step. Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for bound non-dimensional optimization. We demonstrate two popular deep learning methods on the empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
Non-Adaptive Adaptive Sampling on Turnstile Streams [57.619901304728366]
We give the first relative-error algorithms for column subset selection, subspace approximation, projective clustering, and volume on turnstile streams that use space sublinear in $n$. Our adaptive sampling procedure has a number of applications to various data summarization problems that either improve state-of-the-art or have only been previously studied in the more relaxed row-arrival model.
arXiv Detail & Related papers (2020-04-23T05:00:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.