Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery
- URL: http://arxiv.org/abs/2506.20533v2
- Date: Sun, 29 Jun 2025 13:53:45 GMT
- Title: Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery
- Authors: Gilad Lerman, Kang Li, Tyler Maunu, Teng Zhang
- Abstract summary: Iteratively Reweighted Least Squares (IRLS) is an elegant and empirically effective approach to robust subspace estimation. This paper establishes that, under deterministic conditions, a variant of IRLS with dynamic smoothing regularization converges linearly to the underlying subspace from any initialization. We extend these guarantees to affine subspace estimation, a setting that lacks prior recovery theory.
- Score: 19.37238379592233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust subspace estimation is fundamental to many machine learning and data analysis tasks. Iteratively Reweighted Least Squares (IRLS) is an elegant and empirically effective approach to this problem, yet its theoretical properties remain poorly understood. This paper establishes that, under deterministic conditions, a variant of IRLS with dynamic smoothing regularization converges linearly to the underlying subspace from any initialization. We extend these guarantees to affine subspace estimation, a setting that lacks prior recovery theory. Additionally, we illustrate the practical benefits of IRLS through an application to low-dimensional neural network training. Our results provide the first global convergence guarantees for IRLS in robust subspace recovery and, more broadly, for nonconvex IRLS on a Riemannian manifold.
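As a concrete illustration of the scheme the abstract describes, the following is a minimal sketch of IRLS for robust subspace recovery with a dynamically shrinking smoothing parameter. The initialization, weight formula, and smoothing schedule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def irls_subspace(X, d, n_iter=100, delta0=1.0, shrink=0.7):
    """Sketch of IRLS for robust subspace recovery.

    X : (n, D) data matrix, one sample per row.
    d : target subspace dimension.
    The dynamic smoothing schedule (delta0, shrink) is an illustrative
    assumption, not the schedule analyzed in the paper.
    """
    n, D = X.shape
    # PCA initialization (the abstract's guarantee holds from any initialization).
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:d].T                      # (D, d) orthonormal basis of the current subspace
    delta = delta0
    for _ in range(n_iter):
        residual = X - (X @ V) @ V.T  # components orthogonal to the current subspace
        dist = np.linalg.norm(residual, axis=1)
        w = 1.0 / np.maximum(dist, delta)    # smoothed inverse-distance weights
        # Weighted PCA step: top-d eigenvectors of the weighted covariance.
        C = (X * w[:, None]).T @ X
        _, eigvecs = np.linalg.eigh(C)
        V = eigvecs[:, -d:]
        delta *= shrink               # dynamic smoothing: shrink the regularizer
    return V
```

Each iteration downweights points far from the current subspace and refits the subspace by weighted PCA; shrinking the regularizer recovers the unsmoothed robust objective in the limit.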
Related papers
- Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality [52.906438147288256]
We show that our algorithm can identify the globally optimal reward and policy under certain neural network structures. This is the first IRL algorithm with a non-asymptotic convergence guarantee that provably achieves global optimality.
arXiv Detail & Related papers (2025-03-22T21:16:08Z) - An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model [9.531082746970286]
We study the problem of estimating Dynamic Discrete Choice (DDC) models, also known as offline Maximum Entropy-Regularized Inverse Reinforcement Learning (offline MaxEnt-IRL) in machine learning. The objective is to recover reward or $Q^*$ functions that govern agent behavior from offline behavior data. We propose a globally convergent gradient-based method for solving these problems without the restrictive assumption of linearly parameterized rewards.
arXiv Detail & Related papers (2025-02-19T22:22:20Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
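A minimal sketch of a lookahead-style linear-interpolation update of the kind this summary alludes to, written here for plain minimization rather than the nonconvex-nonconcave (min-max) setting the paper targets; the inner optimizer, step counts, and mixing weight are illustrative assumptions.

```python
import numpy as np

def lookahead_interpolation(grad, theta0, lr=0.1, alpha=0.5,
                            inner_steps=5, outer_steps=100):
    """Sketch of linear interpolation (lookahead-style) for stabilizing training.

    grad  : function returning the gradient at a parameter vector.
    alpha : interpolation weight between the anchor and the inner iterate.
    All hyperparameters here are illustrative assumptions.
    """
    anchor = np.asarray(theta0, dtype=float)
    for _ in range(outer_steps):
        theta = anchor.copy()
        for _ in range(inner_steps):          # a few plain gradient steps
            theta -= lr * grad(theta)
        # Linear interpolation pulls the iterate back toward the anchor,
        # damping oscillations caused by a nonmonotone loss landscape.
        anchor = (1 - alpha) * anchor + alpha * theta
    return anchor
```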
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Stability and Generalization Analysis of Gradient Methods for Shallow
Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability.
We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, for both of which we develop consistent excess risk bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z) - Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient
for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z) - Distributionally Robust Fair Principal Components via Geodesic Descents [16.440434996206623]
In consequential domains such as college admission, healthcare and credit approval, it is imperative to take into account emerging criteria such as the fairness and the robustness of the learned projection.
We propose a distributionally robust optimization problem for principal component analysis which internalizes a fairness criterion in the objective function.
Our experimental results on real-world datasets show the merits of our proposed method over state-of-the-art baselines.
arXiv Detail & Related papers (2022-02-07T11:08:13Z) - Implicit Bias of Projected Subgradient Method Gives Provable Robust
Recovery of Subspaces of Unknown Codimension [12.354076490479514]
We show that Dual Principal Component Pursuit (DPCP) can provably solve problems in the unknown subspace dimension regime.
We propose a very simple algorithm based on running multiple instances of a projected sub-gradient descent method (PSGM).
In particular, we show that 1) all of the problem instances will converge to a vector in the nullspace of the subspace and 2) the ensemble of problem instance solutions will be sufficiently diverse to fully span the nullspace of the subspace.
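A minimal sketch of the multi-instance projected subgradient scheme described above, under the standard DPCP formulation min over unit vectors b of ||X^T b||_1; the step-size schedule and number of runs are illustrative assumptions.

```python
import numpy as np

def dpcp_psgm(X, n_runs=10, n_iter=200, mu0=1e-2, beta=0.95, seed=0):
    """Sketch of DPCP via projected subgradient descent (PSGM).

    X : (D, n) data matrix whose inliers lie near an unknown subspace.
    Each run minimizes ||X^T b||_1 over the unit sphere from a random start.
    Returns a (D, n_runs) matrix of recovered normal directions.
    """
    rng = np.random.default_rng(seed)
    D, _ = X.shape
    normals = []
    for _ in range(n_runs):
        b = rng.standard_normal(D)
        b /= np.linalg.norm(b)
        mu = mu0
        for _ in range(n_iter):
            g = X @ np.sign(X.T @ b)      # subgradient of ||X^T b||_1
            b = b - mu * g
            b /= np.linalg.norm(b)        # project back onto the unit sphere
            mu *= beta                    # geometrically decaying step size
        normals.append(b)
    return np.column_stack(normals)
```

An SVD of the returned ensemble of normals then yields an orthonormal basis for the estimated orthogonal complement of the inlier subspace.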
arXiv Detail & Related papers (2022-01-22T15:36:03Z) - Iteratively Reweighted Least Squares for $\ell_1$-minimization with
Global Linear Convergence Rate [0.0]
Iteratively Reweighted Least Squares (IRLS) represents an important family of algorithms for non-smooth optimization.
We prove that IRLS for $\ell_1$-minimization converges to a sparse solution with a global linear rate.
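A minimal sketch of IRLS for $\ell_1$-minimization (min ||x||_1 subject to Ax = y), where each iteration solves a smoothed weighted least-squares problem in closed form; the smoothing schedule is an illustrative assumption, not the rule analyzed in the cited paper.

```python
import numpy as np

def irls_l1(A, y, n_iter=50, eps0=1.0, shrink=0.9):
    """Sketch of IRLS for l1-minimization: min ||x||_1 s.t. Ax = y."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]   # least-squares initialization
    eps = eps0
    for _ in range(n_iter):
        d = np.sqrt(x**2 + eps**2)             # reciprocals of the IRLS weights
        # Minimize sum_i x_i^2 / d_i subject to Ax = y (closed-form solution).
        AD = A * d[None, :]                    # A @ diag(d)
        x = d * (A.T @ np.linalg.solve(AD @ A.T, y))
        eps *= shrink                          # shrink the smoothing parameter
    return x
```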
arXiv Detail & Related papers (2020-12-22T18:47:22Z) - Globally-convergent Iteratively Reweighted Least Squares for Robust
Regression Problems [15.823258699608994]
We provide the first global model recovery results for the IRLS (iteratively reweighted least squares) for robust regression problems.
We propose augmentations to the basic IRLS routine that not only offer guaranteed global recovery, but in practice also outperform state-of-the-art algorithms for robust regression.
arXiv Detail & Related papers (2020-06-25T07:16:13Z)
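A minimal sketch of IRLS for robust (least-absolute-deviations) regression, the basic routine that the paper above augments; the augmentations that yield its global recovery guarantee are not reproduced here.

```python
import numpy as np

def irls_robust_regression(A, y, n_iter=50, eps=1e-6):
    """Sketch of IRLS for min_x sum_i |y_i - a_i^T x| (robust regression)."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]        # ordinary least-squares start
    for _ in range(n_iter):
        r = y - A @ x
        w = 1.0 / np.maximum(np.abs(r), eps)        # downweight large residuals
        Aw = A * w[:, None]
        x = np.linalg.solve(Aw.T @ A, Aw.T @ y)     # weighted least-squares step
    return x
```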