Orthogonal Directions Constrained Gradient Method: from non-linear
equality constraints to Stiefel manifold
- URL: http://arxiv.org/abs/2303.09261v1
- Date: Thu, 16 Mar 2023 12:25:53 GMT
- Title: Orthogonal Directions Constrained Gradient Method: from non-linear
equality constraints to Stiefel manifold
- Authors: Sholom Schechtman, Daniil Tiapkin, Michael Muehlebach, Eric Moulines
- Abstract summary: We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM).
ODCGM only requires computing a projection onto a vector space.
We show that ODCGM exhibits near-optimal oracle complexities.
- Score: 16.099883128428054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of minimizing a non-convex function over a smooth
manifold $\mathcal{M}$. We propose a novel algorithm, the Orthogonal Directions
Constrained Gradient Method (ODCGM) which only requires computing a projection
onto a vector space. ODCGM is infeasible but the iterates are constantly pulled
towards the manifold, ensuring the convergence of ODCGM towards $\mathcal{M}$.
ODCGM is much simpler to implement than the classical methods which require the
computation of a retraction. Moreover, we show that ODCGM exhibits the
near-optimal oracle complexities $\mathcal{O}(1/\varepsilon^2)$ and
$\mathcal{O}(1/\varepsilon^4)$ in the deterministic and stochastic cases,
respectively. Furthermore, we establish that, under an appropriate choice of
the projection metric, our method recovers the landing algorithm of Ablin and
Peyr\'e (2022), a recently introduced algorithm for optimization over the
Stiefel manifold. As a result, we significantly extend the analysis of Ablin
and Peyr\'e (2022), establishing near-optimal rates both in deterministic and
stochastic frameworks. Finally, we perform numerical experiments which show
the efficiency of ODCGM in a high-dimensional setting.
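To make the retraction-free idea concrete, here is a minimal sketch of a landing-style update on the Stiefel manifold, in the spirit of ODCGM and of Ablin and Peyré (2022): the iterate is never projected back onto the manifold; instead a descent direction built purely from matrix products is combined with a penalty term that pulls the iterate toward orthogonality. The update form, constants, and toy objective below are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def landing_style_step(X, grad_f, step=1e-3, lam=1.0):
    """One retraction-free, infeasible update on the Stiefel manifold
    St(p, n) = {X in R^{n x p} : X^T X = I_p}.

    Illustrative sketch only (hypothetical constants, not the paper's
    exact ODCGM update): a skew-symmetric "relative gradient" term moves
    roughly along the manifold, while lam * X (X^T X - I) pulls the
    iterate back toward orthogonality, so no retraction is computed.
    """
    G = grad_f(X)                                   # Euclidean gradient, n x p
    # Descent direction that preserves X^T X to first order.
    rel = (G @ X.T - X @ G.T) @ X
    # Gradient of the penalty 0.25 * ||X^T X - I||_F^2.
    pull = X @ (X.T @ X - np.eye(X.shape[1]))
    return X - step * (rel + lam * pull)

# Toy usage: minimize f(X) = -trace(X^T A X), i.e. recover a dominant
# invariant subspace of a symmetric matrix A.
rng = np.random.default_rng(0)
n, p = 50, 5
A = rng.standard_normal((n, n))
A = A + A.T
X = np.linalg.qr(rng.standard_normal((n, p)))[0]    # feasible starting point
for _ in range(5000):
    X = landing_style_step(X, lambda Y: -2.0 * A @ Y, step=1e-3, lam=1.0)
print("constraint violation:", np.linalg.norm(X.T @ X - np.eye(p)))
```

Per the abstract, each ODCGM iteration only requires a projection onto a vector space; in this sketch the analogous cheap step is the matrix-product construction of the update direction, which is what avoids the retraction used by classical Riemannian methods.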
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we study a robust phase retrieval problem where the task is to recover an unknown signal from measurements corrupted by outliers.
Our proposed algorithm avoids the need for computationally expensive spectral initialization, relying instead on simple gradient steps while remaining robust to outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Riemannian Bilevel Optimization [35.42472057648458]
We focus in particular on batch and gradient-based methods, with the explicit goal of avoiding second-order information.
We propose and analyze $\mathrm{RF^2SA}$, a method that leverages first-order gradient information.
We provide explicit convergence rates for reaching $\epsilon$-stationary points under various setups.
arXiv Detail & Related papers (2024-05-22T20:49:01Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how our framework scales, and demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Low-complexity subspace-descent over symmetric positive definite
manifold [9.346050098365648]
We develop low-complexity algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold.
The proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix.
arXiv Detail & Related papers (2023-05-03T11:11:46Z) - Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints [9.301728976515255]
This article provides new practical and theoretical developments for the landing algorithm.
First, the method is extended to the Stiefel manifold.
We also consider variance reduction algorithms when the cost function is an average of many functions.
arXiv Detail & Related papers (2023-03-29T07:36:54Z) - Reducing the Variance of Gaussian Process Hyperparameter Optimization
with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point using an optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z) - Block majorization-minimization with diminishing radius for constrained
nonconvex optimization [9.907540661545328]
Block majorization-minimization (BMM) is a simple iterative algorithm for constrained nonconvex optimization that minimizes majorizing surrogates in each block.
We show that BMM reaches an $\epsilon$-stationary point within $O(\epsilon^{-2}(\log \epsilon^{-1})^{2})$ iterations when convex surrogates are used.
arXiv Detail & Related papers (2020-12-07T07:53:09Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we consider the nonconvex-strongly-concave setting of gradient-free (zeroth-order) minimax optimization problems.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Efficient algorithms for multivariate shape-constrained convex
regression problems [9.281671380673306]
We prove that the least squares estimator is computable by solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.