Decentralized Weakly Convex Optimization Over the Stiefel Manifold
- URL: http://arxiv.org/abs/2303.17779v1
- Date: Fri, 31 Mar 2023 02:56:23 GMT
- Title: Decentralized Weakly Convex Optimization Over the Stiefel Manifold
- Authors: Jinxin Wang, Jiang Hu, Shixiang Chen, Zengde Deng, Anthony Man-Cho So
- Abstract summary: We focus on the Stiefel manifold in the decentralized setting, where a connected network of agents in $nMn log-1)$ are tested.
We propose an method called the decentralized subgradient method (DRSM)$ for forcing a natural station below $nMn log-1)$.
- Score: 28.427697270742947
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We focus on a class of non-smooth optimization problems over the Stiefel
manifold in the decentralized setting, where a connected network of $n$ agents
cooperatively minimize a finite-sum objective function with each component
being weakly convex in the ambient Euclidean space. Such optimization problems,
albeit frequently encountered in applications, are quite challenging due to
their non-smoothness and non-convexity. To tackle them, we propose an iterative
method called the decentralized Riemannian subgradient method (DRSM). The
global convergence and an iteration complexity of $\mathcal{O}(\varepsilon^{-2}
\log^2(\varepsilon^{-1}))$ for forcing a natural stationarity measure below
$\varepsilon$ are established via the powerful tool of proximal smoothness from
variational analysis, which could be of independent interest. Besides, we show
the local linear convergence of the DRSM using geometrically diminishing
stepsizes when the problem at hand further possesses a sharpness property.
Numerical experiments are conducted to corroborate our theoretical findings.
Related papers
- Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization [2.0657831823662574]
When the non problem is by which the non problem is by whichity, the sample of first-order methods may depend linearly on the problem dimension, is for undesirable problems.
Our algorithms allow for the estimate of complexity using the distance of.
mathO (log d) / EuM4.
We prove that DISFOM can sharpen variance employing $mathO (log d) / EuM4.
arXiv Detail & Related papers (2024-06-27T18:38:42Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Decentralized Riemannian Conjugate Gradient Method on the Stiefel
Manifold [59.73080197971106]
This paper presents a first-order conjugate optimization method that converges faster than the steepest descent method.
It aims to achieve global convergence over the Stiefel manifold.
arXiv Detail & Related papers (2023-08-21T08:02:16Z) - Can Decentralized Stochastic Minimax Optimization Algorithms Converge
Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its application in a wide range machine learning.
This paper develops two novel decentralized minimax optimization algorithms for the non-strongly-nonconcave problem.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
The minimax algorithms for neural networks have been developed to solve many problems.
In this paper, we propose two types of minimax algorithms.
For the setting, we propose DRSGDA and prove that our method achieves a gradient.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual
Balancing and Iteration Complexity Analysis [28.575516056239575]
We introduce a novel analysis for PLDA, the key are our newly developed nonsmooth and dual error iterations.
When $thetain [0,12]$, PLDA achieves the optimal $mathcalO()$ under mild assumptions.
arXiv Detail & Related papers (2022-09-22T07:12:48Z) - Gradient-Free Methods for Deterministic and Stochastic Nonsmooth
Nonconvex Optimization [94.19177623349947]
Non-smooth non optimization problems emerge in machine learning and business making.
Two core challenges impede the development of efficient methods with finitetime convergence guarantee.
Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results.
arXiv Detail & Related papers (2022-09-12T06:53:24Z) - Randomized Coordinate Subgradient Method for Nonsmooth Composite
Optimization [11.017632675093628]
Coordinate-type subgradient methods for addressing nonsmooth problems are relatively underexplored due to the set of properties of the Lipschitz-type assumption.
arXiv Detail & Related papers (2022-06-30T02:17:11Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the lower bound of $mathcal O (1/epsilon)$ for all complexity optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Stochastic Zeroth-order Riemannian Derivative Estimation and
Optimization [15.78743548731191]
We propose an oracle version of the Gaussian smoothing function to overcome the difficulty of non-linearity of manifold non-linearity.
We demonstrate the applicability of our algorithms by results and real-world applications on black-box stiffness control for robotics and black-box attacks to neural networks.
arXiv Detail & Related papers (2020-03-25T06:58:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.