Global Convergence of Decentralized Retraction-Free Optimization on the Stiefel Manifold
- URL: http://arxiv.org/abs/2405.11590v1
- Date: Sun, 19 May 2024 15:50:57 GMT
- Title: Global Convergence of Decentralized Retraction-Free Optimization on the Stiefel Manifold
- Authors: Youbang Sun, Shixiang Chen, Alfredo Garcia, Shahin Shahrampour,
- Abstract summary: We show that DRFGT performs retraction on a gradient based on the corresponding DRFGT method.
Also show that DRFGT can be used to perform retraction on a network of agents.
- Score: 12.414718831844041
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many classical and modern machine learning algorithms require solving optimization tasks under orthogonal constraints. Solving these tasks often require calculating retraction-based gradient descent updates on the corresponding Riemannian manifold, which can be computationally expensive. Recently Ablin et al. proposed an infeasible retraction-free algorithm, which is significantly more efficient. In this paper, we study the decentralized non-convex optimization task over a network of agents on the Stiefel manifold with retraction-free updates. We propose \textbf{D}ecentralized \textbf{R}etraction-\textbf{F}ree \textbf{G}radient \textbf{T}racking (DRFGT) algorithm, and show that DRFGT exhibits ergodic $\mathcal{O}(1/K)$ convergence rate, the same rate of convergence as the centralized, retraction-based methods. We also provide numerical experiments demonstrating that DRFGT performs on par with the state-of-the-art retraction based methods with substantially reduced computational overhead.
Related papers
- Deep Inertia $L_p$ Half-Quadratic Splitting Unrolling Network for Sparse View CT Reconstruction [20.632166806596278]
Sparse view computed tomography (CT) reconstruction poses a challenging ill-posed inverse problem, necessitating effective regularization techniques.
We employ $L_p$-norm regularization to induce sparsity and introduce inertial steps, leading to the development of the inertial $L_p$-norm half-quadratic splitting algorithm.
Our proposed algorithm surpasses existing methods, particularly excelling in fewer scanned views and complex noise conditions.
arXiv Detail & Related papers (2024-08-13T03:32:59Z) - Decentralized Sum-of-Nonconvex Optimization [42.04181488477227]
We consider the optimization problem of the sum-of-non function, i.e., a guarantee function that is the average non-consensus number.
We propose an accelerated decentralized first-order algorithm by techniques of gradient, tracking into the rate, and multi-consensus.
arXiv Detail & Related papers (2024-02-04T05:48:45Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Low-Rank Tensor Completion via Novel Sparsity-Inducing Regularizers [30.920908325825668]
To alleviate l1-norm in the low-rank tensor completion problem, non-rank surrogates/regularizers have been suggested.
These regularizers are applied to nuclear-rank restoration, and efficient algorithms based on the method of multipliers are proposed.
arXiv Detail & Related papers (2023-10-10T01:00:13Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolleds and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Can Decentralized Stochastic Minimax Optimization Algorithms Converge
Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its application in a wide range machine learning.
This paper develops two novel decentralized minimax optimization algorithms for the non-strongly-nonconcave problem.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - A Variance-Reduced Stochastic Gradient Tracking Algorithm for
Decentralized Optimization with Orthogonality Constraints [7.028225540638832]
We propose a novel algorithm for decentralized optimization with orthogonality constraints.
VRSGT is the first algorithm for decentralized optimization with orthogonality constraints that reduces both sampling and communication complexities simultaneously.
In the numerical experiments, VRGTS has a promising performance in a real-world autonomous sample.
arXiv Detail & Related papers (2022-08-29T14:46:44Z) - Fully-Connected Tensor Network Decomposition for Robust Tensor
Completion Problem [9.580645211308557]
We propose a $textbfFCTN$-based $textbfr$obust $textbfc$onvex optimization model (RC-FCTN) for the RTC problem.
For solving the constrained optimization model RC-FCTN, we develop an alternating direction method of multipliers (ADMM)-based algorithm.
A proximal alternating minimization (PAM)-based algorithm is developed to solve the proposed RNC-FCTN.
arXiv Detail & Related papers (2021-10-17T08:12:50Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We design two optimal algorithms that attain these lower bounds.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.