Revisiting Optimal Convergence Rate for Smooth and Non-convex Stochastic
Decentralized Optimization
- URL: http://arxiv.org/abs/2210.07863v1
- Date: Fri, 14 Oct 2022 14:34:32 GMT
- Title: Revisiting Optimal Convergence Rate for Smooth and Non-convex Stochastic
Decentralized Optimization
- Authors: Kun Yuan, Xinmeng Huang, Yiming Chen, Xiaohan Zhang, Yingya Zhang, Pan
Pan
- Abstract summary: Decentralized optimization is effective in saving communication in large-scale machine learning.
This paper develops an optimal convergence algorithm with general weight matrices.
- Score: 25.831902182404388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decentralized optimization is effective in saving communication in large-scale
machine learning. Although numerous algorithms have been proposed with
theoretical guarantees and empirical successes, the performance limits in
decentralized optimization, especially the influence of network topology and
its associated weight matrix on the optimal convergence rate, have not been
fully understood. While (Lu and De Sa, 2021) have recently provided an optimal
rate for non-convex stochastic decentralized optimization with weight matrices
defined over linear graphs, the optimal rate with general weight matrices
remains unclear.
This paper revisits non-convex stochastic decentralized optimization and
establishes an optimal convergence rate with general weight matrices. In
addition, we also establish the optimal rate when non-convex loss functions
further satisfy the Polyak-Łojasiewicz (PL) condition. These results cannot be
obtained by following existing lines of analysis in the literature. Instead, we leverage
the Ring-Lattice graph to admit general weight matrices while maintaining the
optimal relation between the graph diameter and weight matrix connectivity.
Lastly, we develop a new decentralized algorithm to nearly attain the above two
optimal rates under additional mild conditions.
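To make the setting concrete, here is a minimal sketch (ours, not the paper's algorithm) of decentralized SGD over a ring-lattice graph: it builds a symmetric, doubly stochastic weight matrix in which every node averages with its k nearest neighbors on each side of a ring, then takes a local stochastic-gradient step. The helper names (ring_lattice_weights, decentralized_sgd_step) and the uniform neighbor weights are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def ring_lattice_weights(n: int, k: int) -> np.ndarray:
    """Uniform mixing weights on a ring-lattice: node i is linked to the k
    neighbors on each side, so each row has 2k + 1 entries of 1 / (2k + 1)."""
    W = np.zeros((n, n))
    for i in range(n):
        for offset in range(-k, k + 1):
            W[i, (i + offset) % n] = 1.0 / (2 * k + 1)
    return W  # symmetric and doubly stochastic by construction

def decentralized_sgd_step(W, X, grads, lr):
    """One combine-then-adapt iteration: average neighbors' models via W,
    then take a local stochastic-gradient step."""
    return W @ X - lr * grads

# Toy usage: n = 16 nodes, k = 2 neighbors per side, models of dimension 4.
rng = np.random.default_rng(0)
n, k, d, lr = 16, 2, 4, 0.1
W = ring_lattice_weights(n, k)
X = rng.normal(size=(n, d))      # row i holds node i's local model
grads = rng.normal(size=(n, d))  # stand-in for local stochastic gradients
X = decentralized_sgd_step(W, X, grads, lr)

# The spectral gap of W (one minus its second-largest eigenvalue modulus)
# quantifies the "weight matrix connectivity" the abstract refers to.
moduli = np.sort(np.abs(np.linalg.eigvalsh(W)))
print("spectral gap:", 1.0 - moduli[-2])
```

The appeal of the ring-lattice topology, as the abstract describes it, is that it admits general weight matrices while keeping the graph diameter small relative to this connectivity measure.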
Related papers
- Composite Optimization Algorithms for Sigmoid Networks [3.160070867400839]
We propose composite optimization algorithms based on linearized proximal algorithms and the alternating direction method of multipliers.
Numerical experiments on Frank's function fitting show that the proposed algorithms perform satisfactorily and robustly.
arXiv Detail & Related papers (2023-03-01T15:30:29Z) - Decentralized Inexact Proximal Gradient Method With Network-Independent
Stepsizes for Convex Composite Optimization [39.352542703876104]
This paper considers decentralized convex composite optimization over undirected and connected networks.
A novel CTA (Combine-Then-Adapt)-based decentralized algorithm is proposed under uncoordinated network-independent constant stepsizes.
arXiv Detail & Related papers (2023-02-07T03:50:38Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem in machine learning.
In this work, we present the first trial in the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Non-Convex Optimization with Certificates and Fast Rates Through Kernel
Sums of Squares [68.8204255655161]
We consider potentially non-convex optimization problems.
In this paper, we propose an algorithm that achieves close to optimal a priori computational guarantees.
arXiv Detail & Related papers (2022-04-11T09:37:04Z) - Generalization Properties of Stochastic Optimizers via Trajectory
Analysis [48.38493838310503]
We show that both the Fernique-Talagrand functional and the local power law are predictive of generalization performance.
arXiv Detail & Related papers (2021-08-02T10:58:32Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel sample-efficient stochastic bilevel optimizer named stocBiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Stochastic Coordinate Minimization with Progressive Precision for
Stochastic Convex Optimization [16.0251555430107]
A framework based on iterative coordinate minimization (CM) is developed for convex optimization.
We establish the optimal precision control and the resulting order-optimal regret performance.
The proposed algorithm is amenable to online implementation and inherits the scalability and parallelizability properties of CM for large-scale optimization.
arXiv Detail & Related papers (2020-03-11T18:42:40Z) - Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to simultaneously achieve the best-available convergence for non-PL objectives while outperforming existing algorithms on PL objectives (see the PL sketch after this list).
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
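For readers unfamiliar with the PL condition invoked both in the main abstract and in the entry above, here is a toy sketch (ours, not from any of the papers): f satisfies PL with constant mu if ||∇f(x)||² ≥ 2·mu·(f(x) − f*), and plain gradient descent then converges linearly even when f is non-convex. The function below is a well-known non-convex PL example; the stepsize is an illustrative choice.

```python
import numpy as np

def f(x):
    # Non-convex, yet satisfies the PL condition with minimum value f* = 0.
    return x**2 + 3 * np.sin(x)**2

def grad_f(x):
    return 2 * x + 3 * np.sin(2 * x)

x, lr = 3.0, 0.1  # illustrative initial point and stepsize
for t in range(30):
    x -= lr * grad_f(x)
print(f"f(x) after 30 steps: {f(x):.2e}")  # decays geometrically in t
```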