Second-Order Guarantees in Centralized, Federated and Decentralized
Nonconvex Optimization
- URL: http://arxiv.org/abs/2003.14366v1
- Date: Tue, 31 Mar 2020 16:54:22 GMT
- Title: Second-Order Guarantees in Centralized, Federated and Decentralized
Nonconvex Optimization
- Authors: Stefan Vlaski and Ali H. Sayed
- Abstract summary: Relatively simple algorithms have been shown to lead to good empirical results in many nonconvex contexts.
Several recent works have pursued rigorous analytical justification by studying the structure of nonconvex optimization problems.
A key insight in these analyses is that gradient perturbations play a critical role in allowing local descent algorithms to escape saddle points.
- Score: 64.26238893241322
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rapid advances in data collection and processing capabilities have allowed
for the use of increasingly complex models that give rise to nonconvex
optimization problems. These formulations, however, can be arbitrarily
difficult to solve in general, in the sense that even simply verifying that a
given point is a local minimum can be NP-hard [1]. Still, some relatively
simple algorithms have been shown to lead to surprisingly good empirical
results in many contexts of interest. Perhaps the most prominent example is the
success of the backpropagation algorithm for training neural networks. Several
recent works have pursued rigorous analytical justification for this phenomenon
by studying the structure of the nonconvex optimization problems and
establishing that simple algorithms, such as gradient descent and its
variations, perform well in converging towards local minima and avoiding
saddle-points. A key insight in these analyses is that gradient perturbations
play a critical role in allowing local descent algorithms to efficiently
distinguish desirable from undesirable stationary points and escape from the
latter. In this article, we cover recent results on second-order guarantees for
stochastic first-order optimization algorithms in centralized, federated, and
decentralized architectures.
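As a rough illustration of this insight (not code from the paper), the Python sketch below contrasts plain gradient descent, which stalls at the strict saddle point of an assumed toy objective f(x, y) = (x^2 - 1)^2 + y^2, with a variant that perturbs each gradient with small Gaussian noise and escapes toward one of the local minima at (+/-1, 0); the objective, step size, and noise level are all illustrative assumptions.

```python
import numpy as np

def grad(z):
    # Gradient of the toy objective f(x, y) = (x^2 - 1)^2 + y^2,
    # which has a strict saddle at (0, 0) and minima at (+/-1, 0).
    x, y = z
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])

def descend(z0, sigma, steps=2000, lr=0.01, seed=0):
    # Gradient descent with optional Gaussian gradient perturbations.
    rng = np.random.default_rng(seed)
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        g = grad(z) + sigma * rng.standard_normal(2)
        z -= lr * g
    return z

z0 = (0.0, 0.0)                                  # start exactly at the saddle point
print("plain GD:    ", descend(z0, sigma=0.0))   # gradient is zero, so it never moves
print("perturbed GD:", descend(z0, sigma=0.01))  # noise pushes it onto a descent direction
```

The noise matters only near stationary points: once the iterate leaves the saddle, the negative curvature direction amplifies the displacement, which is the mechanism behind the second-order guarantees surveyed in the article.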
Related papers
- Decentralized Sum-of-Nonconvex Optimization [42.04181488477227]
We consider the problem of minimizing a sum-of-nonconvex function, i.e., a function given by the average of nonconvex components, in a decentralized setting.
We propose an accelerated decentralized first-order algorithm using techniques of gradient tracking and multi-consensus (a gradient-tracking sketch appears after this list).
arXiv Detail & Related papers (2024-02-04T05:48:45Z)
- Optimizing NOTEARS Objectives via Topological Swaps [41.18829644248979]
In this paper, we develop an effective bi-level method that searches over a set of candidate topological orders.
At the inner level, given a topological order, we utilize off-the-shelf solvers subject to the ordering constraints.
The proposed method significantly improves upon the scores attained by other algorithms.
arXiv Detail & Related papers (2023-05-26T21:49:37Z)
- Can Decentralized Stochastic Minimax Optimization Algorithms Converge Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning tasks.
This paper develops two novel decentralized minimax optimization algorithms for the finite-sum nonconvex-nonconcave problem.
arXiv Detail & Related papers (2023-04-24T02:19:39Z)
- Efficiently Escaping Saddle Points in Bilevel Optimization [48.925688192913]
Bilevel optimization is one of the fundamental problems in machine learning.
This work provides a nonasymptotic, multi-step analysis of algorithms that escape saddle points in bilevel optimization.
arXiv Detail & Related papers (2022-02-08T07:10:06Z)
- Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We establish complexity lower bounds for this setting and design two optimal algorithms that attain them.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
- Escaping Poor Local Minima in Large Scale Robust Estimation [41.304283715031204]
We introduce two novel approaches for robust parameter estimation.
The first algorithm uses an adaptive kernel scaling strategy that enjoys a strong ability to escape poor minima.
The second algorithm combines a generalized Majorization Minimization framework with the half-quadratic lifting formulation to obtain a simple yet efficient solver.
arXiv Detail & Related papers (2021-02-22T11:58:29Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient stochastic distributed optimization, in which machines can independently compute stochastic gradients and cooperate to jointly optimize over their local objectives.
Our algorithm is based on a new concentration technique, and its sample and time complexities match known theoretical bounds.
It is also very practical: it improves upon the performance of all prior methods even when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z)
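Referenced from the "Decentralized Sum-of-Nonconvex Optimization" entry above, the following Python sketch illustrates a generic decentralized gradient-tracking iteration. It is not that paper's accelerated algorithm; the quadratic local objectives, ring mixing matrix, step size, and iteration count are all illustrative assumptions.

```python
import numpy as np

n, d, alpha, iters = 5, 3, 0.02, 2000   # agents, dimension, step size, iterations (assumed)
rng = np.random.default_rng(1)

# Illustrative local objectives f_i(x) = 0.5 * ||A_i x - b_i||^2;
# the network-wide goal is to minimize their average.
A = rng.standard_normal((n, d, d))
b = rng.standard_normal((n, d))

def local_grad(i, x):
    return A[i].T @ (A[i] @ x - b[i])

# Symmetric, doubly stochastic mixing matrix for a ring of n agents.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] += 0.25
    W[i, (i + 1) % n] += 0.25

# Gradient tracking: x holds the local iterates, y tracks the network-average gradient.
x = np.zeros((n, d))
y = np.array([local_grad(i, x[i]) for i in range(n)])
for _ in range(iters):
    x_new = W @ x - alpha * y
    y = W @ y + np.array([local_grad(i, x_new[i]) - local_grad(i, x[i]) for i in range(n)])
    x = x_new

# Every agent's iterate approaches the minimizer of the average objective.
x_star = np.linalg.solve(sum(A[i].T @ A[i] for i in range(n)),
                         sum(A[i].T @ b[i] for i in range(n)))
print("agent iterates:\n", np.round(x, 3))
print("centralized solution:", np.round(x_star, 3))
```

Each agent mixes its iterate with its neighbors' (consensus) and steps along a tracked estimate of the global average gradient, which is the basic mechanism that accelerated variants with multi-consensus build upon.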