The Minimax Complexity of Distributed Optimization
- URL: http://arxiv.org/abs/2109.00534v1
- Date: Wed, 1 Sep 2021 15:18:33 GMT
- Title: The Minimax Complexity of Distributed Optimization
- Authors: Blake Woodworth
- Abstract summary: I present the "graph oracle model", an extension of the classic oracle framework that can be applied to distributed optimization.
I focus on the specific case of the "intermittent communication setting"
I analyze the theoretical properties of the popular Local Descent (SGD) algorithm in convex setting.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this thesis, I study the minimax oracle complexity of distributed
stochastic optimization. First, I present the "graph oracle model", an
extension of the classic oracle complexity framework that can be applied to
study distributed optimization algorithms. Next, I describe a general approach
to proving optimization lower bounds for arbitrary randomized algorithms (as
opposed to more restricted classes of algorithms, e.g., deterministic or
"zero-respecting" algorithms), which is used extensively throughout the thesis.
For the remainder of the thesis, I focus on the specific case of the
"intermittent communication setting", where multiple computing devices work in
parallel with limited communication amongst themselves. In this setting, I
analyze the theoretical properties of the popular Local Stochastic Gradient
Descent (SGD) algorithm in convex setting, both for homogeneous and
heterogeneous objectives. I provide the first guarantees for Local SGD that
improve over simple baseline methods, but show that Local SGD is not optimal in
general. In pursuit of optimal methods in the intermittent communication
setting, I then show matching upper and lower bounds for the intermittent
communication setting with homogeneous convex, heterogeneous convex, and
homogeneous non-convex objectives. These upper bounds are attained by simple
variants of SGD which are therefore optimal. Finally, I discuss several
additional assumptions about the objective or more powerful oracles that might
be exploitable in order to develop better intermittent communication algorithms
with better guarantees than our lower bounds allow.
Related papers
- Sparsity-Constraint Optimization via Splicing Iteration [1.3622424109977902]
We develop an algorithm named Sparsity-Constraint Optimization via sPlicing itEration (SCOPE)
SCOPE converges effectively without tuning parameters.
We apply SCOPE to solve quadratic optimization, learn sparse classifiers, and recover sparse Markov networks for binary variables.
Our open-source Python package skscope based on C++ implementation is publicly available on GitHub.
arXiv Detail & Related papers (2024-06-17T18:34:51Z) - Federated Minimax Optimization: Improved Convergence Analyses and
Algorithms [32.062312674333775]
We consider non minimax optimization, is gaining prominence many modern machine learning applications such as GANs.
We provide a novel and tighter analysis algorithm, improves convergence communication guarantees in the existing literature.
arXiv Detail & Related papers (2022-03-09T16:21:31Z) - Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD
for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time.
We show that the communication complexity is better than non-local methods when the local datasets is smaller than the smoothness local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z) - Distributed stochastic proximal algorithm with random reshuffling for
non-smooth finite-sum optimization [28.862321453597918]
Non-smooth finite-sum minimization is a fundamental problem in machine learning.
This paper develops a distributed proximal-gradient algorithm with random reshuffling to solve the problem.
arXiv Detail & Related papers (2021-11-06T07:29:55Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We design two optimal algorithms that attain these lower bounds.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in analysis of optimization algorithms for non- optimization in deep networks and other problems in data, we give an overview of recent results of theoretical optimization algorithms for non- optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization extraient spaces which converges to a stationary point.
The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.