A Computation and Communication Efficient Method for Distributed
Nonconvex Problems in the Partial Participation Setting
- URL: http://arxiv.org/abs/2205.15580v4
- Date: Wed, 3 Jan 2024 14:21:38 GMT
- Title: A Computation and Communication Efficient Method for Distributed
Nonconvex Problems in the Partial Participation Setting
- Authors: Alexander Tyurin, Peter Richtárik
- Abstract summary: We present a new method that includes three key components: variance reduction, partial participation, and compressed communication.
We prove that the new method has optimal oracle complexity and state-of-the-art communication complexity in the partial participation setting.
- Score: 58.59873548589766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new method that includes three key components of distributed
optimization and federated learning: variance reduction of stochastic
gradients, partial participation, and compressed communication. We prove that
the new method has optimal oracle complexity and state-of-the-art communication
complexity in the partial participation setting. Regardless of the
communication compression feature, our method successfully combines variance
reduction and partial participation: we get the optimal oracle complexity,
never need the participation of all nodes, and do not require the bounded
gradients (dissimilarity) assumption.
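To make the interplay concrete, here is a minimal illustrative sketch (Python with NumPy) of one communication round that combines the three ingredients: the server samples a subset of nodes (partial participation), and each sampled node uploads a sparsified difference between its gradient and a locally stored shift (compressed communication, with a DIANA-style shift standing in for the control variates used for variance reduction). The compressor, shift variables, step size, and toy objectives are illustrative assumptions; this is not the algorithm analyzed in the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def rand_k(v, k):
        # Unbiased rand-k sparsification: keep k random coordinates, rescale by d/k.
        out = np.zeros_like(v)
        idx = rng.choice(v.size, size=k, replace=False)
        out[idx] = v[idx] * (v.size / k)
        return out

    def one_round(x, grads, shifts, step_size=0.1, participation=0.3, k=2):
        # One illustrative round: sample nodes, compress shifted gradients, aggregate.
        n = len(grads)
        sampled = [i for i in range(n) if rng.random() < participation]
        g = shifts.mean(axis=0)                    # server estimate built from stored shifts
        for i in sampled:
            m = rand_k(grads[i](x) - shifts[i], k) # compressed message from node i
            shifts[i] += m                         # node and server update the shift in sync
            g += m / n                             # server corrects its estimate
        return x - step_size * g, shifts

    # Toy heterogeneous quadratics f_i(x) = 0.5 * ||x - b_i||^2, so grad f_i(x) = x - b_i.
    d, n = 10, 5
    b = rng.normal(size=(n, d))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x, shifts = np.zeros(d), np.zeros((n, d))
    for _ in range(300):
        x, shifts = one_round(x, grads, shifts)
    print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))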
Related papers
- Differential error feedback for communication-efficient decentralized learning [48.924131251745266]
We propose a new decentralized communication-efficient learning approach that blends differential quantization with error feedback.
We show that the resulting communication-efficient strategy is stable both in terms of mean-square error and average bit rate.
The results establish that, in the small step-size regime and with a finite number of bits, it is possible to attain the performance achievable in the absence of compression.
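As a rough illustration of the error-feedback ingredient named above, the sketch below (illustrative only, not the decentralized algorithm of the paper) has each worker compress its gradient plus a residual memory and carry the part lost to compression into the next round; a top-k compressor stands in for differential quantization.

    import numpy as np

    def top_k(v, k):
        # Keep the k largest-magnitude entries (a biased, contractive compressor).
        out = np.zeros_like(v)
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
        return out

    def ef_round(x, grads, memory, step_size=0.1, k=2):
        # Error feedback: compress gradient plus residual, remember what was left out.
        msgs = []
        for i, grad in enumerate(grads):
            p = grad(x) + memory[i]     # add the residual carried over from earlier rounds
            m = top_k(p, k)             # only the compressed part is transmitted
            memory[i] = p - m           # the compression error feeds back next round
            msgs.append(m)
        return x - step_size * np.mean(msgs, axis=0), memory

    # Toy problem with 3 workers: f_i(x) = 0.5 * ||x - b_i||^2.
    rng = np.random.default_rng(1)
    b = rng.normal(size=(3, 8))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x, memory = np.zeros(8), np.zeros((3, 8))
    for _ in range(400):
        x, memory = ef_round(x, grads, memory)
    print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))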
arXiv Detail & Related papers (2024-06-26T15:11:26Z)
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving [73.58829980121767]
We present a novel method for solving square jigsaw puzzles based on global optimization.
The method is fully automatic, assumes no prior information, and can handle puzzles with known or unknown piece orientation.
arXiv Detail & Related papers (2023-03-26T18:53:51Z)
TAMUNA: Doubly Accelerated Distributed Optimization with Local Training, Compression, and Partial Participation [53.84175614198885]
In distributed optimization and learning, several machines alternate between local computations in parallel and communication with a distant server.
We propose TAMUNA, the first algorithm for distributed optimization that jointly leverages the two strategies of local training and compression while allowing for partial participation.
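The local-training ingredient can be pictured with a generic local-SGD/FedAvg-style round, sketched below under partial participation; TAMUNA itself additionally uses compression and a different update rule, none of which is reproduced here, so all names and constants are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)

    def local_training_round(x, grads, step_size=0.1, local_steps=5, participation=0.5):
        # Each sampled client runs several local gradient steps before communicating;
        # the server then averages the returned models.
        sampled = [g for g in grads if rng.random() < participation] or grads[:1]
        models = []
        for grad in sampled:
            y = x.copy()
            for _ in range(local_steps):
                y -= step_size * grad(y)
            models.append(y)
        return np.mean(models, axis=0)

    # Toy heterogeneous quadratics f_i(x) = 0.5 * ||x - b_i||^2 on 4 clients.
    b = rng.normal(size=(4, 6))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x = np.zeros(6)
    for _ in range(100):
        x = local_training_round(x, grads)
    print("distance to the average optimum:", np.linalg.norm(x - b.mean(axis=0)))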
arXiv Detail & Related papers (2023-02-20T08:37:44Z)
Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z)
Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation [11.870393751095083]
We study communication compression and aggregation mechanisms for curvature information.
New 3PC mechanisms, such as adaptive thresholding and Bernoulli aggregation, require reduced communication and occasional Hessian computations.
For all our methods, we derive fast condition-number-independent local linear and/or superlinear convergence rates.
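Read from this abstract alone, Bernoulli aggregation means that each node refreshes and uploads its curvature information only occasionally. The sketch below illustrates that reading on toy quadratics: each node recomputes its local Hessian with probability p, the server reuses cached curvature otherwise, and gradients are still exchanged every round. The probability, caching scheme, and objectives are assumptions for illustration, not the paper's 3PC mechanisms.

    import numpy as np

    rng = np.random.default_rng(3)

    def newton_round(x, hessians, grads, H_cache, p=0.2):
        # Each node recomputes and uploads its local Hessian only with probability p;
        # otherwise the server reuses the cached copy. Gradients are sent every round.
        for i, hess in enumerate(hessians):
            if rng.random() < p:
                H_cache[i] = hess(x)      # occasional Hessian computation and upload
        H = H_cache.mean(axis=0)
        g = np.mean([grad(x) for grad in grads], axis=0)
        return x - np.linalg.solve(H, g), H_cache

    # Toy quadratics f_i(x) = 0.5 * (x - b_i)^T A_i (x - b_i) with mildly varying curvature.
    d, n = 5, 4
    A = [np.eye(d) * (1.0 + 0.2 * i) for i in range(n)]
    b = rng.normal(size=(n, d))
    hessians = [lambda x, Ai=Ai: Ai for Ai in A]
    grads = [lambda x, Ai=Ai, bi=bi: Ai @ (x - bi) for Ai, bi in zip(A, b)]
    x, H_cache = np.zeros(d), np.stack([np.eye(d)] * n)
    for _ in range(40):
        x, H_cache = newton_round(x, hessians, grads, H_cache)
    x_star = np.linalg.solve(sum(A), sum(Ai @ bi for Ai, bi in zip(A, b)))
    print("distance to the exact minimizer:", np.linalg.norm(x - x_star))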
arXiv Detail & Related papers (2022-06-07T21:12:21Z)
Contextual Model Aggregation for Fast and Robust Federated Learning in Edge Computing [88.76112371510999]
Federated learning is a prime candidate for distributed machine learning at the network edge.
Existing algorithms suffer from slow convergence and/or a lack of robustness in performance.
We propose a contextual aggregation scheme that achieves the optimal context-dependent bound on loss reduction.
arXiv Detail & Related papers (2022-03-23T21:42:31Z)
Bidirectional compression in heterogeneous settings for distributed or federated learning with partial participation: tight convergence guarantees [9.31522898261934]
Artemis is a framework to tackle the problem of learning in a distributed setting with communication constraints and partial device participation.
It improves on existing algorithms that only consider unidirectional compression (to the server) or rely on very strong assumptions about the compression operator, and that often do not take partial device participation into account.
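The sketch below shows the two compressed channels that "bidirectional" refers to: sampled devices upload sparsified gradients and the server broadcasts a sparsified update. It deliberately omits the memory corrections and the analysis that Artemis contributes, so it should be read as a naive baseline under assumed compressors and step sizes rather than the framework itself.

    import numpy as np

    rng = np.random.default_rng(4)

    def rand_k(v, k):
        # Unbiased rand-k sparsification used on both the uplink and the downlink.
        out = np.zeros_like(v)
        idx = rng.choice(v.size, size=k, replace=False)
        out[idx] = v[idx] * (v.size / k)
        return out

    def bidirectional_round(x, grads, step_size=0.05, k=2, participation=0.5):
        sampled = [g for g in grads if rng.random() < participation] or grads[:1]
        up = np.mean([rand_k(g(x), k) for g in sampled], axis=0)   # devices -> server
        down = rand_k(-step_size * up, k)                          # server -> devices
        return x + down

    # Toy heterogeneous quadratics f_i(x) = 0.5 * ||x - b_i||^2 on 6 devices.
    b = rng.normal(size=(6, 8))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x = np.zeros(8)
    for _ in range(1000):
        x = bidirectional_round(x, grads)
    # Without memory/control-variate corrections, the iterates only reach a
    # noise-dominated neighborhood of the optimum on heterogeneous data.
    print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))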
arXiv Detail & Related papers (2020-06-25T17:37:45Z)
Federated Learning of a Mixture of Global and Local Models [10.279748604797911]
We propose a new optimization formulation for training federated learning models.
In particular, we are the first to i) show that local steps can improve communication for problems with heterogeneous data, and ii) point out that personalization yields reduced communication complexity.
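One common way to write such a mixture formulation (the exact weighting in the paper may differ) penalizes the distance of each personal model from the average model. The toy sketch below is an illustration under that assumed form, not the paper's method; it shows how the penalty weight interpolates between purely local models and an essentially global one.

    import numpy as np

    def personalized_models(b, lam, iters=2000):
        # Gradient descent on sum_i 0.5*||x_i - b_i||^2 + (lam/2)*sum_i ||x_i - xbar||^2,
        # a simple instance of a mixture-of-global-and-local-models objective.
        n, d = b.shape
        step_size = 1.0 / (1.0 + lam)           # safe step for this quadratic objective
        X = np.zeros((n, d))
        for _ in range(iters):
            xbar = X.mean(axis=0)
            grad = (X - b) + lam * (X - xbar)   # local data term + pull toward the average
            X -= step_size * grad
        return X

    b = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])   # heterogeneous local optima
    for lam in [0.0, 1.0, 100.0]:
        X = personalized_models(b, lam)
        spread = np.linalg.norm(X - X.mean(axis=0))
        print(f"lambda = {lam:5.1f}   spread of the personal models = {spread:.3f}")
    # lam = 0 keeps purely local models; a large lam pushes them toward one global model.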
arXiv Detail & Related papers (2020-02-10T09:17:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.