A Computation and Communication Efficient Method for Distributed
Nonconvex Problems in the Partial Participation Setting
- URL: http://arxiv.org/abs/2205.15580v4
- Date: Wed, 3 Jan 2024 14:21:38 GMT
- Title: A Computation and Communication Efficient Method for Distributed
Nonconvex Problems in the Partial Participation Setting
- Authors: Alexander Tyurin, Peter Richtárik
- Abstract summary: We present a new method that includes three key components: variance reduction, partial participation, and compressed communication.
We prove that the new method has optimal oracle complexity and state-of-the-art communication complexity in the partial participation setting.
- Score: 58.59873548589766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new method that includes three key components of distributed
optimization and federated learning: variance reduction of stochastic
gradients, partial participation, and compressed communication. We prove that
the new method has optimal oracle complexity and state-of-the-art communication
complexity in the partial participation setting. Regardless of the
communication compression feature, our method successfully combines variance
reduction and partial participation: we get the optimal oracle complexity,
never need the participation of all nodes, and do not require the bounded
gradients (dissimilarity) assumption.
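To make the interplay concrete, here is a minimal illustrative sketch (Python with NumPy) of one communication round that combines the three ingredients: the server samples a subset of nodes (partial participation), and each sampled node uploads a sparsified difference between its gradient and a locally stored shift (compressed communication, with a DIANA-style shift standing in for the control variates used for variance reduction). The compressor, shift variables, step size, and toy objectives are illustrative assumptions; this is not the algorithm analyzed in the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def rand_k(v, k):
        # Unbiased rand-k sparsification: keep k random coordinates, rescale by d/k.
        out = np.zeros_like(v)
        idx = rng.choice(v.size, size=k, replace=False)
        out[idx] = v[idx] * (v.size / k)
        return out

    def one_round(x, grads, shifts, step_size=0.1, participation=0.3, k=2):
        # One illustrative round: sample nodes, compress shifted gradients, aggregate.
        n = len(grads)
        sampled = [i for i in range(n) if rng.random() < participation]
        g = shifts.mean(axis=0)                    # server estimate built from stored shifts
        for i in sampled:
            m = rand_k(grads[i](x) - shifts[i], k) # compressed message from node i
            shifts[i] += m                         # node and server update the shift in sync
            g += m / n                             # server corrects its estimate
        return x - step_size * g, shifts

    # Toy heterogeneous quadratics f_i(x) = 0.5 * ||x - b_i||^2, so grad f_i(x) = x - b_i.
    d, n = 10, 5
    b = rng.normal(size=(n, d))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x, shifts = np.zeros(d), np.zeros((n, d))
    for _ in range(300):
        x, shifts = one_round(x, grads, shifts)
    print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))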
Related papers
- Differential error feedback for communication-efficient decentralized learning [48.924131251745266]
We propose a new decentralized communication-efficient learning approach that blends differential quantization with error feedback.
We show that the resulting communication-efficient strategy is stable both in terms of mean-square error and average bit rate.
The results establish that, in the small step-size regime and with a finite number of bits, it is possible to attain the performance achievable in the absence of compression.
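As a rough illustration of the error-feedback ingredient named above, the sketch below (illustrative only, not the decentralized algorithm of the paper) has each worker compress its gradient plus a residual memory and carry the part lost to compression into the next round; a top-k compressor stands in for differential quantization.

    import numpy as np

    def top_k(v, k):
        # Keep the k largest-magnitude entries (a biased, contractive compressor).
        out = np.zeros_like(v)
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
        return out

    def ef_round(x, grads, memory, step_size=0.1, k=2):
        # Error feedback: compress gradient plus residual, remember what was left out.
        msgs = []
        for i, grad in enumerate(grads):
            p = grad(x) + memory[i]     # add the residual carried over from earlier rounds
            m = top_k(p, k)             # only the compressed part is transmitted
            memory[i] = p - m           # the compression error feeds back next round
            msgs.append(m)
        return x - step_size * np.mean(msgs, axis=0), memory

    # Toy problem with 3 workers: f_i(x) = 0.5 * ||x - b_i||^2.
    rng = np.random.default_rng(1)
    b = rng.normal(size=(3, 8))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x, memory = np.zeros(8), np.zeros((3, 8))
    for _ in range(400):
        x, memory = ef_round(x, grads, memory)
    print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))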
arXiv Detail & Related papers (2024-06-26T15:11:26Z)
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving [73.58829980121767]
We present a novel method for solving square jigsaw puzzles based on global optimization.
The method is fully automatic, assumes no prior information, and can handle puzzles with known or unknown piece orientation.
arXiv Detail & Related papers (2023-03-26T18:53:51Z)
TAMUNA: Doubly Accelerated Distributed Optimization with Local Training, Compression, and Partial Participation [53.84175614198885]
In distributed optimization and learning, several machines alternate between local computations in parallel and communication with a distant server.
We propose TAMUNA, the first algorithm for distributed optimization that jointly leverages the two strategies of local training and compression while allowing for partial participation.
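The local-training ingredient can be pictured with a generic local-SGD/FedAvg-style round, sketched below under partial participation; TAMUNA itself additionally uses compression and a different update rule, none of which is reproduced here, so all names and constants are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)

    def local_training_round(x, grads, step_size=0.1, local_steps=5, participation=0.5):
        # Each sampled client runs several local gradient steps before communicating;
        # the server then averages the returned models.
        sampled = [g for g in grads if rng.random() < participation] or grads[:1]
        models = []
        for grad in sampled:
            y = x.copy()
            for _ in range(local_steps):
                y -= step_size * grad(y)
            models.append(y)
        return np.mean(models, axis=0)

    # Toy heterogeneous quadratics f_i(x) = 0.5 * ||x - b_i||^2 on 4 clients.
    b = rng.normal(size=(4, 6))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x = np.zeros(6)
    for _ in range(100):
        x = local_training_round(x, grads)
    print("distance to the average optimum:", np.linalg.norm(x - b.mean(axis=0)))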
arXiv Detail & Related papers (2023-02-20T08:37:44Z)
Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z)
Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation [11.870393751095083]
We study communication compression and aggregation mechanisms for curvature information.
New 3PC mechanisms, such as adaptive thresholding and Bernoulli aggregation, require reduced communication and occasional Hessian computations.
For all our methods, we derive fast condition-number-independent local linear and/or superlinear convergence rates.
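Read from this abstract alone, Bernoulli aggregation means that each node refreshes and uploads its curvature information only occasionally. The sketch below illustrates that reading on toy quadratics: each node recomputes its local Hessian with probability p, the server reuses cached curvature otherwise, and gradients are still exchanged every round. The probability, caching scheme, and objectives are assumptions for illustration, not the paper's 3PC mechanisms.

    import numpy as np

    rng = np.random.default_rng(3)

    def newton_round(x, hessians, grads, H_cache, p=0.2):
        # Each node recomputes and uploads its local Hessian only with probability p;
        # otherwise the server reuses the cached copy. Gradients are sent every round.
        for i, hess in enumerate(hessians):
            if rng.random() < p:
                H_cache[i] = hess(x)      # occasional Hessian computation and upload
        H = H_cache.mean(axis=0)
        g = np.mean([grad(x) for grad in grads], axis=0)
        return x - np.linalg.solve(H, g), H_cache

    # Toy quadratics f_i(x) = 0.5 * (x - b_i)^T A_i (x - b_i) with mildly varying curvature.
    d, n = 5, 4
    A = [np.eye(d) * (1.0 + 0.2 * i) for i in range(n)]
    b = rng.normal(size=(n, d))
    hessians = [lambda x, Ai=Ai: Ai for Ai in A]
    grads = [lambda x, Ai=Ai, bi=bi: Ai @ (x - bi) for Ai, bi in zip(A, b)]
    x, H_cache = np.zeros(d), np.stack([np.eye(d)] * n)
    for _ in range(40):
        x, H_cache = newton_round(x, hessians, grads, H_cache)
    x_star = np.linalg.solve(sum(A), sum(Ai @ bi for Ai, bi in zip(A, b)))
    print("distance to the exact minimizer:", np.linalg.norm(x - x_star))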
arXiv Detail & Related papers (2022-06-07T21:12:21Z)
Contextual Model Aggregation for Fast and Robust Federated Learning in Edge Computing [88.76112371510999]
Federated learning is a prime candidate for distributed machine learning at the network edge.
Existing algorithms suffer from slow convergence and/or a lack of robustness in performance.
We propose a contextual aggregation scheme that achieves the optimal context-dependent bound on loss reduction.
arXiv Detail & Related papers (2022-03-23T21:42:31Z)
Bidirectional compression in heterogeneous settings for distributed or federated learning with partial participation: tight convergence guarantees [9.31522898261934]
Artemis is a framework to tackle the problem of learning in a distributed setting with communication constraints and partial device participation.
It improves on existing algorithms that only consider unidirectional compression (to the server) or rely on very strong assumptions about the compression operator, and that often do not take partial device participation into account.
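The sketch below shows the two compressed channels that "bidirectional" refers to: sampled devices upload sparsified gradients and the server broadcasts a sparsified update. It deliberately omits the memory corrections and the analysis that Artemis contributes, so it should be read as a naive baseline under assumed compressors and step sizes rather than the framework itself.

    import numpy as np

    rng = np.random.default_rng(4)

    def rand_k(v, k):
        # Unbiased rand-k sparsification used on both the uplink and the downlink.
        out = np.zeros_like(v)
        idx = rng.choice(v.size, size=k, replace=False)
        out[idx] = v[idx] * (v.size / k)
        return out

    def bidirectional_round(x, grads, step_size=0.05, k=2, participation=0.5):
        sampled = [g for g in grads if rng.random() < participation] or grads[:1]
        up = np.mean([rand_k(g(x), k) for g in sampled], axis=0)   # devices -> server
        down = rand_k(-step_size * up, k)                          # server -> devices
        return x + down

    # Toy heterogeneous quadratics f_i(x) = 0.5 * ||x - b_i||^2 on 6 devices.
    b = rng.normal(size=(6, 8))
    grads = [lambda x, bi=bi: x - bi for bi in b]
    x = np.zeros(8)
    for _ in range(1000):
        x = bidirectional_round(x, grads)
    # Without memory/control-variate corrections, the iterates only reach a
    # noise-dominated neighborhood of the optimum on heterogeneous data.
    print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))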
arXiv Detail & Related papers (2020-06-25T17:37:45Z)
Federated Learning of a Mixture of Global and Local Models [10.279748604797911]
We propose a new optimization formulation for training federated learning models.
In particular, we are the first to i) show that local steps can improve communication for problems with heterogeneous data, and ii) point out that personalization yields reduced communication complexity.
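One common way to write such a mixture formulation (the exact weighting in the paper may differ) penalizes the distance of each personal model from the average model. The toy sketch below is an illustration under that assumed form, not the paper's method; it shows how the penalty weight interpolates between purely local models and an essentially global one.

    import numpy as np

    def personalized_models(b, lam, iters=2000):
        # Gradient descent on sum_i 0.5*||x_i - b_i||^2 + (lam/2)*sum_i ||x_i - xbar||^2,
        # a simple instance of a mixture-of-global-and-local-models objective.
        n, d = b.shape
        step_size = 1.0 / (1.0 + lam)           # safe step for this quadratic objective
        X = np.zeros((n, d))
        for _ in range(iters):
            xbar = X.mean(axis=0)
            grad = (X - b) + lam * (X - xbar)   # local data term + pull toward the average
            X -= step_size * grad
        return X

    b = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])   # heterogeneous local optima
    for lam in [0.0, 1.0, 100.0]:
        X = personalized_models(b, lam)
        spread = np.linalg.norm(X - X.mean(axis=0))
        print(f"lambda = {lam:5.1f}   spread of the personal models = {spread:.3f}")
    # lam = 0 keeps purely local models; a large lam pushes them toward one global model.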
arXiv Detail & Related papers (2020-02-10T09:17:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.