From Local SGD to Local Fixed-Point Methods for Federated Learning
- URL: http://arxiv.org/abs/2004.01442v2
- Date: Tue, 16 Jun 2020 17:42:08 GMT
- Title: From Local SGD to Local Fixed-Point Methods for Federated Learning
- Authors: Grigory Malinovsky, Dmitry Kovalev, Elnur Gasanov, Laurent Condat,
Peter Richtárik
- Abstract summary: We consider the generic problem of finding a fixed point of an average of operators, or an approximation thereof, in a distributed setting.
We investigate two strategies to achieve such a consensus: one based on a fixed number of local steps, and the other based on randomized computations.
- Score: 17.04886864943171
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most algorithms for solving optimization problems or finding saddle points of
convex-concave functions are fixed-point algorithms. In this work we consider
the generic problem of finding a fixed point of an average of operators, or an
approximation thereof, in a distributed setting. Our work is motivated by the
needs of federated learning. In this context, each local operator models the
computations done locally on a mobile device. We investigate two strategies to
achieve such a consensus: one based on a fixed number of local steps, and the
other based on randomized computations. In both cases, the goal is to limit
communication of the locally-computed variables, which is often the bottleneck
in distributed frameworks. We perform convergence analysis of both methods and
conduct a number of experiments highlighting the benefits of our approach.
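To make the first strategy concrete, here is a minimal sketch of a local fixed-point method with a fixed number of local steps: each device repeatedly applies its own operator, and the server averages the iterates only every `tau` steps to limit communication. The quadratic operators, the parameter values, and all names below are illustrative assumptions, not the paper's actual algorithm or experiments; local SGD is recovered as the special case T_i(x) = x - gamma * grad f_i(x). The second strategy (randomized computations) is not shown.

```python
import numpy as np

# Sketch: find (approximately) a fixed point of the average of n_devices
# contractive operators, communicating only once every tau local steps.
rng = np.random.default_rng(0)
d, n_devices, tau, rounds = 5, 4, 10, 50

# Illustrative local operators T_i(x) = A_i x + b_i with spectral norm
# ||A_i|| = 0.5 < 1, so each T_i (and their average) is a contraction.
ops = []
for _ in range(n_devices):
    M = rng.normal(size=(d, d))
    A = 0.5 * M / np.linalg.norm(M, 2)
    b = rng.normal(size=d)
    ops.append((A, b))

def T(i, x):
    A, b = ops[i]
    return A @ x + b

x = np.zeros(d)
for r in range(rounds):
    # Local phase: each device runs tau fixed-point steps from the
    # shared iterate without communicating.
    local_iterates = []
    for i in range(n_devices):
        z = x.copy()
        for _ in range(tau):
            z = T(i, z)
        local_iterates.append(z)
    # Communication phase: average the locally computed variables.
    x = np.mean(local_iterates, axis=0)

# Residual w.r.t. the mean operator: x ends up near the fixed point of
# x -> mean_i T_i(x), up to a bias induced by taking multiple local
# steps between averaging rounds (the trade-off the paper analyzes).
residual = np.linalg.norm(
    np.mean([T(i, x) for i in range(n_devices)], axis=0) - x
)
print(f"fixed-point residual after {rounds} rounds: {residual:.2e}")
```

With `tau = 1` the scheme averages after every application of the operators and drives the residual to zero; larger `tau` trades accuracy for fewer communication rounds.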
Related papers
- ML-based identification of the interface regions for coupling local and nonlocal models [0.0]
Local-nonlocal coupling approaches combine the computational efficiency of local models and the accuracy of nonlocal models.
This study introduces a machine learning-based approach to automatically detect the regions in which the local and nonlocal models should be used.
arXiv Detail & Related papers (2024-04-23T14:19:36Z)
- A Geometric Approach to $k$-means [17.933546927589685]
We propose a general algorithmic framework for escaping undesirable local solutions.
We discuss implementation of these steps, and elucidate how the proposed framework unifies variants of $k$-means algorithm.
arXiv Detail & Related papers (2022-01-13T07:47:50Z) - On Second-order Optimization Methods for Federated Learning [59.787198516188425]
We evaluate the performance of several second-order distributed methods with local steps in the federated learning setting.
We propose a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
arXiv Detail & Related papers (2021-09-06T12:04:08Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We establish lower complexity bounds for this setting and design two optimal algorithms that attain them.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
- Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning [46.32232395989181]
We aim at local learning that is efficient in terms of communication and computational complexity.
Federated Learning is one of the important scenarios in distributed learning.
arXiv Detail & Related papers (2021-02-05T14:32:28Z)
- Local SGD: Unified Theory and New Efficient Methods [8.701566919381223]
We present a unified framework for analyzing local SGD methods in the convex and strongly convex regimes.
We develop the first linearly converging local SGD method which does not require any data homogeneity or other strong assumptions.
arXiv Detail & Related papers (2020-11-03T13:02:50Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms [1.52292571922932]
We study local stochastic approximation over a network of agents, where their goal is to find the root of an operator composed of the local operators at the agents.
Our focus is to characterize the finite-time performance of this method when the data at each agent are generated from Markov processes, and hence they are dependent.
arXiv Detail & Related papers (2020-06-24T04:05:11Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
- Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization [64.26238893241322]
Simple algorithms have been shown to lead to good empirical results in many contexts.
Several works have pursued rigorous analytical justification for studying nonconvex optimization problems.
A key insight in these analyses is that perturbations play a critical role in allowing local descent algorithms to escape undesirable stationary points such as saddle points.
arXiv Detail & Related papers (2020-03-31T16:54:22Z)
- A General Method for Robust Learning from Batches [56.59844655107251]
We consider a general framework of robust learning from batches, and determine the limits of both classification and distribution estimation over arbitrary, including continuous, domains.
We derive the first robust, computationally efficient learning algorithms for piecewise-interval classification, and for piecewise-polynomial, monotone, log-concave, and Gaussian-mixture distribution estimation.
arXiv Detail & Related papers (2020-02-25T18:53:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.