Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data
- URL: http://arxiv.org/abs/2102.04761v1
- Date: Tue, 9 Feb 2021 11:27:14 GMT
- Title: Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data
- Authors: Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi
- Abstract summary: Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
- Score: 77.88594632644347
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decentralized training of deep learning models is a key element for enabling
data privacy and on-device learning over networks. In realistic learning
scenarios, the presence of heterogeneity across different clients' local
datasets poses an optimization challenge and may severely deteriorate the
generalization performance.
In this paper, we investigate and identify the limitation of several
decentralized optimization algorithms for different degrees of data
heterogeneity. We propose a novel momentum-based method to mitigate this
decentralized training difficulty. We show in extensive empirical experiments
on various CV/NLP datasets (CIFAR-10, ImageNet, AG News, and SST2) and several
network topologies (Ring and Social Network) that our method is much more
robust to the heterogeneity of clients' data than other existing methods, by a
significant improvement in test performance ($1\% \!-\! 20\%$).
Related papers
- NTK-DFL: Enhancing Decentralized Federated Learning in Heterogeneous Settings via Neural Tangent Kernel [27.92271597111756]
Decentralized federated learning (DFL) is a collaborative machine learning framework for training a model across participants without a central server or raw data exchange.
Recent work has shown that the neural tangent kernel (NTK) approach, when applied to federated learning in a centralized framework, can lead to improved performance.
We propose an approach leveraging the NTK to train client models in the decentralized setting, while introducing a synergy between NTK-based evolution and model averaging.
arXiv Detail & Related papers (2024-10-02T18:19:28Z)
- Efficient Cluster Selection for Personalized Federated Learning: A Multi-Armed Bandit Approach [2.5477011559292175]
Federated learning (FL) offers a decentralized training approach for machine learning models, prioritizing data privacy.
In this paper, we introduce a dynamic Upper Confidence Bound (dUCB) algorithm inspired by the multi-armed bandit (MAB) approach.
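The summary only names the bandit view, so the sketch below uses a plain UCB index as an illustration: a client treats candidate clusters as arms and joins the one with the best optimism-adjusted reward, with observed validation accuracy as the reward signal. The reward model and the `select_cluster` helper are assumptions; the paper's dynamic UCB rule may differ.

```python
import math
import random

def select_cluster(counts, rewards, t, c=1.4):
    """Pick a cluster index via a UCB rule: exploit high average reward,
    explore clusters that have been tried only a few times."""
    for k in range(len(counts)):           # try each cluster once first
        if counts[k] == 0:
            return k
    ucb = [rewards[k] / counts[k] + c * math.sqrt(math.log(t) / counts[k])
           for k in range(len(counts))]
    return max(range(len(counts)), key=lambda k: ucb[k])

# Toy simulation: 3 candidate clusters; cluster 2 matches this client's data best.
true_quality = [0.55, 0.60, 0.80]          # expected validation accuracy per cluster
counts, rewards = [0, 0, 0], [0.0, 0.0, 0.0]
random.seed(0)
for t in range(1, 201):
    k = select_cluster(counts, rewards, t)
    reward = random.gauss(true_quality[k], 0.05)   # noisy accuracy after joining cluster k
    counts[k] += 1
    rewards[k] += reward

print("times each cluster was chosen:", counts)    # cluster 2 should dominate
```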
arXiv Detail & Related papers (2023-10-29T16:46:50Z)
- Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data [8.946847190099206]
We present a novel approach for decentralized learning on heterogeneous data.
Cross-features for a pair of neighboring agents are the features obtained from the data of an agent with respect to the model parameters of the other agent.
Our experiments show that the proposed method achieves superior performance (0.2-4% improvement in test accuracy) compared to other existing techniques for decentralized learning on heterogeneous data.
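To make the cross-feature idea concrete, the sketch below passes one agent's local batch through a neighboring agent's encoder (the "cross-features") and applies a simple InfoNCE-style contrastive term that pulls each sample's two feature views together. The linear encoders and the exact loss form are illustrative assumptions rather than the paper's loss.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_feat, batch = 20, 8, 16

# Linear "encoders" standing in for the feature extractors of two neighboring agents.
theta_i = rng.normal(size=(d_in, d_feat))   # agent i's model parameters
theta_j = rng.normal(size=(d_in, d_feat))   # neighboring agent j's model parameters
x_i = rng.normal(size=(batch, d_in))        # a local batch of agent i's data

def features(x, theta):
    return np.tanh(x @ theta)

# Cross-features: agent i's data passed through agent j's parameters.
z_self = features(x_i, theta_i)
z_cross = features(x_i, theta_j)

def contrastive_loss(a, b, temperature=0.5):
    """Each sample's self-feature should be closest to its own cross-feature
    among all cross-features in the batch (positives on the diagonal)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                  # pairwise cosine similarities
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

print("cross-feature contrastive loss:", contrastive_loss(z_self, z_cross))
```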
arXiv Detail & Related papers (2023-10-24T14:48:23Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
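As a rough illustration of client-specific adaptive learning rates, the sketch below keeps a standard AMSGrad state separately on each client, so the effective step size is auto-tuned by that client's own gradient statistics; the models are then averaged FedAvg-style. The helper name `amsgrad_step`, the toy objectives, and the averaging schedule are assumptions; FedLALR's actual scheduling rule is not reproduced here.

```python
import numpy as np

def amsgrad_step(x, g, state, eta=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One AMSGrad update. `state` holds (m, v, v_hat) and is kept per client,
    so the effective learning rate eta / (sqrt(v_hat) + eps) adapts to that
    client's own (possibly non-IID) gradient scale."""
    m, v, v_hat = state
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    v_hat = np.maximum(v_hat, v)          # the "max" distinguishes AMSGrad from Adam
    x = x - eta * m / (np.sqrt(v_hat) + eps)
    return x, (m, v, v_hat)

rng = np.random.default_rng(2)
d, clients, local_steps = 5, 4, 50
targets = rng.normal(size=(clients, d)) * 2.0        # heterogeneous local optima
x = np.zeros((clients, d))
states = [(np.zeros(d), np.zeros(d), np.zeros(d)) for _ in range(clients)]

for _ in range(local_steps):
    for i in range(clients):
        g = (x[i] - targets[i]) + 0.05 * rng.normal(size=d)   # noisy local gradient
        x[i], states[i] = amsgrad_step(x[i], g, states[i])

x_global = x.mean(axis=0)      # periodic server averaging, as in FedAvg-style methods
print("averaged model:", np.round(x_global, 3))
```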
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Global Update Tracking: A Decentralized Learning Algorithm for Heterogeneous Data [14.386062807300666]
In this paper, we focus on designing a decentralized learning algorithm that is less susceptible to variations in data distribution across devices.
We propose Global Update Tracking (GUT), a novel tracking-based method that aims to mitigate the impact of heterogeneous data in decentralized learning without introducing any communication overhead.
Our experiments show that the proposed method achieves state-of-the-art performance for decentralized learning on heterogeneous data via a 1-6% improvement in test accuracy compared to other existing techniques.
arXiv Detail & Related papers (2023-05-08T15:48:53Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
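The summary leaves the mechanism implicit; below is a minimal sketch of local mixup in its usual form, i.e., each client blends pairs of its own samples and labels with a Beta-distributed coefficient before local training. This is an assumed reading of "local mixup", not the paper's full distributionally robust framework.

```python
import numpy as np

def local_mixup(x, y, alpha=0.4, rng=None):
    """Standard mixup applied within one client's local batch:
    x_tilde = lam * x + (1 - lam) * x[perm], same convex combination for labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix

rng = np.random.default_rng(3)
x = rng.normal(size=(8, 4))                       # a client's local batch of 8 samples
y = np.eye(3)[rng.integers(0, 3, size=8)]         # one-hot labels for 3 classes
x_mix, y_mix = local_mixup(x, y, rng=rng)
print("mixed label of the first sample:", np.round(y_mix[0], 2))
```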
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Decentralized federated learning of deep neural networks on non-iid data [0.6335848702857039]
We tackle the non-convex problem of learning a personalized deep learning model in a decentralized setting.
We propose a method named Performance-Based Neighbor Selection (PENS) where clients with similar data detect each other and cooperate.
PENS is able to achieve higher accuracies as compared to strong baselines.
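One plausible reading of performance-based neighbor selection is sketched below: a client evaluates models received from random peers on its own data and keeps as collaborators the peers whose models score better than average. The thresholding rule and helper names are assumptions for illustration, not the exact PENS procedure.

```python
import numpy as np

rng = np.random.default_rng(4)

def local_loss(model, data):
    """Loss of a peer's model on this client's local data (toy quadratic)."""
    return float(np.mean((data - model) ** 2))

# Toy setup: 6 clients in two latent data groups; clients in the same group
# have nearby local data means and therefore similar "good" models.
group_means = np.array([[0.0, 0.0], [5.0, 5.0]])
membership = [0, 0, 0, 1, 1, 1]
data = [group_means[g] + 0.3 * rng.normal(size=2) for g in membership]
models = [d.copy() for d in data]        # pretend each client trained on its own data

def select_neighbors(i, peer_ids):
    """Keep peers whose models perform better than average on client i's data."""
    losses = {j: local_loss(models[j], data[i]) for j in peer_ids}
    avg = np.mean(list(losses.values()))
    return [j for j, l in losses.items() if l <= avg]

peers_of_0 = select_neighbors(0, [1, 2, 3, 4, 5])
print("client 0 cooperates with:", peers_of_0)   # expected: the other group-0 clients
```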
arXiv Detail & Related papers (2021-07-18T19:05:44Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local-updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
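The alternating scheme described here, many cheap updates of a low-dimensional local head per update of the shared representation, can be sketched with linear models as below; the split into a shared matrix B and per-client heads w_i, the loop counts, and the synthetic data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
d, k, clients, n_per = 20, 3, 5, 100    # ambient dim, representation dim, clients, samples

B_true = rng.normal(size=(d, k)) / np.sqrt(d)    # ground-truth shared representation
W_true = rng.normal(size=(clients, k))           # ground-truth local heads
data = []
for i in range(clients):
    X = rng.normal(size=(n_per, d))
    y = X @ B_true @ W_true[i] + 0.01 * rng.normal(size=n_per)
    data.append((X, y))

B = rng.normal(size=(d, k)) / np.sqrt(d)         # shared representation (to be averaged)
W = np.zeros((clients, k))                       # one low-dimensional head per client

for _round in range(50):
    B_updates = []
    for i, (X, y) in enumerate(data):
        # Many local head updates (cheap: only k parameters) with B frozen.
        for _ in range(10):
            grad_w = (X @ B).T @ (X @ B @ W[i] - y) / n_per
            W[i] -= 0.1 * grad_w
        # One update of the shared representation with the head frozen.
        grad_B = X.T @ np.outer(X @ B @ W[i] - y, W[i]) / n_per
        B_updates.append(B - 0.1 * grad_B)
    B = np.mean(B_updates, axis=0)               # server averages the representations

X0, y0 = data[0]
print("client 0 training error:", float(np.mean((X0 @ B @ W[0] - y0) ** 2)))
```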
arXiv Detail & Related papers (2021-02-14T05:36:25Z)