Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data
- URL: http://arxiv.org/abs/2102.04761v1
- Date: Tue, 9 Feb 2021 11:27:14 GMT
- Title: Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data
- Authors: Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi
- Abstract summary: Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
- Score: 77.88594632644347
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decentralized training of deep learning models is a key element for enabling
data privacy and on-device learning over networks. In realistic learning
scenarios, the presence of heterogeneity across different clients' local
datasets poses an optimization challenge and may severely deteriorate the
generalization performance.
In this paper, we investigate and identify the limitation of several
decentralized optimization algorithms for different degrees of data
heterogeneity. We propose a novel momentum-based method to mitigate this
decentralized training difficulty. We show in extensive empirical experiments
on various CV/NLP datasets (CIFAR-10, ImageNet, AG News, and SST2) and several
network topologies (Ring and Social Network) that our method is much more
robust to the heterogeneity of clients' data than other existing methods, by a
significant improvement in test performance ($1\% \!-\! 20\%$).
Related papers
- NTK-DFL: Enhancing Decentralized Federated Learning in Heterogeneous Settings via Neural Tangent Kernel [27.92271597111756]
Decentralized federated learning (DFL) is a collaborative machine learning framework for training a model across participants without a central server or raw data exchange.
Recent work has shown that the neural tangent kernel (NTK) approach, when applied to federated learning in a centralized framework, can lead to improved performance.
We propose an approach leveraging the NTK to train client models in the decentralized setting, while introducing a synergy between NTK-based evolution and model averaging.
arXiv Detail & Related papers (2024-10-02T18:19:28Z)
- Efficient Cluster Selection for Personalized Federated Learning: A Multi-Armed Bandit Approach [2.5477011559292175]
Federated learning (FL) offers a decentralized training approach for machine learning models, prioritizing data privacy.
In this paper, we introduce a dynamic Upper Confidence Bound (dUCB) algorithm inspired by the multi-armed bandit (MAB) approach.
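The summary only names the bandit view, so the sketch below uses a plain UCB index as an illustration: a client treats candidate clusters as arms and joins the one with the best optimism-adjusted reward, with observed validation accuracy as the reward signal. The reward model and the `select_cluster` helper are assumptions; the paper's dynamic UCB rule may differ.

```python
import math
import random

def select_cluster(counts, rewards, t, c=1.4):
    """Pick a cluster index via a UCB rule: exploit high average reward,
    explore clusters that have been tried only a few times."""
    for k in range(len(counts)):           # try each cluster once first
        if counts[k] == 0:
            return k
    ucb = [rewards[k] / counts[k] + c * math.sqrt(math.log(t) / counts[k])
           for k in range(len(counts))]
    return max(range(len(counts)), key=lambda k: ucb[k])

# Toy simulation: 3 candidate clusters; cluster 2 matches this client's data best.
true_quality = [0.55, 0.60, 0.80]          # expected validation accuracy per cluster
counts, rewards = [0, 0, 0], [0.0, 0.0, 0.0]
random.seed(0)
for t in range(1, 201):
    k = select_cluster(counts, rewards, t)
    reward = random.gauss(true_quality[k], 0.05)   # noisy accuracy after joining cluster k
    counts[k] += 1
    rewards[k] += reward

print("times each cluster was chosen:", counts)    # cluster 2 should dominate
```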
arXiv Detail & Related papers (2023-10-29T16:46:50Z)
- Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data [8.946847190099206]
We present a novel approach for decentralized learning on heterogeneous data.
Cross-features for a pair of neighboring agents are the features obtained from the data of an agent with respect to the model parameters of the other agent.
Our experiments show that the proposed method achieves superior performance (0.2-4% improvement in test accuracy) compared to other existing techniques for decentralized learning on heterogeneous data.
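To make the cross-feature idea concrete, the sketch below passes one agent's local batch through a neighboring agent's encoder (the "cross-features") and applies a simple InfoNCE-style contrastive term that pulls each sample's two feature views together. The linear encoders and the exact loss form are illustrative assumptions rather than the paper's loss.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_feat, batch = 20, 8, 16

# Linear "encoders" standing in for the feature extractors of two neighboring agents.
theta_i = rng.normal(size=(d_in, d_feat))   # agent i's model parameters
theta_j = rng.normal(size=(d_in, d_feat))   # neighboring agent j's model parameters
x_i = rng.normal(size=(batch, d_in))        # a local batch of agent i's data

def features(x, theta):
    return np.tanh(x @ theta)

# Cross-features: agent i's data passed through agent j's parameters.
z_self = features(x_i, theta_i)
z_cross = features(x_i, theta_j)

def contrastive_loss(a, b, temperature=0.5):
    """Each sample's self-feature should be closest to its own cross-feature
    among all cross-features in the batch (positives on the diagonal)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                  # pairwise cosine similarities
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

print("cross-feature contrastive loss:", contrastive_loss(z_self, z_cross))
```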
arXiv Detail & Related papers (2023-10-24T14:48:23Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
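As a rough illustration of client-specific adaptive learning rates, the sketch below keeps a standard AMSGrad state separately on each client, so the effective step size is auto-tuned by that client's own gradient statistics; the models are then averaged FedAvg-style. The helper name `amsgrad_step`, the toy objectives, and the averaging schedule are assumptions; FedLALR's actual scheduling rule is not reproduced here.

```python
import numpy as np

def amsgrad_step(x, g, state, eta=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One AMSGrad update. `state` holds (m, v, v_hat) and is kept per client,
    so the effective learning rate eta / (sqrt(v_hat) + eps) adapts to that
    client's own (possibly non-IID) gradient scale."""
    m, v, v_hat = state
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    v_hat = np.maximum(v_hat, v)          # the "max" distinguishes AMSGrad from Adam
    x = x - eta * m / (np.sqrt(v_hat) + eps)
    return x, (m, v, v_hat)

rng = np.random.default_rng(2)
d, clients, local_steps = 5, 4, 50
targets = rng.normal(size=(clients, d)) * 2.0        # heterogeneous local optima
x = np.zeros((clients, d))
states = [(np.zeros(d), np.zeros(d), np.zeros(d)) for _ in range(clients)]

for _ in range(local_steps):
    for i in range(clients):
        g = (x[i] - targets[i]) + 0.05 * rng.normal(size=d)   # noisy local gradient
        x[i], states[i] = amsgrad_step(x[i], g, states[i])

x_global = x.mean(axis=0)      # periodic server averaging, as in FedAvg-style methods
print("averaged model:", np.round(x_global, 3))
```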
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Global Update Tracking: A Decentralized Learning Algorithm for Heterogeneous Data [14.386062807300666]
In this paper, we focus on designing a decentralized learning algorithm that is less susceptible to variations in data distribution across devices.
We propose Global Update Tracking (GUT), a novel tracking-based method that aims to mitigate the impact of heterogeneous data in decentralized learning without introducing any communication overhead.
Our experiments show that the proposed method achieves state-of-the-art performance for decentralized learning on heterogeneous data via a 1-6% improvement in test accuracy compared to other existing techniques.
arXiv Detail & Related papers (2023-05-08T15:48:53Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
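The summary leaves the mechanism implicit; below is a minimal sketch of local mixup in its usual form, i.e., each client blends pairs of its own samples and labels with a Beta-distributed coefficient before local training. This is an assumed reading of "local mixup", not the paper's full distributionally robust framework.

```python
import numpy as np

def local_mixup(x, y, alpha=0.4, rng=None):
    """Standard mixup applied within one client's local batch:
    x_tilde = lam * x + (1 - lam) * x[perm], same convex combination for labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix

rng = np.random.default_rng(3)
x = rng.normal(size=(8, 4))                       # a client's local batch of 8 samples
y = np.eye(3)[rng.integers(0, 3, size=8)]         # one-hot labels for 3 classes
x_mix, y_mix = local_mixup(x, y, rng=rng)
print("mixed label of the first sample:", np.round(y_mix[0], 2))
```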
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Decentralized federated learning of deep neural networks on non-iid data [0.6335848702857039]
We tackle the non-convex problem of learning a personalized deep learning model in a decentralized setting.
We propose a method named Performance-Based Neighbor Selection (PENS) where clients with similar data detect each other and cooperate.
PENS is able to achieve higher accuracies as compared to strong baselines.
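One plausible reading of performance-based neighbor selection is sketched below: a client evaluates models received from random peers on its own data and keeps as collaborators the peers whose models score better than average. The thresholding rule and helper names are assumptions for illustration, not the exact PENS procedure.

```python
import numpy as np

rng = np.random.default_rng(4)

def local_loss(model, data):
    """Loss of a peer's model on this client's local data (toy quadratic)."""
    return float(np.mean((data - model) ** 2))

# Toy setup: 6 clients in two latent data groups; clients in the same group
# have nearby local data means and therefore similar "good" models.
group_means = np.array([[0.0, 0.0], [5.0, 5.0]])
membership = [0, 0, 0, 1, 1, 1]
data = [group_means[g] + 0.3 * rng.normal(size=2) for g in membership]
models = [d.copy() for d in data]        # pretend each client trained on its own data

def select_neighbors(i, peer_ids):
    """Keep peers whose models perform better than average on client i's data."""
    losses = {j: local_loss(models[j], data[i]) for j in peer_ids}
    avg = np.mean(list(losses.values()))
    return [j for j, l in losses.items() if l <= avg]

peers_of_0 = select_neighbors(0, [1, 2, 3, 4, 5])
print("client 0 cooperates with:", peers_of_0)   # expected: the other group-0 clients
```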
arXiv Detail & Related papers (2021-07-18T19:05:44Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local-updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
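The alternating scheme described here, many cheap updates of a low-dimensional local head per update of the shared representation, can be sketched with linear models as below; the split into a shared matrix B and per-client heads w_i, the loop counts, and the synthetic data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
d, k, clients, n_per = 20, 3, 5, 100    # ambient dim, representation dim, clients, samples

B_true = rng.normal(size=(d, k)) / np.sqrt(d)    # ground-truth shared representation
W_true = rng.normal(size=(clients, k))           # ground-truth local heads
data = []
for i in range(clients):
    X = rng.normal(size=(n_per, d))
    y = X @ B_true @ W_true[i] + 0.01 * rng.normal(size=n_per)
    data.append((X, y))

B = rng.normal(size=(d, k)) / np.sqrt(d)         # shared representation (to be averaged)
W = np.zeros((clients, k))                       # one low-dimensional head per client

for _round in range(50):
    B_updates = []
    for i, (X, y) in enumerate(data):
        # Many local head updates (cheap: only k parameters) with B frozen.
        for _ in range(10):
            grad_w = (X @ B).T @ (X @ B @ W[i] - y) / n_per
            W[i] -= 0.1 * grad_w
        # One update of the shared representation with the head frozen.
        grad_B = X.T @ np.outer(X @ B @ W[i] - y, W[i]) / n_per
        B_updates.append(B - 0.1 * grad_B)
    B = np.mean(B_updates, axis=0)               # server averages the representations

X0, y0 = data[0]
print("client 0 training error:", float(np.mean((X0 @ B @ W[0] - y0) ** 2)))
```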
arXiv Detail & Related papers (2021-02-14T05:36:25Z)