Averaging Rate Scheduler for Decentralized Learning on Heterogeneous Data
- URL: http://arxiv.org/abs/2403.03292v1
- Date: Tue, 5 Mar 2024 19:47:51 GMT
- Title: Averaging Rate Scheduler for Decentralized Learning on Heterogeneous Data
- Authors: Sai Aparna Aketi, Sakshi Choudhary, Kaushik Roy
- Abstract summary: State-of-the-art decentralized learning algorithms typically require the data distribution to be Independent and Identically Distributed (IID).
We propose averaging rate scheduling as a simple yet effective way to reduce the impact of heterogeneity in decentralized learning.
- Score: 7.573297026523597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art decentralized learning algorithms typically require the data
distribution to be Independent and Identically Distributed (IID). However, in
practical scenarios, the data distribution across the agents can have
significant heterogeneity. In this work, we propose averaging rate scheduling
as a simple yet effective way to reduce the impact of heterogeneity in
decentralized learning. Our experiments illustrate the superiority of the
proposed method (~3% improvement in test accuracy) compared to the conventional
approach of employing a constant averaging rate.
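In the usual decentralized setup, each agent interleaves local SGD steps with gossip averaging, mixing its parameters with its neighbors' at an averaging rate gamma; the proposal is to schedule gamma over training rather than hold it constant. The abstract does not state the schedule used, so the following is a minimal sketch assuming a hypothetical cosine schedule; `gamma_min`, `gamma_max`, and the mixing-weight layout are illustrative assumptions, not the paper's exact method.

```python
import math
import torch

def averaging_rate(step, total_steps, gamma_min=0.1, gamma_max=1.0):
    # Hypothetical cosine schedule: gamma decays from gamma_max to gamma_min.
    # The paper's actual schedule is not specified in the abstract.
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return gamma_min + (gamma_max - gamma_min) * cosine

def gossip_step(params, neighbor_params, mixing_weights, gamma):
    # One gossip-averaging step with a scheduled averaging rate:
    #   x_i <- x_i + gamma * sum_j w_ij * (x_j - x_i)
    # params:          list of this agent's parameter tensors
    # neighbor_params: {neighbor_id: list of that neighbor's tensors}
    # mixing_weights:  {neighbor_id: w_ij}, one row of a doubly stochastic matrix
    with torch.no_grad():
        for k, p in enumerate(params):
            consensus = sum(w * (neighbor_params[j][k] - p)
                            for j, w in mixing_weights.items())
            p.add_(gamma * consensus)
```

With a constant `gamma`, `gossip_step` reduces to the conventional fixed-averaging-rate baseline that the abstract compares against.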
Related papers
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over meta-analysis-based methods as heterogeneity increases.
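For context, here is a minimal sketch of the classical single-dataset inverse propensity score estimator that such collaborative methods build on; the paper's multi-site collaborative weighting is not reproduced here.

```python
import numpy as np

def ipw_ate(y, t, e):
    # Classical inverse propensity score (IPW) estimate of the average
    # treatment effect from outcomes y, binary treatments t, and estimated
    # propensities e = P(T = 1 | X). The collaborative estimator refines
    # this building block; only the standard baseline is shown.
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    treated = np.mean(t * y / e)
    control = np.mean((1.0 - t) * y / (1.0 - e))
    return treated - control
```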
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Distribution-Free Fair Federated Learning with Small Samples [54.63321245634712]
FedFaiREE is a post-processing algorithm developed specifically for distribution-free fair learning in decentralized settings with small samples.
We provide rigorous theoretical guarantees for both fairness and accuracy, and our experimental results further provide robust empirical validation for our proposed method.
arXiv Detail & Related papers (2024-02-25T17:37:53Z) - Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm [80.94861441583275]
We investigate the generalization bound of the decentralized stochastic gradient descent ascent (D-SGDA) algorithm.
Our results analyze the impact of different factors on the generalization of D-SGDA.
We also balance stability against generalization to characterize the optimal trade-off in the convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z) - Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data [8.946847190099206]
We present a novel approach for decentralized learning on heterogeneous data.
Cross-features for a pair of neighboring agents are the features obtained from the data of an agent with respect to the model parameters of the other agent.
Our experiments show that the proposed method achieves superior performance (0.2-4% improvement in test accuracy) compared to other existing techniques for decentralized learning on heterogeneous data.
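As an illustration of the cross-feature idea, here is a minimal InfoNCE-style sketch: agent i's local features are contrasted against the cross-features produced by running the same data through a neighbor j's model copy. The pairing and loss details are assumptions, since the summary above does not specify the exact loss.

```python
import torch
import torch.nn.functional as F

def cross_feature_contrastive_loss(model_i, model_j, batch_i, temperature=0.5):
    # Features of agent i's data under its own model, contrasted against the
    # "cross-features": the same data passed through neighbor j's model copy.
    # The InfoNCE-style pairing below is an illustrative assumption.
    z_local = F.normalize(model_i(batch_i), dim=1)
    with torch.no_grad():                      # the neighbor's model is not updated here
        z_cross = F.normalize(model_j(batch_i), dim=1)
    logits = z_local @ z_cross.t() / temperature
    targets = torch.arange(batch_i.size(0), device=batch_i.device)
    return F.cross_entropy(logits, targets)    # matching samples act as positives
```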
arXiv Detail & Related papers (2023-10-24T14:48:23Z) - Exact Subspace Diffusion for Decentralized Multitask Learning [17.592204922442832]
Distributed strategies for multitask learning induce relationships between agents in a more nuanced manner, and encourage collaboration without enforcing consensus.
We develop a generalization of the exact diffusion algorithm for subspace constrained multitask learning over networks, and derive an accurate expression for its mean-squared deviation.
We verify numerically the accuracy of the predicted performance expressions, as well as the improved performance of the proposed approach over alternatives based on approximate projections.
arXiv Detail & Related papers (2023-04-14T19:42:19Z) - An Experimental Study of Data Heterogeneity in Federated Learning Methods for Medical Imaging [8.984706828657814]
Federated learning enables multiple institutions to collaboratively train machine learning models on their local data in a privacy-preserving way.
We investigate the deleterious impact of a taxonomy of data heterogeneity regimes on federated learning methods, namely quantity skew, label distribution skew, and imaging acquisition skew.
We present several mitigation strategies to overcome the resulting performance drops, including weighted averaging for data quantity skew, and weighted loss and batch normalization averaging for label distribution skew; a sketch of the first follows.
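The weighted-averaging mitigation for quantity skew can be sketched as a FedAvg-style rule; everything beyond the weighting itself (e.g., how batch-normalization statistics are handled) is elided here.

```python
import torch

def weighted_average(state_dicts, num_samples):
    # FedAvg-style mitigation for data-quantity skew: weight each client's
    # parameters by its share of the total training data.
    total = float(sum(num_samples))
    weights = [n / total for n in num_samples]
    return {
        key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
        for key in state_dicts[0]
    }
```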
arXiv Detail & Related papers (2021-07-18T05:47:48Z) - Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains whose problem data are heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
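For reference, here is the classical single-agent extra-gradient step that the decentralized, local variant builds on; the gossip communication and local-step logic of the paper are omitted.

```python
import numpy as np

def extragradient_step(x, operator, eta=0.1):
    # One classical extra-gradient step for a variational inequality with
    # operator F: extrapolate with F(x), then update using the operator
    # evaluated at the look-ahead point.
    x_half = x - eta * operator(x)        # look-ahead (extrapolation) step
    return x - eta * operator(x_half)     # update from the look-ahead point

# Example: bilinear saddle problem min_x max_y x*y, where F([x, y]) = [y, -x].
# Plain gradient descent-ascent diverges here; extra-gradient converges to (0, 0).
F = lambda z: np.array([z[1], -z[0]])
z = np.array([1.0, 1.0])
for _ in range(1000):
    z = extragradient_step(z, F)
```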
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Distributionally Robust Learning in Heterogeneous Contexts [29.60681287631439]
We consider the problem of learning from training data obtained in different contexts, where the test data is subject to distributional shifts.
We develop a distributionally robust method that focuses on excess risks and achieves a more appropriate trade-off between performance and robustness than the conventional and overly conservative minimax approach.
arXiv Detail & Related papers (2021-05-18T14:00:34Z) - Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z) - Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.