Cross-Gradient Aggregation for Decentralized Learning from Non-IID data
- URL: http://arxiv.org/abs/2103.02051v1
- Date: Tue, 2 Mar 2021 21:58:12 GMT
- Title: Cross-Gradient Aggregation for Decentralized Learning from Non-IID data
- Authors: Yasaman Esfandiari, Sin Yong Tan, Zhanhong Jiang, Aditya Balu, Ethan
Herron, Chinmay Hegde, Soumik Sarkar
- Abstract summary: Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server.
We propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm.
We show superior learning performance of CGA over existing state-of-the-art decentralized learning algorithms.
- Score: 34.23789472226752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decentralized learning enables a group of collaborative agents to learn
models using a distributed dataset without the need for a central parameter
server. Recently, decentralized learning algorithms have demonstrated
state-of-the-art results on benchmark data sets, comparable with centralized
algorithms. However, the key assumption to achieve competitive performance is
that the data is independently and identically distributed (IID) among the
agents which, in real-life applications, is often not applicable. Inspired by
ideas from continual learning, we propose Cross-Gradient Aggregation (CGA), a
novel decentralized learning algorithm where (i) each agent aggregates
cross-gradient information, i.e., derivatives of its model with respect to its
neighbors' datasets, and (ii) updates its model using a projected gradient
based on quadratic programming (QP). We theoretically analyze the convergence
characteristics of CGA and demonstrate its efficiency on non-IID data
distributions sampled from the MNIST and CIFAR-10 datasets. Our empirical
comparisons show superior learning performance of CGA over existing
state-of-the-art decentralized learning algorithms, and show that this
improved performance is maintained under information compression to reduce
peer-to-peer communication overhead.
Related papers
- Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an accelerated decentralized federated learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-IID data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z)
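The DFedCata summary above names Nesterov's extrapolation step as the component that accelerates aggregation. As a purely illustrative sketch (the paper's exact placement of the step and its coefficients may differ), extrapolation pushes the neighborhood-averaged parameters along the direction of recent progress:

```python
import numpy as np

def nesterov_extrapolate(x_curr, x_prev, gamma=0.9):
    """Extrapolate along recent progress: y = x_t + gamma * (x_t - x_{t-1})."""
    return x_curr + gamma * (x_curr - x_prev)

# Hypothetical usage: after an aggregation round, a client extrapolates its
# neighborhood average before the next local training phase.
x_prev = np.zeros(4)                        # last round's aggregated model
x_curr = np.array([1.0, 0.5, -0.2, 0.3])    # this round's aggregated model
print(nesterov_extrapolate(x_curr, x_prev))
```

Loosely, the Moreau-envelope component corresponds to optimizing each local objective with a proximal term (1/2λ)||w - x||², which pulls clients toward a common anchor and damps parameter inconsistency.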
- DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models [21.85879890198875]
Decentralized Iterative Merging-And-Training (DIMAT) is a novel decentralized deep learning algorithm.
We show that DIMAT attains a faster and higher initial gain in accuracy on both independent and identically distributed (IID) and non-IID data, while incurring lower communication overhead.
The DIMAT paradigm presents a new opportunity for future decentralized learning, enhancing its adaptability to real-world applications through sparse, lightweight communication and computation.
arXiv Detail & Related papers (2024-04-11T18:34:29Z)
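The DIMAT entry describes an iterative merging-and-training loop. The schematic below only illustrates that control flow, with plain parameter averaging standing in for DIMAT's actual merging operator, which the summary does not specify:

```python
import numpy as np

def local_train(w, grad_fn, steps=5, lr=0.1):
    """Plain gradient descent as a stand-in for each agent's local training."""
    for _ in range(steps):
        w = w - lr * grad_fn(w)
    return w

def merge(params):
    """Placeholder merging operator: parameter averaging."""
    return np.mean(params, axis=0)

# Two agents with shifted quadratic objectives as a non-IID surrogate.
grad_fns = [lambda w: 2 * (w - 1.0), lambda w: 2 * (w + 1.0)]
models = [np.full(3, 2.0), np.full(3, 2.0)]
for _ in range(10):                         # alternate training and merging
    models = [local_train(w, g) for w, g in zip(models, grad_fns)]
    merged = merge(np.stack(models))
    models = [merged.copy() for _ in models]
print(models[0])                            # settles near the consensus optimum 0
```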
- Deep Contrastive Graph Learning with Clustering-Oriented Guidance [61.103996105756394]
Graph Convolutional Network (GCN) has exhibited remarkable potential in improving graph-based clustering.
Existing models estimate an initial graph beforehand in order to apply GCN.
A Deep Contrastive Graph Learning (DCGL) model is proposed for general data clustering.
arXiv Detail & Related papers (2024-02-25T07:03:37Z)
- Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data [8.946847190099206]
We present a novel approach for decentralized learning on heterogeneous data.
Cross-features for a pair of neighboring agents are the features obtained from the data of an agent with respect to the model parameters of the other agent.
Our experiments show that the proposed method achieves superior performance (0.2-4% improvement in test accuracy) compared to other existing techniques for decentralized learning on heterogeneous data.
arXiv Detail & Related papers (2023-10-24T14:48:23Z)
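The cross-feature definition above is easy to state in code: pass agent i's data through neighbor j's model. The sketch below uses a toy linear encoder and a plain mean-squared distance; the paper's actual contrastive loss also contrasts negative pairs, which is omitted here:

```python
import numpy as np

def features(W, x):
    """Toy linear encoder; in practice this would be a deep network's
    representation layer."""
    return x @ W

def cross_feature_distance(W_i, W_j, x_i):
    """Distance between agent i's features on its own data (self-features)
    and the features of the same data under neighbor j's model
    (cross-features)."""
    return np.mean((features(W_i, x_i) - features(W_j, x_i)) ** 2)

rng = np.random.default_rng(0)
W_i, W_j = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
x_i = rng.normal(size=(16, 8))              # agent i's local batch
print(cross_feature_distance(W_i, W_j, x_i))
```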
- Structured Cooperative Learning with Graphical Model Priors [98.53322192624594]
We study how to train personalized models for different tasks on decentralized devices with limited local data.
We propose "Structured Cooperative Learning (SCooL)", in which a cooperation graph across devices is generated by a graphical model.
We evaluate SCooL and compare it with existing decentralized learning methods on an extensive set of benchmarks.
arXiv Detail & Related papers (2023-06-16T02:41:31Z)
- Global Update Tracking: A Decentralized Learning Algorithm for Heterogeneous Data [14.386062807300666]
In this paper, we focus on designing a decentralized learning algorithm that is less susceptible to variations in data distribution across devices.
We propose Global Update Tracking (GUT), a novel tracking-based method that aims to mitigate the impact of heterogeneous data in decentralized learning without introducing any communication overhead.
Our experiments show that the proposed method achieves state-of-the-art performance for decentralized learning on heterogeneous data via a 1-6% improvement in test accuracy compared to other existing techniques.
arXiv Detail & Related papers (2023-05-08T15:48:53Z)
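GUT is described as a tracking-based method. For intuition, here is the classic gradient-tracking template it builds on: each agent maintains a tracker y_i of the global gradient that is refreshed with the change in its local gradient every round. GUT itself tracks model updates and, per the summary, avoids the extra communication this auxiliary variable normally costs; the sketch below shows only the generic template:

```python
import numpy as np

def tracking_round(X, Y, G, grad_fn, W, lr=0.1):
    """One round of gradient tracking.
    X: (n, d) agent models, Y: (n, d) trackers of the global gradient,
    G: (n, d) last local gradients, W: (n, n) doubly stochastic mixing."""
    X_new = W @ X - lr * Y                  # consensus step + tracked descent
    G_new = np.stack([grad_fn(i, X_new[i]) for i in range(len(X_new))])
    Y_new = W @ Y + G_new - G               # refresh with the gradient change
    return X_new, Y_new, G_new

targets = np.array([[1.0, 1.0], [-1.0, -1.0]])   # heterogeneous local optima
grad_fn = lambda i, w: 2.0 * (w - targets[i])
W = np.full((2, 2), 0.5)                         # two fully connected agents
X = np.zeros((2, 2))
G = np.stack([grad_fn(i, X[i]) for i in range(2)])
Y = G.copy()                                     # trackers start at local grads
for _ in range(50):
    X, Y, G = tracking_round(X, Y, G, grad_fn, W)
print(X)                                 # both rows near the global optimum (0, 0)
```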
- Neighborhood Gradient Clustering: An Efficient Decentralized Learning Method for Non-IID Data Distributions [5.340730281227837]
The current state-of-the-art decentralized algorithms mostly assume the data distributions to be Independent and Identically Distributed.
We propose Neighborhood Gradient Clustering (NGC), a novel decentralized learning algorithm that modifies the local gradients of each agent using self- and cross-gradient information.
arXiv Detail & Related papers (2022-09-28T19:28:54Z)
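Where CGA (the main paper above) projects the stacked gradients through a QP, the NGC summary describes directly modifying each local gradient with self- and cross-gradient information. A minimal sketch of that blending idea, with a uniform mixing weight as an assumed stand-in for NGC's actual weighting:

```python
import numpy as np

def blended_gradient(self_grad, cross_grads, alpha=0.5):
    """Replace the local gradient by a convex combination of the
    self-gradient and the mean of the neighborhood cross-gradients."""
    return alpha * self_grad + (1.0 - alpha) * np.mean(cross_grads, axis=0)

rng = np.random.default_rng(0)
g_self = rng.normal(size=4)
g_cross = rng.normal(size=(3, 4))           # cross-gradients from 3 neighbors
print(blended_gradient(g_self, g_cross))
```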
- Federated XGBoost on Sample-Wise Non-IID Data [8.49189353769386]
Decision tree-based models, in particular XGBoost, can handle non-IID data.
This paper investigates how Federated XGBoost is impacted by non-IID distributions.
arXiv Detail & Related papers (2022-09-03T06:14:20Z)
- Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
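A hedged illustration of the kind of metric the summary above describes: score how far one mixing step W @ G leaves the agents' gradients from their global mean. The exact metric and the optimization procedure in the paper differ; this only conveys the idea of evaluating a mixing matrix against the current gradients.

```python
import numpy as np

def mixing_score(W, G):
    """Frobenius distance between mixed gradients and perfect averaging.
    Lower is better: W mixes the *current* gradients more completely."""
    g_bar = G.mean(axis=0, keepdims=True)
    return np.linalg.norm(W @ G - np.ones((len(G), 1)) @ g_bar) ** 2

# Comparing two doubly stochastic matrices on heterogeneous gradients.
G = np.array([[2.0, 0.0], [0.0, -2.0], [1.0, 1.0]])
lazy = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
full = np.full((3, 3), 1.0 / 3.0)
print(mixing_score(lazy, G), mixing_score(full, G))  # full averaging scores 0
```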
- Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
To date, no learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
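The summary above does not spell out the momentum update, so the following is only one plausible reading of the idea: drive the momentum buffer with the displacement of the locally averaged model, which reflects global progress, instead of the heterogeneity-biased local gradient. Coefficients, ordering, and normalization here are assumptions, not the paper's exact rule.

```python
import numpy as np

def quasi_global_step(w_mixed, w_mixed_prev, m, local_grad, beta=0.9, lr=0.1):
    """w_mixed / w_mixed_prev: models after neighborhood averaging in the
    current and previous rounds; m: the quasi-global momentum buffer."""
    d = (w_mixed_prev - w_mixed) / lr       # implied global descent direction
    m = beta * m + (1.0 - beta) * d         # momentum driven by model movement
    w_next = w_mixed - lr * (local_grad + beta * m)
    return w_next, m

# Toy usage with made-up values.
w_prev, w_curr = np.ones(4), 0.9 * np.ones(4)
m = np.zeros(4)
g_local = 0.5 * np.ones(4)
w_next, m = quasi_global_step(w_curr, w_prev, m, g_local)
print(w_next)
```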