COKE: Communication-Censored Decentralized Kernel Learning
- URL: http://arxiv.org/abs/2001.10133v2
- Date: Wed, 30 Jun 2021 00:52:48 GMT
- Title: COKE: Communication-Censored Decentralized Kernel Learning
- Authors: Ping Xu, Yue Wang, Xiang Chen, Zhi Tian
- Abstract summary: Multiple interconnected agents aim to learn an optimal decision function defined over a reproducing kernel Hilbert space by jointly minimizing a global objective function.
As a non-parametric approach, kernel learning faces a major challenge in distributed implementation.
We develop a communication-censored kernel learning (COKE) algorithm that reduces the communication load of DKLA by preventing an agent from transmitting at every iteration unless its local updates are deemed informative.
- Score: 30.795725108364724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the decentralized optimization and learning problem where
multiple interconnected agents aim to learn an optimal decision function
defined over a reproducing kernel Hilbert space by jointly minimizing a global
objective function, with access to their own locally observed dataset. As a
non-parametric approach, kernel learning faces a major challenge in distributed
implementation: the decision variables of local objective functions are
data-dependent and thus cannot be optimized under the decentralized consensus
framework without any raw data exchange among agents. To circumvent this major
challenge, we leverage the random feature (RF) approximation approach to enable
consensus on the function modeled in the RF space by data-independent
parameters across different agents. We then design an iterative algorithm,
termed DKLA, for fast-convergent implementation via ADMM. Based on DKLA, we
further develop a communication-censored kernel learning (COKE) algorithm that
reduces the communication load of DKLA by preventing an agent from transmitting
at every iteration unless its local updates are deemed informative. Theoretical
results in terms of linear convergence guarantee and generalization performance
analysis of DKLA and COKE are provided. Comprehensive tests on both synthetic
and real datasets are conducted to verify the communication efficiency and
learning effectiveness of COKE.
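A minimal sketch of the two ingredients highlighted in the abstract, assuming a Gaussian kernel and a geometrically decaying censoring threshold; these choices, the function names, and the parameters are illustrative assumptions rather than the paper's exact algorithm:
```python
import numpy as np

def random_fourier_features(X, n_features=200, sigma=1.0, seed=0):
    """Map raw samples X (n_samples x d) into a random-feature space where
    z(x) @ z(y) approximates the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)).
    Weights learned over these features are data-independent, so agents can
    reach consensus on them without exchanging raw data."""
    rng = np.random.default_rng(seed)  # agents must share the same seed/features
    d = X.shape[1]
    omega = rng.normal(scale=1.0 / sigma, size=(d, n_features))  # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)           # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ omega + b)

def should_transmit(theta, theta_last_sent, k, tau0=1.0, rho=0.95):
    """Illustrative censoring test: broadcast the local iterate only if it has
    moved far enough from the last transmitted copy; the threshold decays with
    the iteration index k so censoring fades out as the iterates converge."""
    return np.linalg.norm(theta - theta_last_sent) >= tau0 * rho**k
```
In a setup of this kind, each agent would fit a weight vector over the shared random features with its local (e.g., ADMM-based) solver and consult the censoring test before each broadcast, so that only informative updates consume communication.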
Related papers
- Stability and Generalization for Distributed SGDA [70.97400503482353]
We propose a stability-based generalization analysis framework for Distributed-SGDA.
We conduct a comprehensive analysis of stability error, generalization gap, and population risk across different metrics.
Our theoretical results reveal the trade-off between the generalization gap and optimization error.
arXiv Detail & Related papers (2024-11-14T11:16:32Z) - Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z) - CoDeC: Communication-Efficient Decentralized Continual Learning [6.663641564969944]
Training at the edge utilizes continuously evolving data generated at different locations.
Privacy concerns prohibit the co-location of this spatially as well as temporally distributed data.
We propose CoDeC, a novel communication-efficient decentralized continual learning algorithm.
arXiv Detail & Related papers (2023-03-27T16:52:17Z) - Magnitude Matters: Fixing SIGNSGD Through Magnitude-Aware Sparsification in the Presence of Data Heterogeneity [60.791736094073]
Communication overhead has become one of the major bottlenecks in the distributed training of deep neural networks.
We propose a magnitude-driven sparsification scheme, which addresses the non-convergence issue of SIGNSGD.
The proposed scheme is validated through experiments on Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets.
arXiv Detail & Related papers (2023-02-19T17:42:35Z) - QC-ODKLA: Quantized and Communication-Censored Online Decentralized Kernel Learning via Linearized ADMM [30.795725108364724]
This paper focuses on online kernel learning over a decentralized network.
We propose a novel learning framework named Online Decentralized Kernel learning via Linearized ADMM.
arXiv Detail & Related papers (2022-08-04T17:16:27Z) - FedDKD: Federated Learning with Decentralized Knowledge Distillation [3.9084449541022055]
We propose a novel federated learning framework equipped with a decentralized knowledge distillation process (FedDKD).
We show that FedDKD outperforms the state-of-the-art methods with more efficient communication and training in a few DKD steps.
arXiv Detail & Related papers (2022-05-02T07:54:07Z) - Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are among the promising approaches for reducing communication time.
We show that the communication complexity is better than that of non-local methods when the heterogeneity of the local datasets is smaller than the smoothness of the local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z) - Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z) - Cross-Gradient Aggregation for Decentralized Learning from Non-IID data [34.23789472226752]
Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server.
We propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm.
We show superior learning performance of CGA over existing state-of-the-art decentralized learning algorithms.
arXiv Detail & Related papers (2021-03-02T21:58:12Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)