CoDeC: Communication-Efficient Decentralized Continual Learning
- URL: http://arxiv.org/abs/2303.15378v1
- Date: Mon, 27 Mar 2023 16:52:17 GMT
- Title: CoDeC: Communication-Efficient Decentralized Continual Learning
- Authors: Sakshi Choudhary, Sai Aparna Aketi, Gobinda Saha and Kaushik Roy
- Abstract summary: Training at the edge utilizes continuously evolving data generated at different locations.
Privacy concerns prohibit the co-location of this spatially as well as temporally distributed data.
We propose CoDeC, a novel communication-efficient decentralized continual learning algorithm.
- Score: 6.663641564969944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training at the edge utilizes continuously evolving data generated at
different locations. Privacy concerns prohibit the co-location of this
spatially and temporally distributed data, making it crucial to design
training algorithms that enable efficient continual learning over decentralized
private data. Decentralized learning allows serverless training with spatially
distributed data. A fundamental barrier in such distributed learning is the
high bandwidth cost of communicating model updates between agents. Moreover,
existing works under this training paradigm are not inherently suitable for
learning a temporal sequence of tasks while retaining the previously acquired
knowledge. In this work, we propose CoDeC, a novel communication-efficient
decentralized continual learning algorithm which addresses these challenges. We
mitigate catastrophic forgetting while learning a task sequence in a
decentralized learning setup by combining orthogonal gradient projection with
gossip averaging across decentralized agents. Further, CoDeC includes a novel
lossless communication compression scheme based on the gradient subspaces. We
express layer-wise gradients as a linear combination of the basis vectors of
these gradient subspaces and communicate the associated coefficients. We
theoretically analyze the convergence rate for our algorithm and demonstrate
through an extensive set of experiments that CoDeC successfully learns
distributed continual tasks with minimal forgetting. The proposed compression
scheme results in up to a 4.8x reduction in communication costs with
iso-performance relative to the full-communication baseline.
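To make the abstract's mechanism concrete, the following is a minimal NumPy sketch, not the authors' implementation, of the three ingredients it describes: orthogonal gradient projection against previously learned tasks, gossip averaging over a ring of agents, and communicating subspace coefficients instead of full gradients. The ring topology, uniform mixing weights, and the random placeholder bases `old_basis` and `basis` are illustrative assumptions; in CoDeC the bases are derived from layer-wise representations, and the projected gradients lie inside the communicated subspace, which is what makes the compression lossless (with the random tensors used here the reconstruction is only approximate).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: 4 agents on a ring, one flattened "layer" per agent.
n_agents, dim = 4, 400
weights = rng.standard_normal((n_agents, dim))

# Placeholder orthonormal bases: `old_basis` spans gradient directions important
# for previous tasks, `basis` spans the subspace used to compress communication
# for the current task. In CoDeC both come from layer-wise representations
# (e.g. via SVD); random bases are used here purely for illustration.
old_basis, _ = np.linalg.qr(rng.standard_normal((dim, 20)))   # (dim, 20)
basis, _ = np.linalg.qr(rng.standard_normal((dim, 50)))       # (dim, 50)

def project_out(grad, B):
    """Orthogonal gradient projection: remove the component of `grad` lying in
    span(B) so updates do not interfere with previously acquired knowledge."""
    return grad - B @ (B.T @ grad)

def compress(grad, B):
    """Communicate only the coefficients of `grad` in the basis B (50 floats
    here instead of 400). When grad lies in span(B) the reconstruction is
    exact, which is what makes the scheme lossless in CoDeC."""
    return B.T @ grad

def decompress(coeffs, B):
    return B @ coeffs

neighbors = {i: [(i - 1) % n_agents, (i + 1) % n_agents] for i in range(n_agents)}
lr = 0.1

# One illustrative round: local gradients, projection, then a compressed gossip step.
grads = rng.standard_normal((n_agents, dim))
payloads = [compress(project_out(grads[i], old_basis), basis) for i in range(n_agents)]

new_weights = weights.copy()
for i in range(n_agents):
    # Gossip averaging: mix the agent's own update with the reconstructed updates
    # received from its neighbors (uniform mixing weights, for simplicity).
    updates = [decompress(payloads[j], basis) for j in [i] + neighbors[i]]
    new_weights[i] = weights[i] - lr * np.mean(updates, axis=0)
weights = new_weights

print("floats sent per agent per round:", payloads[0].size, "instead of", dim)
```

In this toy setting the saving is 400/50 = 8x per layer; the 4.8x figure reported in the abstract comes from the actual task subspaces used by the paper, not from this sketch.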
Related papers
- DRACO: Decentralized Asynchronous Federated Learning over Continuous Row-Stochastic Network Matrices [7.389425875982468]
We propose DRACO, a novel method for decentralized asynchronous stochastic gradient descent (SGD) over row-stochastic gossip wireless networks.
Our approach enables edge devices within decentralized networks to perform local training and model exchanging along a continuous timeline.
Our numerical experiments corroborate the efficacy of the proposed technique.
arXiv Detail & Related papers (2024-06-19T13:17:28Z)
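As a point of reference for the DRACO entry above, here is a hedged sketch of synchronous decentralized SGD with a row-stochastic mixing matrix built from a directed graph. DRACO's defining features (asynchronous updates along a continuous timeline, wireless links) are deliberately not modeled; the graph, objective, and step size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, dim = 5, 10

# Directed communication graph: adjacency[i, j] = 1 means agent j sends to agent i.
adjacency = (rng.random((n_agents, n_agents)) < 0.4).astype(float)
np.fill_diagonal(adjacency, 1.0)                  # every agent keeps its own model

# Row-stochastic mixing matrix: each row sums to 1, so agent i averages only
# over the models it actually received (no symmetry or column sums required).
mixing = adjacency / adjacency.sum(axis=1, keepdims=True)

models = rng.standard_normal((n_agents, dim))
targets = rng.standard_normal((n_agents, dim))    # stand-in for heterogeneous local data
lr = 0.05

for step in range(200):
    grads = models - targets                      # gradient of 0.5 * ||x_i - target_i||^2
    models = mixing @ models - lr * grads         # mix neighbors' models, then step locally

print("max disagreement across agents:", float(np.abs(models - models.mean(axis=0)).max()))
```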
- Communication-Efficient Decentralized Federated Learning via One-Bit Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications.
Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging.
We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM).
arXiv Detail & Related papers (2023-08-31T12:22:40Z)
- Online Distributed Learning with Quantized Finite-Time Coordination [0.4910937238451484]
In our setting, a set of agents needs to cooperatively train a learning model from streaming data.
We propose a distributed algorithm that relies on a quantized, finite-time coordination protocol.
We analyze the performance of the proposed algorithm in terms of the mean distance from the online solution.
arXiv Detail & Related papers (2023-07-13T08:36:15Z)
- Magnitude Matters: Fixing SIGNSGD Through Magnitude-Aware Sparsification in the Presence of Data Heterogeneity [60.791736094073]
Communication overhead has become one of the major bottlenecks in the distributed training of deep neural networks.
We propose a magnitude-driven sparsification scheme, which addresses the non-convergence issue of SIGNSGD.
The proposed scheme is validated through experiments on Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets.
arXiv Detail & Related papers (2023-02-19T17:42:35Z)
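The entry above names a magnitude-driven sparsification scheme for SIGNSGD without spelling it out; the sketch below shows one plausible reading, assumed purely for illustration: transmit signs for the k largest-magnitude coordinates and rescale them by their mean magnitude. The choice of k and the rescaling are assumptions, not the construction from the paper.

```python
import numpy as np

def magnitude_aware_sign_compress(grad, k):
    """Keep the k largest-magnitude coordinates and transmit only their signs plus
    a single scalar (their mean magnitude). This is an illustrative reading of
    'magnitude-aware sparsification' for SIGNSGD, not the paper's exact scheme."""
    flat = grad.reshape(-1)
    idx = np.argpartition(np.abs(flat), -k)[-k:]      # indices of the top-k magnitudes
    scale = np.abs(flat[idx]).mean()                  # one float shared by all k entries
    signs = np.sign(flat[idx]).astype(np.int8)        # roughly one bit per kept entry
    return idx, signs, scale

def decompress(idx, signs, scale, shape):
    out = np.zeros(int(np.prod(shape)))
    out[idx] = scale * signs
    return out.reshape(shape)

rng = np.random.default_rng(3)
g = rng.standard_normal((256, 128))
idx, signs, scale = magnitude_aware_sign_compress(g, k=1000)
g_hat = decompress(idx, signs, scale, g.shape)
cosine = float((g * g_hat).sum() / (np.linalg.norm(g) * np.linalg.norm(g_hat)))
print("kept fraction:", len(idx) / g.size, "cosine similarity:", round(cosine, 3))
```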
- On Generalization of Decentralized Learning with Separable Data [37.908159361149835]
We study algorithmic and generalization properties of decentralized learning with gradient descent on separable data.
Specifically, for decentralized gradient descent and a variety of loss functions that asymptote to zero at infinity, we derive novel finite-time generalization bounds.
arXiv Detail & Related papers (2022-09-15T07:59:05Z)
- QC-ODKLA: Quantized and Communication-Censored Online Decentralized Kernel Learning via Linearized ADMM [30.795725108364724]
This paper focuses on online kernel learning over a decentralized network.
We propose a novel learning framework named Online Decentralized Kernel Learning via Linearized ADMM (ODKLA).
arXiv Detail & Related papers (2022-08-04T17:16:27Z)
- RelaySum for Decentralized Deep Learning on Heterogeneous Data [71.36228931225362]
In decentralized machine learning, workers compute model updates on their local data.
Because the workers only communicate with a few neighbors without central coordination, these updates propagate progressively over the network.
This paradigm enables distributed training on networks without all-to-all connectivity, helping to protect data privacy as well as to reduce the communication cost of distributed training in data centers.
arXiv Detail & Related papers (2021-10-08T14:55:32Z)
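The RelaySum entry above highlights that updates propagate progressively when workers talk only to a few neighbors. The toy sketch below illustrates just that propagation effect with plain neighbor averaging on a ring; it is not RelaySum's relay mechanism, and the topology and worker count are arbitrary.

```python
import numpy as np

n_workers = 8
# Ring topology: each worker averages with itself and its two neighbours.
mix = np.zeros((n_workers, n_workers))
for i in range(n_workers):
    mix[i, i] = mix[i, (i - 1) % n_workers] = mix[i, (i + 1) % n_workers] = 1 / 3

# A scalar "update" that only worker 0 has seen so far.
x = np.zeros(n_workers)
x[0] = 1.0

for rnd in range(1, 6):
    x = mix @ x                                   # one round of neighbour averaging
    reached = np.flatnonzero(x > 0).tolist()      # workers influenced by worker 0
    print(f"round {rnd}: update has reached workers {reached}")
```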
- Sparse-Push: Communication- & Energy-Efficient Decentralized Distributed Learning over Directed & Time-Varying Graphs with non-IID Datasets [2.518955020930418]
We propose Sparse-Push, a communication efficient decentralized distributed training algorithm.
The proposed algorithm enables a 466x reduction in communication with only 1% degradation in performance.
We demonstrate how communication compression can lead to significant performance degradation in the case of non-IID datasets.
arXiv Detail & Related papers (2021-02-10T19:41:11Z)
- CosSGD: Nonlinear Quantization for Communication-efficient Federated Learning [62.65937719264881]
Federated learning facilitates learning across clients without transferring local data on these clients to a central server.
We propose a nonlinear quantization for compressed gradient descent, which can be easily utilized in federated learning.
Our system significantly reduces the communication cost by up to three orders of magnitude, while maintaining convergence and accuracy of the training process.
arXiv Detail & Related papers (2020-12-15T12:20:28Z)
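The CosSGD entry above proposes a nonlinear quantizer for gradient compression. As a hedged illustration of why a nonlinear codebook helps (more resolution near zero, where most gradient entries concentrate), here is a generic logarithmic quantizer; it is not the specific quantization function used in CosSGD, and the bit width is an arbitrary choice.

```python
import numpy as np

BITS = 4
LEVELS = 2 ** (BITS - 1) - 1          # magnitude levels; the sign is sent separately

def nonlinear_quantize(grad):
    """Log-spaced quantization of gradient magnitudes: small values get finer
    resolution than large ones. Illustrative only, not CosSGD's quantizer."""
    flat = grad.reshape(-1)
    scale = np.abs(flat).max() + 1e-12
    normed = np.abs(flat) / scale                             # in [0, 1]
    code = np.round(LEVELS * np.log1p(normed * (np.e - 1)))   # integers in 0..LEVELS
    return np.sign(flat).astype(np.int8), code.astype(np.int8), scale

def dequantize(signs, code, scale, shape):
    normed = np.expm1(code / LEVELS) / (np.e - 1)             # invert the log mapping
    return (signs * normed * scale).reshape(shape)

rng = np.random.default_rng(4)
g = rng.standard_normal((512, 64)) * 0.01
signs, code, scale = nonlinear_quantize(g)
g_hat = dequantize(signs, code, scale, g.shape)
rel_err = float(np.linalg.norm(g - g_hat) / np.linalg.norm(g))
print(f"~{BITS} bits per entry instead of 32, relative error: {rel_err:.3f}")
```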
- A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning [49.15799302636519]
We design a low-complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers).
In our setup, the training data is distributed among the workers but is not shared in the training process due to privacy and security concerns.
We show that it is possible to achieve equivalent learning performance as if the data is available in a single place.
arXiv Detail & Related papers (2020-09-29T13:08:12Z)
- Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z)