DRACO: Decentralized Asynchronous Federated Learning over Continuous Row-Stochastic Network Matrices
- URL: http://arxiv.org/abs/2406.13533v1
- Date: Wed, 19 Jun 2024 13:17:28 GMT
- Title: DRACO: Decentralized Asynchronous Federated Learning over Continuous Row-Stochastic Network Matrices
- Authors: Eunjeong Jeong, Marios Kountouris
- Abstract summary: We propose DRACO, a novel method for decentralized asynchronous Stochastic Gradient Descent (SGD) over row-stochastic gossip wireless networks.
Our approach enables edge devices within decentralized networks to perform local training and model exchange along a continuous timeline.
Our numerical experiments corroborate the efficacy of the proposed technique.
- Score: 7.389425875982468
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent developments and emerging use cases, such as smart Internet of Things (IoT) and Edge AI, have sparked considerable interest in the training of neural networks over fully decentralized (serverless) networks. One of the major challenges of decentralized learning is to ensure stable convergence without resorting to strong assumptions on each agent's data distribution or updating policy. To address these issues, we propose DRACO, a novel method for decentralized asynchronous Stochastic Gradient Descent (SGD) over row-stochastic gossip wireless networks that leverages continuous communication. Our approach enables edge devices within decentralized networks to perform local training and model exchange along a continuous timeline, thereby eliminating the need for synchronized timing. The algorithm also decouples communication and computation schedules, which grants complete autonomy to all users and manageable instructions for stragglers. Through a comprehensive convergence analysis, we highlight the advantages of asynchronous and autonomous participation in decentralized optimization. Our numerical experiments corroborate the efficacy of the proposed technique.
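To make the mechanism concrete, the sketch below illustrates the kind of procedure the abstract describes: event-driven agents that alternate between local SGD steps and gossip updates, where each receiver mixes incoming models with row-normalized weights, so every row of the effective network matrix is nonnegative and sums to 1 (row-stochastic). This is a minimal illustration under assumed details, not the authors' implementation: the least-squares objective, the event probabilities, and the helper names (local_grad, mix) are all hypothetical.

# Minimal sketch of asynchronous decentralized SGD with row-stochastic
# mixing, in the spirit of the DRACO abstract. Illustrative assumptions
# throughout; not the paper's actual algorithm or code.
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, lr = 8, 5, 0.05

# Synthetic local least-squares problems (one data batch per agent).
A = [rng.normal(size=(20, dim)) for _ in range(n_agents)]
b = [rng.normal(size=20) for _ in range(n_agents)]
x = [rng.normal(size=dim) for _ in range(n_agents)]   # local models

def local_grad(i):
    # Gradient of 0.5*||A_i x_i - b_i||^2 / m at the current local model.
    return A[i].T @ (A[i] @ x[i] - b[i]) / len(b[i])

def mix(i, heard):
    # Receiver-side mixing: agent i draws positive weights over itself and
    # the neighbors it actually heard, then normalizes them to sum to 1,
    # so they form one row of a (time-varying) row-stochastic matrix.
    w = rng.random(len(heard) + 1) + 1e-3
    w /= w.sum()
    x[i] = w[0] * x[i] + sum(wj * x[j] for wj, j in zip(w[1:], heard))

# Event-driven loop: computation and communication are decoupled, and
# agents wake up independently instead of waiting for a global round.
for _ in range(2000):
    i = int(rng.integers(n_agents))          # agent that wakes up
    if rng.random() < 0.5:                   # computation event
        x[i] -= lr * local_grad(i)
    else:                                    # communication event
        heard = [j for j in range(n_agents)
                 if j != i and rng.random() < 0.3]   # unreliable links
        if heard:
            mix(i, heard)

print("model disagreement:", np.std(np.stack(x), axis=0).mean())

Normalizing at the receiver over whichever neighbors were actually heard is what makes row-stochasticity a natural fit for wireless gossip: a transmitter never needs confirmation of who received its broadcast, which is consistent with the decoupled, fully autonomous schedules highlighted in the abstract.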
Related papers
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z) - Decentralized Learning over Wireless Networks: The Effect of Broadcast with Random Access [56.91063444859008]
We investigate the impact of broadcast transmission and a probabilistic random access policy on the convergence performance of decentralized stochastic gradient descent (D-SGD).
Our results demonstrate that optimizing the access probability to maximize the expected number of successful links is a highly effective strategy for accelerating system convergence.
arXiv Detail & Related papers (2023-05-12T10:32:26Z) - Decentralized Learning Made Easy with DecentralizePy [3.1848820580333737]
Decentralized learning (DL) has gained prominence for its potential benefits in terms of scalability, privacy, and fault tolerance.
We propose DecentralizePy, a distributed framework for decentralized ML, which allows for the emulation of large-scale learning networks in arbitrary topologies.
We demonstrate the capabilities of DecentralizePy by deploying techniques such as sparsification and secure aggregation on top of several topologies.
arXiv Detail & Related papers (2023-04-17T14:42:33Z) - UAV-Aided Decentralized Learning over Mesh Networks [23.612400109629544]
Decentralized learning empowers wireless network devices to collaboratively train a machine learning (ML) model relying solely on device-to-device (D2D) communication.
The local connectivity of real-world mesh networks, due to the limited communication range of their wireless nodes, undermines the efficiency of decentralized learning protocols.
We propose an optimized UAV trajectory, defined as a sequence of waypoints that the UAV visits sequentially in order to transfer intelligence across sparsely connected groups of users.
arXiv Detail & Related papers (2022-03-02T10:39:40Z) - Asynchronous Decentralized Learning over Unreliable Wireless Networks [4.630093015127539]
Decentralized learning enables edge users to collaboratively train models by exchanging information via device-to-device communication.
We propose an asynchronous decentralized stochastic gradient descent (DSGD) algorithm, which is robust to the inherent computation and communication failures occurring at the wireless network edge.
Experimental results corroborate our analysis, demonstrating the benefits of asynchronicity and outdated gradient information reuse in decentralized learning over unreliable wireless networks.
arXiv Detail & Related papers (2022-02-02T11:00:49Z) - Finite-Time Consensus Learning for Decentralized Optimization with Nonlinear Gossiping [77.53019031244908]
We present a novel decentralized learning framework based on nonlinear gossiping (NGO), which enjoys an appealing finite-time consensus property to achieve better synchronization.
Our analysis of how communication delay and randomized chats affect learning further enables the derivation of practical variants.
arXiv Detail & Related papers (2021-11-04T15:36:25Z) - Learning Autonomy in Management of Wireless Random Networks [102.02142856863563]
This paper presents a machine learning strategy that tackles a distributed optimization task in a wireless network with an arbitrary number of randomly interconnected nodes.
We develop a flexible deep neural network formalism termed distributed message-passing neural network (DMPNN) with forward and backward computations independent of the network topology.
arXiv Detail & Related papers (2021-06-15T09:03:28Z) - Accelerating Neural Network Training with Distributed Asynchronous and Selective Optimization (DASO) [0.0]
We introduce the Distributed Asynchronous and Selective Optimization (DASO) method to accelerate network training.
DASO uses a hierarchical and asynchronous communication scheme comprising node-local and global networks.
We show that DASO yields a reduction in training time of up to 34% on classical and state-of-the-art networks.
arXiv Detail & Related papers (2021-04-12T16:02:20Z) - Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
To date, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)