Related papers: The Entrapment Problem in Random Walk Decentralized Learning

The Entrapment Problem in Random Walk Decentralized Learning

URL: http://arxiv.org/abs/2407.20611v1
Date: Tue, 30 Jul 2024 07:36:13 GMT
Title: The Entrapment Problem in Random Walk Decentralized Learning
Authors: Zonghong Liu, Salim El Rouayheb, Matthew Dwyer,
Abstract summary: We investigate a decentralized SGD algorithm that utilizes a random walk to update a global model based on local data. Our focus is on designing the transition probability matrix to speed up convergence. We propose the Metropolis-Hastings with L'evy Jumps (MHLJ) algorithm, which incorporates random perturbations (jumps) to overcome entrapment.
Score: 10.355835466049088
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper explores decentralized learning in a graph-based setting, where data is distributed across nodes. We investigate a decentralized SGD algorithm that utilizes a random walk to update a global model based on local data. Our focus is on designing the transition probability matrix to speed up convergence. While importance sampling can enhance centralized learning, its decentralized counterpart, using the Metropolis-Hastings (MH) algorithm, can lead to the entrapment problem, where the random walk becomes stuck at certain nodes, slowing convergence. To address this, we propose the Metropolis-Hastings with L\'evy Jumps (MHLJ) algorithm, which incorporates random perturbations (jumps) to overcome entrapment. We theoretically establish the convergence rate and error gap of MHLJ and validate our findings through numerical experiments.

Related papers

Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm [80.94861441583275]
We investigate the complexity of the generalization bound of the decentralized gradient descent (D-SGDA) algorithm. Our results analyze the impact of different top factors on the generalization of D-SGDA. We also balance it with the generalization to obtain the optimal convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z)
Communication-Efficient Decentralized Federated Learning via One-Bit Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications. Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging. We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM)
arXiv Detail & Related papers (2023-08-31T12:22:40Z)
Walk for Learning: A Random Walk Approach for Federated Learning from Heterogeneous Data [17.978941229970886]
We focus on Federated Learning (FL) as a canonical application. One of the main challenges of FL is the communication bottleneck between the nodes and the parameter server. We present an adaptive random walk learning algorithm.
arXiv Detail & Related papers (2022-06-01T19:53:24Z)
Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices. We make a very general assumption on the computational network that covers the settings of fully decentralized calculations. We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles. Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center. We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes. A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning [49.15799302636519]
We design a low complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers) In our setup, the training data is distributed among the workers but is not shared in the training process due to privacy and security concerns. We show that it is possible to achieve equivalent learning performance as if the data is available in a single place.
arXiv Detail & Related papers (2020-09-29T13:08:12Z)
Private Weighted Random Walk Stochastic Gradient Descent [21.23973736418492]
We consider a decentralized learning setting in which data is distributed over nodes in a graph. We propose two algorithms based on two types of random walks that achieve, in a decentralized way, uniform sampling and importance sampling of the data.
arXiv Detail & Related papers (2020-09-03T16:52:13Z)
Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) is receiving more and more attention recently for its capability to do computations over the encrypted field. We propose a novel general distributed HE-based data mining framework towards one step of solving the scaling problem. We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z)
Quantized Decentralized Stochastic Learning over Directed Graphs [52.94011236627326]
We consider a decentralized learning problem where data points are distributed among computing nodes communicating over a directed graph. As the model size gets large, decentralized learning faces a major bottleneck that is the communication load due to each node transmitting messages (model updates) to its neighbors. We propose the quantized decentralized learning algorithm over directed graphs that is based on the push-sum algorithm in decentralized consensus optimization.
arXiv Detail & Related papers (2020-02-23T18:25:39Z)
Overlap Local-SGD: An Algorithmic Approach to Hide Communication Delays in Distributed SGD [32.03967072200476]
We propose an algorithmic approach named OverlapLocal-Local-Local-SGD (Local momentum variant) We achieve this by adding an anchor model on each node. After multiple local updates, locally trained models will be pulled back towards the anchor model rather than communicating with others.
arXiv Detail & Related papers (2020-02-21T20:33:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.