DASA: Delay-Adaptive Multi-Agent Stochastic Approximation
- URL: http://arxiv.org/abs/2403.17247v3
- Date: Fri, 2 Aug 2024 09:03:09 GMT
- Title: DASA: Delay-Adaptive Multi-Agent Stochastic Approximation
- Authors: Nicolò Dal Fabbro, Arman Adibi, H. Vincent Poor, Sanjeev R. Kulkarni, Aritra Mitra, George J. Pappas
- Abstract summary: We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server.
To mitigate the effect of delays and stragglers, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation.
- Score: 64.32538247395627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server. We assume that the up-link transmissions to the server are subject to asynchronous and potentially unbounded time-varying delays. To mitigate the effect of delays and stragglers while reaping the benefits of distributed computation, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation. We provide a finite-time analysis of \texttt{DASA} assuming that the agents' stochastic observation processes are independent Markov chains. Significantly advancing existing results, \texttt{DASA} is the first algorithm whose convergence rate depends only on the mixing time $\tau_{mix}$ and on the average delay $\tau_{avg}$ while jointly achieving an $N$-fold convergence speedup under Markovian sampling. Our work is relevant for various SA applications, including multi-agent and distributed temporal difference (TD) learning, Q-learning and stochastic optimization with correlated data.
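The setting above can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation of \texttt{DASA}: the function names (`dasa_sketch`, `noisy_operator`), the simple scalar operator $g(x) = 0.5x - 1$, the random-delay model, and the specific step-size shrinkage are all illustrative assumptions; the sketch only shows the general pattern of a server applying delayed stochastic-approximation updates from parallel agents while damping the step for stale ones.

```python
import random

# Hypothetical sketch of a delay-adaptive multi-agent SA loop (not the
# paper's algorithm). Each of N agents sends noisy operator evaluations
# to a server; transmissions arrive after random delays, and the server
# shrinks the step size for stale updates.

def noisy_operator(x, agent):
    # Stochastic observation of g(x) = 0.5 * x - 1 with zero-mean noise.
    # The SA fixed point solves g(x) = 0, i.e. x* = 2.
    return 0.5 * x - 1.0 + random.gauss(0.0, 0.1)

def dasa_sketch(num_agents=4, num_iters=2000, step=0.05, max_delay=5, seed=0):
    random.seed(seed)
    x = 0.0
    # Queue of (arrival_iteration, sent_iteration, value) simulating delays.
    in_flight = []
    for t in range(num_iters):
        # Each agent sends an update that arrives after a random delay.
        for agent in range(num_agents):
            delay = random.randint(0, max_delay)
            in_flight.append((t + delay, t, noisy_operator(x, agent)))
        # Server applies all updates that have arrived by iteration t,
        # shrinking the step with staleness (the delay-adaptive part).
        arrived = [u for u in in_flight if u[0] <= t]
        in_flight = [u for u in in_flight if u[0] > t]
        for _, sent_at, g in arrived:
            staleness = t - sent_at
            x -= step / (1.0 + staleness) * g
    return x
```

Running `dasa_sketch()` drives the iterate toward the fixed point $x^* = 2$ despite the randomly delayed updates; removing the `1 / (1 + staleness)` factor makes the loop a plain delayed-update SA scheme.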
Related papers
- Decentralized Online Convex Optimization with Unknown Feedback Delays [12.543159785477952]
In this paper, we study D-OCO under unknown, time- and agent-varying feedback delays. Existing algorithms assume prior knowledge of the total delay over agents and still suffer from suboptimal dependence on both the delay and network parameters. We propose a novel algorithm that achieves an improved regret bound of $\mathcal{O}\big(N\sqrt{d_{\mathrm{tot}}} + N\sqrt{T}\,(1-\sigma_2)^{-1/4}\big)$.
arXiv Detail & Related papers (2026-01-12T12:59:01Z) - Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity [51.56484100374058]
We introduce Ringleader ASGD, the first asynchronous algorithm that attains the theoretical lower bounds for parallel computation. Our analysis further establishes that Ringleader ASGD remains optimal under arbitrary and even time-varying computation speeds.
arXiv Detail & Related papers (2025-09-26T19:19:15Z) - Asynchronous and Stochastic Distributed Resource Allocation [27.163306014960515]
We consider a distributed system with multiple workers and a coordinating server with heterogeneous computation and communication times. We explore an approximate primal-dual approach with the aim of adhering to the resource budget constraints. We prove its convergence in the second moment to the saddle point solution of the approximate problem.
arXiv Detail & Related papers (2025-09-01T06:47:23Z) - Achieving Tighter Finite-Time Rates for Heterogeneous Federated Stochastic Approximation under Markovian Sampling [6.549288471493216]
We study a generic federated stochastic approximation problem involving $M$ agents.
The goal is for the agents to communicate intermittently via a server to find the root of the average of the agents' local operators.
We develop a novel algorithm titled \texttt{FedHSA}, and prove that it guarantees convergence to the correct point.
arXiv Detail & Related papers (2025-04-15T22:13:55Z) - Beyond likelihood ratio bias: Nested multi-time-scale stochastic approximation for likelihood-free parameter estimation [49.78792404811239]
We study inference in simulation-based models where the analytical form of the likelihood is unknown. We use a ratio-free nested multi-time-scale stochastic approximation (SA) method that simultaneously tracks the score and drives the parameter update. We show that our algorithm can eliminate the original bias of $\mathcal{O}\big(\sqrt{1/N}\big)$ and improve on the convergence rate of $\mathcal{O}\big(\beta_k+\sqrt{\alpha_k/N}\big)$.
arXiv Detail & Related papers (2024-11-20T02:46:15Z) - A Stochastic Approximation Approach for Efficient Decentralized Optimization on Random Networks [21.66341372216097]
A challenging problem in decentralized optimization is to develop algorithms with fast convergence on random time-varying topologies under unreliable, bandwidth-constrained communication networks. This paper introduces a novel stochastic approximation approach with a Fully Stochastic Primal Dual Algorithm (FSPDA) framework. Numerical experiments show the benefits of the FSPDA algorithms.
arXiv Detail & Related papers (2024-10-24T14:26:58Z) - Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of stochastic approximation schemes with delayed updates under Markovian sampling.
Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z) - Federated Learning Using Variance Reduced Stochastic Gradient for
Probabilistically Activated Agents [0.0]
This paper proposes an algorithm for Federated Learning (FL) with a two-layer structure that achieves both variance reduction and a faster convergence rate to an optimal solution in the setting where each agent has an arbitrary probability of selection in each iteration.
arXiv Detail & Related papers (2022-10-25T22:04:49Z) - Sharper Convergence Guarantees for Asynchronous SGD for Distributed and
Federated Learning [77.22019100456595]
We analyze a training algorithm for distributed computation workers with varying communication frequency.
In this work, we obtain a tighter convergence rate of $\mathcal{O}\big(\sigma^2\varepsilon^{-2}+\tau_{\mathrm{avg}}\varepsilon^{-1}\big)$.
We also show that the heterogeneity term in the rate is affected by the average delay within each worker.
arXiv Detail & Related papers (2022-06-16T17:10:57Z) - Distributed gradient-based optimization in the presence of dependent
aperiodic communication [4.34720256795424]
Iterative distributed optimization algorithms involve multiple agents that communicate with each other, over time, in order to minimize/maximize a global objective.
In the presence of unreliable communication networks, the Age-of-Information (AoI), which measures the freshness of data received, may be large and hence hinder algorithmic convergence.
We show that convergence is guaranteed provided the random variables associated with the AoI processes are stochastically dominated by a random variable with finite first moment.
arXiv Detail & Related papers (2022-01-27T06:44:04Z) - Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\,\log\frac{1}{\varepsilon}\big)$ communication steps.
This rate matches, for the first time (up to poly-log factors), the lower communication complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z) - Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous stochastic gradient descent (DASGD).
We show that DASGD converges to the set of global optima under mild assumptions on the delays.
arXiv Detail & Related papers (2021-07-06T21:59:49Z) - Asynchronous Stochastic Optimization Robust to Arbitrary Delays [54.61797739710608]
We consider stochastic optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale gradient computed at step $t - d_t$, for an arbitrary delay $d_t$.
Our experiments demonstrate the efficacy and robustness of our algorithm in cases where the delay distribution is skewed or heavy-tailed.
arXiv Detail & Related papers (2021-06-22T15:50:45Z) - Decentralized Optimization with Heterogeneous Delays: a Continuous-Time
Approach [6.187780920448871]
We propose a novel continuous-time framework to analyze asynchronous algorithms.
We describe a fully asynchronous decentralized algorithm to minimize the sum of smooth and strongly convex functions.
arXiv Detail & Related papers (2021-06-07T13:09:25Z) - Accelerating Distributed SGD for Linear Regression using Iterative
Pre-Conditioning [0.966840768820136]
The Iteratively Pre-conditioned Stochastic Gradient-descent (IPSG) method is shown to converge faster than other existing distributed algorithms.
The IPSG method's convergence rate compares favorably to prominent algorithms for solving the linear least-squares problem in server-based networks.
arXiv Detail & Related papers (2020-11-15T18:09:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.