DASA: Delay-Adaptive Multi-Agent Stochastic Approximation
- URL: http://arxiv.org/abs/2403.17247v3
- Date: Fri, 2 Aug 2024 09:03:09 GMT
- Title: DASA: Delay-Adaptive Multi-Agent Stochastic Approximation
- Authors: Nicolò Dal Fabbro, Arman Adibi, H. Vincent Poor, Sanjeev R. Kulkarni, Aritra Mitra, George J. Pappas
- Abstract summary: We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server.
To mitigate the effect of delays and stragglers, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation.
- Score: 64.32538247395627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server. We assume that the up-link transmissions to the server are subject to asynchronous and potentially unbounded time-varying delays. To mitigate the effect of delays and stragglers while reaping the benefits of distributed computation, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation. We provide a finite-time analysis of \texttt{DASA} assuming that the agents' stochastic observation processes are independent Markov chains. Significantly advancing existing results, \texttt{DASA} is the first algorithm whose convergence rate depends only on the mixing time $\tau_{mix}$ and on the average delay $\tau_{avg}$ while jointly achieving an $N$-fold convergence speedup under Markovian sampling. Our work is relevant for various SA applications, including multi-agent and distributed temporal difference (TD) learning, Q-learning and stochastic optimization with correlated data.
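The setting above can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation of \texttt{DASA}: the function names (`dasa_sketch`, `noisy_operator`), the simple scalar operator $g(x) = 0.5x - 1$, the random-delay model, and the specific step-size shrinkage are all illustrative assumptions; the sketch only shows the general pattern of a server applying delayed stochastic-approximation updates from parallel agents while damping the step for stale ones.

```python
import random

# Hypothetical sketch of a delay-adaptive multi-agent SA loop (not the
# paper's algorithm). Each of N agents sends noisy operator evaluations
# to a server; transmissions arrive after random delays, and the server
# shrinks the step size for stale updates.

def noisy_operator(x, agent):
    # Stochastic observation of g(x) = 0.5 * x - 1 with zero-mean noise.
    # The SA fixed point solves g(x) = 0, i.e. x* = 2.
    return 0.5 * x - 1.0 + random.gauss(0.0, 0.1)

def dasa_sketch(num_agents=4, num_iters=2000, step=0.05, max_delay=5, seed=0):
    random.seed(seed)
    x = 0.0
    # Queue of (arrival_iteration, sent_iteration, value) simulating delays.
    in_flight = []
    for t in range(num_iters):
        # Each agent sends an update that arrives after a random delay.
        for agent in range(num_agents):
            delay = random.randint(0, max_delay)
            in_flight.append((t + delay, t, noisy_operator(x, agent)))
        # Server applies all updates that have arrived by iteration t,
        # shrinking the step with staleness (the delay-adaptive part).
        arrived = [u for u in in_flight if u[0] <= t]
        in_flight = [u for u in in_flight if u[0] > t]
        for _, sent_at, g in arrived:
            staleness = t - sent_at
            x -= step / (1.0 + staleness) * g
    return x
```

Running `dasa_sketch()` drives the iterate toward the fixed point $x^* = 2$ despite the randomly delayed updates; removing the `1 / (1 + staleness)` factor makes the loop a plain delayed-update SA scheme.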
Related papers
- Decentralized Online Convex Optimization with Unknown Feedback Delays [12.543159785477952]
In this paper, we study D-OCO under unknown, time- and agent-varying feedback delays. Existing algorithms assume prior knowledge of the total delay over agents and still suffer from suboptimal dependence on both the delay and network parameters. We propose a novel algorithm that achieves an improved regret bound of $\mathcal{O}\big(N\sqrt{d_{\mathrm{tot}}} + N\sqrt{T}\,(1-\sigma_2)^{-1/4}\big)$.
arXiv Detail & Related papers (2026-01-12T12:59:01Z) - Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity [51.56484100374058]
We introduce Ringleader ASGD, the first asynchronous algorithm that attains the theoretical lower bounds for parallel computation. Our analysis further establishes that Ringleader ASGD remains optimal under arbitrary and even time-varying computation speeds.
arXiv Detail & Related papers (2025-09-26T19:19:15Z) - Asynchronous and Stochastic Distributed Resource Allocation [27.163306014960515]
We consider a distributed system with multiple workers and a coordinating server with heterogeneous computation and communication times. We explore an approximate primal-dual approach with the aim of adhering to the resource budget constraints. We prove its convergence in the second moment to the saddle point solution of the approximate problem.
arXiv Detail & Related papers (2025-09-01T06:47:23Z) - Achieving Tighter Finite-Time Rates for Heterogeneous Federated Stochastic Approximation under Markovian Sampling [6.549288471493216]
We study a generic federated stochastic approximation problem involving $M$ agents.
The goal is for the agents to communicate intermittently via a server to find the root of the average of the agents' local operators.
We develop a novel algorithm titled \texttt{FedHSA}, and prove that it guarantees convergence to the correct point.
arXiv Detail & Related papers (2025-04-15T22:13:55Z) - Beyond likelihood ratio bias: Nested multi-time-scale stochastic approximation for likelihood-free parameter estimation [49.78792404811239]
We study inference in simulation-based models where the analytical form of the likelihood is unknown. We use a ratio-free nested multi-time-scale stochastic approximation (SA) method that simultaneously tracks the score and drives the parameter update. We show that our algorithm can eliminate the original bias of $\mathcal{O}\big(\sqrt{1/N}\big)$ and improve on the convergence rate of $\mathcal{O}\big(\beta_k+\sqrt{\alpha_k/N}\big)$.
arXiv Detail & Related papers (2024-11-20T02:46:15Z) - A Stochastic Approximation Approach for Efficient Decentralized Optimization on Random Networks [21.66341372216097]
A challenging problem in decentralized optimization is to develop algorithms with fast convergence on random time-varying topologies under unreliable, bandwidth-constrained communication networks. This paper introduces a novel stochastic approximation approach with a Fully Stochastic Primal Dual Algorithm (FSPDA) framework. Numerical experiments show the benefits of the FSPDA algorithms.
arXiv Detail & Related papers (2024-10-24T14:26:58Z) - Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of stochastic approximation schemes with delayed updates under Markovian sampling.
Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z) - Federated Learning Using Variance Reduced Stochastic Gradient for
Probabilistically Activated Agents [0.0]
This paper proposes an algorithm for Federated Learning (FL) with a two-layer structure that achieves both variance reduction and a faster convergence rate to an optimal solution in the setting where each agent has an arbitrary probability of selection in each iteration.
arXiv Detail & Related papers (2022-10-25T22:04:49Z) - Sharper Convergence Guarantees for Asynchronous SGD for Distributed and
Federated Learning [77.22019100456595]
We analyze a training algorithm for distributed computation workers with varying communication frequency.
In this work, we obtain a tighter convergence rate of $\mathcal{O}\big(\sigma^2\varepsilon^{-2}+\tau_{\mathrm{avg}}\varepsilon^{-1}\big)$.
We also show that the heterogeneity term in the rate is affected by the average delay within each worker.
arXiv Detail & Related papers (2022-06-16T17:10:57Z) - Distributed gradient-based optimization in the presence of dependent
aperiodic communication [4.34720256795424]
Iterative distributed optimization algorithms involve multiple agents that communicate with each other, over time, in order to minimize/maximize a global objective.
In the presence of unreliable communication networks, the Age-of-Information (AoI), which measures the freshness of data received, may be large and hence hinder algorithmic convergence.
We show that convergence is guaranteed provided the random variables associated with the AoI processes are stochastically dominated by a random variable with finite first moment.
arXiv Detail & Related papers (2022-01-27T06:44:04Z) - Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\,\log\frac{1}{\varepsilon}\big)$ communication steps.
This rate matches, for the first time (up to poly-log factors), the lower communication complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z) - Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous stochastic gradient descent (DASGD).
We show that DASGD converges to the set of global optima under mild assumptions on the delays.
arXiv Detail & Related papers (2021-07-06T21:59:49Z) - Asynchronous Stochastic Optimization Robust to Arbitrary Delays [54.61797739710608]
We consider stochastic optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale gradient computed at step $t - d_t$, for an arbitrary delay $d_t$.
Our experiments demonstrate the efficacy and robustness of our algorithm in cases where the delay distribution is skewed or heavy-tailed.
arXiv Detail & Related papers (2021-06-22T15:50:45Z) - Decentralized Optimization with Heterogeneous Delays: a Continuous-Time
Approach [6.187780920448871]
We propose a novel continuous-time framework to analyze asynchronous algorithms.
We describe a fully asynchronous decentralized algorithm to minimize the sum of smooth and strongly convex functions.
arXiv Detail & Related papers (2021-06-07T13:09:25Z) - Accelerating Distributed SGD for Linear Regression using Iterative
Pre-Conditioning [0.966840768820136]
The Iteratively Pre-conditioned Stochastic Gradient-descent (IPSG) method is shown to converge faster than other existing distributed algorithms.
The IPSG method's convergence rate compares favorably to prominent algorithms for solving the linear least-squares problem in server-based networks.
arXiv Detail & Related papers (2020-11-15T18:09:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.