Distributed TD(0) with Almost No Communication
- URL: http://arxiv.org/abs/2104.07855v1
- Date: Fri, 16 Apr 2021 02:21:11 GMT
- Title: Distributed TD(0) with Almost No Communication
- Authors: Rui Liu and Alex Olshevsky
- Abstract summary: We provide a new non-asymptotic analysis of distributed TD(0) with linear function approximation.
Our approach relies on "one-shot averaging," where $N$ agents run local copies of TD(0) and average the outcomes only once at the very end.
- Score: 13.578454059496847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide a new non-asymptotic analysis of distributed TD(0) with linear
function approximation. Our approach relies on "one-shot averaging," where $N$
agents run local copies of TD(0) and average the outcomes only once at the very
end. We consider two models: one in which the agents interact with an
environment they can observe and whose transitions depend on all of their
actions (which we call the global state model), and one in which each agent can
run a local copy of an identical Markov Decision Process, which we call the
local state model.
In the global state model, we show that the convergence rate of our
distributed one-shot averaging method matches the known convergence rate of
TD(0). By contrast, the best bounds in the previous literature gave a rate
which, in the worst case, underperformed the non-distributed version by a
factor of $O(N^3)$ in the number of agents $N$. In the local state model, we
demonstrate a version of the linear time speedup phenomenon, where the
convergence time of the distributed process is a factor of $N$ faster than the
convergence time of TD(0). As far as we are aware, this is the first result
rigorously showing benefits from parallelism for temporal difference methods.
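To make the "one-shot averaging" scheme concrete, here is a minimal sketch in the paper's local state model: each of $N$ agents runs TD(0) with linear function approximation on its own copy of the same MDP, with no communication during training, and the iterates are averaged exactly once at the end. The small synthetic MDP, features, step size, and all names below are illustrative assumptions, not artifacts from the paper.

```python
# One-shot averaging for distributed TD(0) with linear function approximation
# (local state model): N independent runs, a single average at the very end.
import numpy as np

rng = np.random.default_rng(0)
S, d, gamma = 6, 3, 0.9                  # states, feature dimension, discount
P = rng.dirichlet(np.ones(S), size=S)    # transition matrix under a fixed policy
r = rng.uniform(0.0, 1.0, size=S)        # expected one-step rewards
Phi = rng.normal(size=(S, d))            # feature matrix; row s is phi(s)

def td0(T: int, alpha: float, seed: int) -> np.ndarray:
    """Run T steps of TD(0) on one agent's local copy of the MDP."""
    local_rng = np.random.default_rng(seed)
    theta = np.zeros(d)
    s = local_rng.integers(S)
    for _ in range(T):
        s_next = local_rng.choice(S, p=P[s])
        # TD error: r(s) + gamma * V(s') - V(s), with V(s) = phi(s)^T theta
        delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
        theta += alpha * delta * Phi[s]
        s = s_next
    return theta

N, T, alpha = 20, 5000, 0.05
# Agents run independently, communicating nothing during training ...
thetas = [td0(T, alpha, seed=k) for k in range(N)]
# ... then average their final iterates exactly once ("one-shot averaging").
theta_avg = np.mean(thetas, axis=0)
print("averaged TD(0) weights:", theta_avg)
```

Averaging $N$ independent iterates reduces the variance of the noise term, which is the mechanism behind the factor-of-$N$ speedup claimed for the local state model.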
Related papers
- DASA: Delay-Adaptive Multi-Agent Stochastic Approximation [64.32538247395627]
We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation problem by acting in parallel and communicating with a central server.
To mitigate the effect of delays and stragglers, we propose DASA, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation.
arXiv Detail & Related papers (2024-03-25T22:49:56Z) - Convergence Analysis of Decentralized ASGD [1.8710230264817358]
We present a novel convergence-rate analysis for decentralized asynchronous SGD (DASGD) which does not require partial synchronization among nodes nor restrictive network topologies.
Our convergence proof holds for a fixed stepsize and any non-convex, homogeneous, L-smooth objective function.
arXiv Detail & Related papers (2023-09-07T14:50:31Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Distributed TD(0) with Almost No Communication [15.321579527891457]
We provide a new non-asymptotic analysis of distributed temporal difference learning with linear function approximation.
We demonstrate a version of the linear time speedup phenomenon, where the convergence time of the distributed process is a factor of $N$ faster than the convergence time of TD(0).
arXiv Detail & Related papers (2023-05-25T17:00:46Z) - Sharper Convergence Guarantees for Asynchronous SGD for Distributed and
Federated Learning [77.22019100456595]
We study asynchronous SGD for distributed training over workers whose computation and communication frequency may vary over time.
In this work, we obtain a tighter convergence rate of $\mathcal{O}\!\left(\sigma^2\epsilon^{-2} + \tau_{avg}\epsilon^{-1}\right)$.
We also show that the heterogeneity term in the rate is affected by the average delay within each worker.
arXiv Detail & Related papers (2022-06-16T17:10:57Z) - Cooperative Online Learning in Stochastic and Adversarial MDPs [50.62439652257712]
We study cooperative online learning in stochastic and adversarial Markov decision processes (MDPs).
In each episode, $m$ agents interact with an MDP simultaneously and share information in order to minimize their individual regret.
We are the first to consider cooperative reinforcement learning (RL) with either non-fresh randomness or in adversarial MDPs.
arXiv Detail & Related papers (2022-01-31T12:32:11Z) - Distributed gradient-based optimization in the presence of dependent
aperiodic communication [4.34720256795424]
Iterative distributed optimization algorithms involve multiple agents that communicate with each other, over time, in order to minimize/maximize a global objective.
In the presence of unreliable communication networks, the Age-of-Information (AoI), which measures the freshness of data received, may be large and hence hinder algorithmic convergence.
We show that convergence is guaranteed provided the random variables associated with the AoI processes are stochastically dominated by a random variable with finite first moment.
arXiv Detail & Related papers (2022-01-27T06:44:04Z) - A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning [3.655021726150368]
In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment, as well as with each other, to solve a shared problem in sequential decision-making.
We derive a novel law of the iterated logarithm for a family of distributed nonlinear stochastic approximation schemes that is useful in MARL.
arXiv Detail & Related papers (2021-10-27T08:01:17Z) - Fast Rates for the Regret of Offline Reinforcement Learning [69.23654172273085]
We study the regret of reinforcement learning from offline data generated by a fixed behavior policy in an infinite-horizon discounted Markov decision process (MDP).
We show that given any estimate for the optimal quality function $Q^*$, the regret of the policy it defines converges at a rate given by the exponentiation of the $Q^*$-estimate's pointwise convergence rate.
arXiv Detail & Related papers (2021-01-31T16:17:56Z) - A Unified Theory of Decentralized SGD with Changing Topology and Local
Updates [70.9701218475002]
We introduce a unified convergence analysis of decentralized communication methods.
We derive universal convergence rates for several applications.
Our proofs rely on weak assumptions.
arXiv Detail & Related papers (2020-03-23T17:49:15Z)