Fully Asynchronous Policy Evaluation in Distributed Reinforcement
Learning over Networks
- URL: http://arxiv.org/abs/2003.00433v3
- Date: Fri, 22 Jan 2021 16:45:28 GMT
- Title: Fully Asynchronous Policy Evaluation in Distributed Reinforcement
Learning over Networks
- Authors: Xingyu Sha, Jiaqi Zhang, Keyou You, Kaiqing Zhang and Tamer Başar
- Abstract summary: This paper proposes a \emph{fully asynchronous} scheme for the policy evaluation problem of distributed reinforcement learning (DisRL) over directed peer-to-peer networks.
Without waiting for any other node of the network, each node can locally update its value function at any time by using (possibly delayed) information from its neighbors.
- Score: 14.636457985379746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a \emph{fully asynchronous} scheme for the policy
evaluation problem of distributed reinforcement learning (DisRL) over directed
peer-to-peer networks. Without waiting for any other node of the network, each
node can locally update its value function at any time by using (possibly
delayed) information from its neighbors. This is in sharp contrast to the
gossip-based scheme where a pair of nodes concurrently update. Though the fully
asynchronous setting involves a difficult multi-timescale decision problem, we
design a novel stochastic average gradient (SAG) based distributed algorithm
and develop a push-pull augmented graph approach to prove its exact convergence
at a linear rate of $\mathcal{O}(c^k)$ where $c\in(0,1)$ and $k$ increases by
one regardless of which node updates. Finally, numerical experiments validate
that our method speeds up linearly with respect to the number of nodes, and is
robust to straggler nodes.
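For intuition, here is a minimal toy sketch of the fully asynchronous update pattern the abstract describes: at every tick an arbitrary node wakes up and refines its local value table using possibly delayed neighbor estimates, with no synchronization barrier. The update rule below is a plain TD(0)-plus-consensus step, not the paper's SAG-based algorithm, and all constants and names are illustrative assumptions.

```python
import numpy as np

# Toy simulation of fully asynchronous policy evaluation (illustrative only;
# a plain TD(0)-plus-consensus update, not the paper's SAG-based rule).
rng = np.random.default_rng(0)
n_nodes, n_states = 4, 5
gamma, alpha = 0.9, 0.1
max_delay = 3                                   # bound on information staleness

# Directed ring: node i pulls (possibly stale) values from node i-1.
neighbors = {i: [(i - 1) % n_nodes] for i in range(n_nodes)}

V = np.zeros((n_nodes, n_states))               # each node's local value table
history = [V.copy()]                            # past iterates model the delays

for k in range(2000):
    i = int(rng.integers(n_nodes))              # an arbitrary node wakes up
    d = int(rng.integers(min(len(history), max_delay)))
    stale = history[-1 - d]                     # delayed neighbor information

    # Mix own estimate with the (delayed) neighbor estimate, then take a local
    # TD(0) step on one transition of a toy cyclic Markov chain.
    mixed = 0.5 * V[i] + 0.5 * stale[neighbors[i][0]]
    s = int(rng.integers(n_states))
    s_next = (s + 1) % n_states
    r = 1.0 if s_next == 0 else 0.0
    mixed[s] += alpha * (r + gamma * mixed[s_next] - mixed[s])

    V[i] = mixed                                # only node i updates; no waiting
    history.append(V.copy())

print("max disagreement across nodes:", np.ptp(V, axis=0).max())
```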
Related papers
- Sparse Decomposition of Graph Neural Networks [20.768412002413843]
We propose an approach to reduce the number of nodes that are included during aggregation.
We achieve this through a sparse decomposition, learning to approximate node representations using a weighted sum of linearly transformed features.
We demonstrate via extensive experiments that our method outperforms other baselines designed for inference speedup.
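A minimal sketch of the stated decomposition, under the assumption that a node's aggregated representation is approximated by a weighted sum of linearly transformed raw features over a small node subset; the subset `S_i`, weights `w`, and transform `W` are illustrative placeholders, whereas the paper learns these quantities:

```python
import numpy as np

# Sketch: approximate node i's aggregated representation as a weighted sum of
# linearly transformed raw features over a small node subset S_i. The subset,
# weights, and transform are random placeholders; the actual method learns them.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 16))            # raw node features
W = rng.normal(size=(16, 8))              # shared linear transform
S_i = [3, 17, 42]                         # sparse support chosen for node i
w = rng.dirichlet(np.ones(len(S_i)))      # combination weights for node i
h_i = sum(w_j * (X[j] @ W) for w_j, j in zip(w, S_i))
print(h_i.shape)                          # (8,): the approximated representation
```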
arXiv Detail & Related papers (2024-10-25T17:52:16Z)
- A Learning Based Scheme for Fair Timeliness in Sparse Gossip Networks [41.53961199878831]
We consider a gossip network, consisting of $n$ nodes, which tracks the information at a source.
The source updates its information with a Poisson arrival process and also sends updates to the nodes in the network.
Some nodes are able to track the source in a timely manner, whereas others often fall behind by several versions.
arXiv Detail & Related papers (2023-10-02T17:55:17Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
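For reference, a sketch of the vanilla (non-kernelized) Gumbel-Softmax operator that such a scheme builds on; this is not NodeFormer's kernelized variant, which avoids materializing the full $n \times n$ edge matrix:

```python
import numpy as np

# Vanilla Gumbel-Softmax over pairwise edge logits (illustrative only;
# the kernelized variant never forms the dense n-by-n matrix below).
def gumbel_softmax(logits, rng, tau=0.5):
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))   # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max(axis=-1, keepdims=True))          # stable softmax
    return y / y.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(2)
scores = rng.normal(size=(5, 5))          # pairwise edge logits
A_soft = gumbel_softmax(scores, rng)      # soft, row-stochastic sampled edges
X = rng.normal(size=(5, 8))               # node features
H = A_soft @ X                            # all-pair message passing step
```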
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Temporal Aggregation and Propagation Graph Neural Networks for Dynamic Representation [67.26422477327179]
Temporal graphs exhibit dynamic interactions between nodes over continuous time.
We propose a novel method of temporal graph convolution over the whole neighborhood.
Our proposed TAP-GNN outperforms existing temporal graph methods by a large margin in terms of both predictive performance and online inference latency.
arXiv Detail & Related papers (2023-04-15T08:17:18Z)
- Online Centralized Non-parametric Change-point Detection via Graph-based Likelihood-ratio Estimation [77.81487285123147]
Consider each node of a graph to be generating a data stream that is synchronized and observed in near real-time.
At a change-point $\tau$, a change occurs at a subset of nodes $C$, which affects the probability distribution of their associated node streams.
We propose a novel kernel-based method to both detect $\tau$ and localize $C$, based on the direct estimation of the likelihood-ratio between the post-change and the pre-change distributions of the node streams.
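To make the estimation target concrete, the task can be read as a generic density-ratio formulation (an illustrative rendering with assumed notation, not necessarily the paper's exact statistic):

```latex
% Illustrative density-ratio view (notation p_v, q_v assumed here):
% p_v is node v's pre-change stream density, q_v its post-change density.
\[
  r_v(x) \;=\; \frac{q_v(x)}{p_v(x)}, \qquad
  \hat{C} \;=\; \bigl\{\, v \;:\; \hat{r}_v \text{ departs significantly from } 1 \,\bigr\}.
\]
```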
arXiv Detail & Related papers (2023-01-08T10:15:24Z)
- Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (\emph{i.e.}, SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with the communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
- ResNorm: Tackling Long-tailed Degree Distribution Issue in Graph Neural Networks via Normalization [80.90206641975375]
This paper focuses on improving the performance of GNNs via normalization.
By studying the long-tailed distribution of node degrees in the graph, we propose a novel normalization method for GNNs.
The \emph{scale} operation of ResNorm reshapes the node-wise standard deviation (NStd) distribution so as to improve the accuracy of tail nodes.
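A rough sketch of what a scale-style operation can look like, assuming a simple power transform of each node's standard deviation; the exponent `p` and the exact functional form are assumptions for illustration, not ResNorm's published formula:

```python
import numpy as np

# Sketch of a scale-style operation: a power transform of each node's
# standard deviation (NStd) flattens a long-tailed NStd spread, scaling
# up low-variance (typically tail) nodes relative to head nodes.
def scale(H, p=0.5, eps=1e-6):
    nstd = H.std(axis=1, keepdims=True) + eps   # one standard deviation per node
    return H / nstd ** (1.0 - p)                # new NStd is roughly nstd**p

H = np.random.default_rng(5).normal(size=(10, 32))
H *= np.linspace(0.1, 3.0, 10)[:, None]         # heterogeneous node scales
print(scale(H).std(axis=1).round(2))            # a much more uniform NStd
```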
arXiv Detail & Related papers (2022-06-16T13:49:09Z)
- Decentralized Multi-Task Stochastic Optimization With Compressed Communications [22.31884634659446]
The paper develops algorithms and obtains performance bounds for two different models of local information availability at the nodes.
We show that the deviation from the global minimum value and violations of the constraints are upper-bounded by $\mathcal{O}(T^{-\frac{1}{2}})$ and $\mathcal{O}(T^{-\frac{1}{4}})$, respectively.
arXiv Detail & Related papers (2021-12-23T05:54:42Z)
- Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous stochastic gradient descent (DASGD).
We show that DASGD converges to a global optimum under suitable assumptions on the gradient delays.
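A toy sketch of the delayed-gradient difficulty that such an analysis addresses: the gradient applied at step $k$ was computed at a stale iterate. The quadratic objective, delay model, and step sizes below are illustrative choices, not the paper's setting.

```python
import numpy as np

# Toy DASGD-style loop: the applied gradient was computed at a stale iterate
# (random delay d), which is the core difficulty delay analyses address.
rng = np.random.default_rng(6)
grad = lambda x: 2.0 * x                         # gradient of f(x) = ||x||^2
x, iterates, max_delay = np.ones(3), [], 10

for k in range(500):
    iterates.append(x.copy())
    d = int(rng.integers(min(len(iterates), max_delay)))
    g = grad(iterates[-1 - d]) + 0.01 * rng.normal(size=3)  # stale, noisy gradient
    x = x - (0.5 / (k + 1)) * g                  # vanishing step size

print("distance to the optimum:", np.linalg.norm(x))
```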
arXiv Detail & Related papers (2021-07-06T21:59:49Z)
- A hybrid variance-reduced method for decentralized stochastic non-convex optimization [15.447966950703947]
The \texttt{GT-HSGD} algorithm uses specialized local hybrid variance-reduced gradient estimators together with gradient tracking so that the network tracks the global gradient.
\texttt{GT-HSGD} achieves a network complexity of $O(n^{-1})$ when the required error tolerance $\epsilon$ is small enough.
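A sketch of the underlying gradient-tracking idea, in which an auxiliary variable tracks the network-average gradient; the hybrid variance-reduced local estimators of GT-HSGD are omitted, so this is a plain deterministic gradient-tracking loop on a toy quadratic, with all constants assumed:

```python
import numpy as np

# Gradient-tracking sketch: the auxiliary variable Y tracks the network-average
# gradient. GT-HSGD additionally uses hybrid variance-reduced local gradient
# estimators, which this toy deterministic version omits.
rng = np.random.default_rng(7)
n, dim, alpha = 5, 4, 0.05
W = np.full((n, n), 1.0 / n)               # doubly stochastic mixing matrix
b = rng.normal(size=(n, dim))              # node i holds f_i(x) = ||x - b_i||^2
grad = lambda X: 2.0 * (X - b)             # stacked local gradients

X = np.zeros((n, dim))
Y = grad(X)                                # tracker initialized at local gradients
for _ in range(300):
    X_new = W @ X - alpha * Y              # consensus step along tracked gradient
    Y = W @ Y + grad(X_new) - grad(X)      # gradient-tracking recursion
    X = X_new

print("consensus error:", np.ptp(X, axis=0).max())
print("gap to the optimum:", np.linalg.norm(X.mean(axis=0) - b.mean(axis=0)))
```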
arXiv Detail & Related papers (2021-02-12T20:13:05Z)
- On the Communication Latency of Wireless Decentralized Learning [12.977865337365856]
We consider a wireless network comprising $n$ nodes located within a circular area of radius $R$.
To enable gradient exchanges across the network, we assume each node communicates only with a set of neighboring nodes.
We show that the communication delay for a single round of exchanging gradients on all the links throughout the network scales as $\mathcal{O}\left(\frac{n^{2-3\beta}}{\beta}\log n\right)$.
arXiv Detail & Related papers (2020-02-10T20:10:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content presented here (including all information) and is not responsible for any consequences arising from its use.