Related papers: Federated Control in Markov Decision Processes

URL: http://arxiv.org/abs/2405.04026v1
Date: Tue, 7 May 2024 05:59:10 GMT
Title: Federated Control in Markov Decision Processes
Authors: Hao Jin, Yang Peng, Liangyu Zhang, Zhihua Zhang
Abstract summary: We study problems of federated control in Markov Decision Processes. We propose a novel communication protocol that periodically aggregates agents' knowledge of their restricted regions. Specifically, FedQ-X has been shown to enjoy linear speedup in terms of sample complexity when workload is uniformly distributed among agents.
Score: 23.086904790247576
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study problems of federated control in Markov Decision Processes. To solve an MDP with a large state space, multiple learning agents are introduced to collaboratively learn its optimal policy without communicating locally collected experience. In our setting, these agents have limited capabilities, which means they are restricted to different regions of the overall state space during the training process. In the face of the differences among restricted regions, we first introduce the concept of leakage probabilities to understand how such heterogeneity affects the learning process, and then propose a novel communication protocol that we call the Federated-Q protocol (FedQ), which periodically aggregates agents' knowledge of their restricted regions and accordingly modifies their learning problems for further training. In terms of theoretical analysis, we justify the correctness of FedQ as a communication protocol, then give a general result on the sample complexity of the derived algorithms FedQ-X with the RL oracle, and finally conduct a thorough study of the sample complexity of FedQ-SynQ. Specifically, FedQ-X has been shown to enjoy linear speedup in terms of sample complexity when the workload is uniformly distributed among agents. Moreover, we carry out experiments in various environments to justify the efficiency of our methods.
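
To make the idea of "periodically aggregating agents' knowledge of their restricted regions" more concrete, here is a minimal Python sketch. It is not the paper's FedQ protocol: it ignores leakage probabilities and the modification of local learning problems, and simply averages tabular Q-estimates per state-action pair over the agents whose region contains that state. The names `QTable`, `aggregate`, and `broadcast` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: (state, action) -> value estimate.
QTable = Dict[Tuple[int, int], float]


def aggregate(q_tables: List[QTable], regions: List[Set[int]]) -> QTable:
    """Average each entry over the agents whose region contains its state.

    Assumption: plain averaging stands in for the aggregation rule;
    the actual FedQ rule in the paper may differ.
    """
    sums: QTable = {}
    counts: Dict[Tuple[int, int], int] = {}
    for q, region in zip(q_tables, regions):
        for (state, action), value in q.items():
            if state not in region:
                continue  # ignore estimates outside the agent's own region
            key = (state, action)
            sums[key] = sums.get(key, 0.0) + value
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def broadcast(global_q: QTable, n_agents: int) -> List[QTable]:
    """Give every agent its own copy of the aggregated table."""
    return [dict(global_q) for _ in range(n_agents)]


# Two agents with overlapping regions {0, 1} and {1, 2}.
local_tables = [
    {(0, 0): 1.0, (1, 0): 2.0},
    {(1, 0): 4.0, (2, 0): 5.0},
]
regions = [{0, 1}, {1, 2}]
merged = aggregate(local_tables, regions)
print(merged)
```

In this toy example, only state 1 is shared by both agents, so its entry is the average of the two local estimates, while the entries for states 0 and 2 are taken from the single agent responsible for them. No locally collected experience is exchanged, only the value estimates.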

Related papers

The Sample-Communication Complexity Trade-off in Federated Q-Learning [31.644851830271755]
We investigate the trade-off between sample and communication complexities for the widely used class of intermittent communication algorithms. We propose a new algorithm, called Fed-DVR-Q, which is the first federated Q-learning algorithm to simultaneously achieve order-optimal sample and communication complexities.
arXiv Detail & Related papers (2024-08-30T03:03:03Z)
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents. Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning [8.632943870358627]
Federated reinforcement learning (FRL) has emerged as a promising paradigm for reducing the sample complexity of reinforcement learning tasks. We introduce FedSARSA, a novel on-policy reinforcement learning scheme equipped with linear function approximation. We show that FedSARSA converges to a policy that is near-optimal for all agents, with the extent of near-optimality proportional to the level of heterogeneity.
arXiv Detail & Related papers (2024-01-27T02:43:45Z)
Federated Q-Learning: Linear Regret Speedup with Low Communication Cost [4.380110270510058]
We propose two federated Q-Learning algorithms termed as FedQ-Hoeffding and FedQ-Bernstein. We show that the corresponding total regrets achieve a linear speedup compared with their single-agent counterparts when the time horizon is sufficiently large. Those results rely on an event-triggered synchronization mechanism between the agents and the server.
arXiv Detail & Related papers (2023-12-22T19:14:09Z)
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond [44.43850105124659]
We consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on local data alone. We provide sample complexity guarantees for both the synchronous and asynchronous variants of federated Q-learning. We propose a novel federated Q-learning algorithm with importance averaging, giving larger weights to more frequently visited state-action pairs.
arXiv Detail & Related papers (2023-05-18T04:18:59Z)
Federated TD Learning over Finite-Rate Erasure Channels: Linear Speedup under Markovian Sampling [17.870440210358847]
We study a federated policy evaluation problem where agents communicate via a central aggregator to expedite the evaluation of a common policy. To capture typical communication constraints in FL, we consider finite capacity up-link channels that can drop packets based on a Bernoulli erasure model. Our work is the first to provide a non-asymptotic analysis of their effects in multi-agent and federated reinforcement learning.
arXiv Detail & Related papers (2023-05-14T08:48:02Z)
On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees. We study this question in a general framework for interactive decision making with multiple agents. We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
GradSkip: Communication-Accelerated Local Gradient Methods with Better Computational Complexity [54.585248253601314]
We study a class of distributed optimization algorithms that aim to alleviate high communication costs by allowing clients to perform multiple local gradient-type training steps before communication. In particular, we prove that our modified method, GradSkip, converges linearly under the same assumptions and has the same accelerated communication complexity.
arXiv Detail & Related papers (2022-10-28T20:59:06Z)
Provable Reinforcement Learning with a Short-Term Memory [68.00677878812908]
We study a new subclass of POMDPs, whose latent states can be decoded by the most recent history of a short length $m$. In particular, in the rich-observation setting, we develop new algorithms using a novel "moment matching" approach with a sample complexity that scales exponentially. Our results show that a short-term memory suffices for reinforcement learning in these environments.
arXiv Detail & Related papers (2022-02-08T16:39:57Z)
Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices. We make a very general assumption on the computational network that covers the settings of fully decentralized calculations. We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
Distributed Q-Learning with State Tracking for Multi-agent Networked Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network. We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.