The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
- URL: http://arxiv.org/abs/2305.10697v2
- Date: Tue, 12 Dec 2023 21:47:08 GMT
- Title: The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
- Authors: Jiin Woo, Gauri Joshi, Yuejie Chi
- Abstract summary: We consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on local data alone.
We provide sample complexity guarantees for both the synchronous and asynchronous variants of federated Q-learning.
We propose a novel federated Q-learning algorithm with importance averaging, giving larger weights to more frequently visited state-action pairs.
- Score: 44.43850105124659
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When the data used for reinforcement learning (RL) are collected by multiple
agents in a distributed manner, federated versions of RL algorithms allow
collaborative learning without the need for agents to share their local data.
In this paper, we consider federated Q-learning, which aims to learn an optimal
Q-function by periodically aggregating local Q-estimates trained on local data
alone. Focusing on infinite-horizon tabular Markov decision processes, we
provide sample complexity guarantees for both the synchronous and asynchronous
variants of federated Q-learning. In both cases, our bounds exhibit a linear
speedup with respect to the number of agents and near-optimal dependencies on
other salient problem parameters.
In the asynchronous setting, existing analyses of federated Q-learning, which
adopt equally weighted averaging of local Q-estimates, require that every
agent covers the entire state-action space. In contrast, our improved sample
complexity scales inversely with the minimum entry of the average
stationary state-action occupancy distribution of all agents, thus only
requiring the agents to collectively cover the entire state-action space,
unveiling the blessing of heterogeneity in enabling collaborative learning by
relaxing the coverage requirement of the single-agent case. However, its sample
complexity still suffers when the local trajectories are highly heterogeneous.
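Schematically, with $K$ agents and $\mu_k$ denoting the stationary state-action occupancy distribution induced by agent $k$'s behavior policy (this notation is ours; constants, the effective horizon, the target accuracy, and logarithmic factors are all suppressed), the coverage quantity driving the improved bound can be written as:

```latex
% Schematic restatement of the coverage condition; the notation is ours and
% all problem-parameter dependencies other than coverage are suppressed.
\mu_{\mathrm{avg}}(s,a) \;=\; \frac{1}{K}\sum_{k=1}^{K}\mu_k(s,a),
\qquad
\text{sample complexity} \;\propto\; \frac{1}{\min_{(s,a)} \mu_{\mathrm{avg}}(s,a)} .
```

Collaboration therefore only needs $\min_{(s,a)}\mu_{\mathrm{avg}}(s,a) > 0$, i.e., every state-action pair is visited by at least one agent, whereas equally weighted averaging effectively requires every agent to visit every pair.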
In response, we propose a novel federated Q-learning algorithm with importance
averaging, giving larger weights to more frequently visited state-action pairs,
which achieves a robust linear speedup as if all trajectories were centrally
processed, regardless of the heterogeneity of local behavior policies.
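
To make the overall procedure concrete, the following is a minimal sketch of federated Q-learning with periodic aggregation and an illustrative importance-averaging rule. The environment interface (`env.reset()` / `env.step(a)`), the uniform behavior policies, the step size, and the visit-count-proportional weights are assumptions for illustration only, not the paper's exact algorithm or tuning.

```python
# Minimal sketch of federated Q-learning with periodic aggregation.
# Assumptions: each env exposes reset() -> state and step(a) -> (state, reward, done),
# agents act with uniform behavior policies, and importance weights are taken
# proportional to per-round visit counts. All of these are illustrative choices.
import numpy as np

def federated_q_learning(envs, num_states, num_actions, gamma=0.99, lr=0.1,
                         rounds=200, local_steps=50,
                         importance_averaging=True, seed=0):
    rng = np.random.default_rng(seed)
    K = len(envs)                                   # number of agents
    Q_global = np.zeros((num_states, num_actions))  # server's Q-estimate
    states = [env.reset() for env in envs]

    for _ in range(rounds):
        local_Q = np.tile(Q_global, (K, 1, 1))           # broadcast to agents
        visits = np.zeros((K, num_states, num_actions))  # per-round visit counts

        # Local phase: each agent runs asynchronous Q-learning on its own
        # trajectory, generated by its (possibly heterogeneous) behavior policy.
        for k, env in enumerate(envs):
            s = states[k]
            for _ in range(local_steps):
                a = rng.integers(num_actions)            # uniform behavior policy
                s_next, r, done = env.step(a)
                target = r + gamma * local_Q[k, s_next].max()
                local_Q[k, s, a] += lr * (target - local_Q[k, s, a])
                visits[k, s, a] += 1
                s = env.reset() if done else s_next
            states[k] = s

        # Aggregation phase: the server averages the local Q-estimates.
        if importance_averaging:
            # Importance averaging (schematic): weight each agent's estimate of
            # Q(s, a) by how often it visited (s, a) this round, so agents that
            # barely saw a pair contribute little to that entry.
            w = visits / np.maximum(visits.sum(axis=0, keepdims=True), 1)
            unseen = visits.sum(axis=0) == 0
            w[:, unseen] = 1.0 / K                       # fall back to equal weights
            Q_global = (w * local_Q).sum(axis=0)
        else:
            # Equal-weighted averaging, as in prior analyses.
            Q_global = local_Q.mean(axis=0)

    return Q_global
```

Setting `importance_averaging=False` recovers the equally weighted aggregation analyzed in earlier work; the weighted variant down-weights agents whose behavior policies rarely visit a given state-action pair, which is the mechanism the abstract credits for the robust linear speedup under heterogeneous local trajectories.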
Related papers
- Federated Control in Markov Decision Processes [23.086904790247576]
We study problems of federated control in Markov Decision Processes.
We propose a novel communication protocol that periodically aggregates agents' knowledge of their restricted regions.
Specifically, FedQ-X has been shown to enjoy a linear speedup in terms of sample complexity when the workload is uniformly distributed among the agents.
arXiv Detail & Related papers (2024-05-07T05:59:10Z) - Federated Offline Reinforcement Learning: Collaborative Single-Policy
Coverage Suffices [44.97418712091146]
Offline reinforcement learning (RL) seeks to learn an optimal policy using offline data.
This work explores the benefit of federated learning for offline RL, aiming at collaboratively leveraging offline datasets at multiple agents.
FedLCB-Q is a variant of the popular model-free Q-learning algorithm tailored for federated offline RL.
arXiv Detail & Related papers (2024-02-08T18:09:17Z) - Federated Q-Learning: Linear Regret Speedup with Low Communication Cost [4.380110270510058]
We propose two federated Q-learning algorithms, termed FedQ-Hoeffding and FedQ-Bernstein.
We show that the corresponding total regrets achieve a linear speedup compared with their single-agent counterparts when the time horizon is sufficiently large.
Those results rely on an event-triggered synchronization mechanism between the agents and the server.
arXiv Detail & Related papers (2023-12-22T19:14:09Z) - On the Convergence of Heterogeneous Federated Learning with Arbitrary
Adaptive Online Model Pruning [15.300983585090794]
We present a unifying framework for heterogeneous FL algorithms with arbitrary adaptive online model pruning.
In particular, we prove that under certain sufficient conditions, these algorithms converge to a stationary point of standard FL for general smooth cost functions.
We illuminate two key factors impacting convergence: pruning-induced noise and minimum coverage index.
arXiv Detail & Related papers (2022-01-27T20:43:38Z) - Convergence Rates of Average-Reward Multi-agent Reinforcement Learning
via Randomized Linear Programming [41.30044824711509]
We focus on the case where the global reward is a sum of local rewards, the joint policy factorizes into the agents' marginals, and the state is fully observable.
We develop multi-agent extensions, whereby agents solve their local saddle point problems and then perform local weighted averaging.
We establish that the sample complexity to obtain near-globally optimal solutions matches tight dependencies on the cardinality of the state and action spaces.
arXiv Detail & Related papers (2021-10-22T03:48:41Z) - Decentralized Local Stochastic Extra-Gradient for Variational
Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Straggler-Resilient Federated Learning: Leveraging the Interplay Between
Statistical Accuracy and System Heterogeneity [57.275753974812666]
Federated learning involves learning from data samples distributed across a network of clients while the data remains local.
In this paper, we propose a novel straggler-resilient federated learning method that incorporates statistical characteristics of the clients' data to adaptively select the clients in order to speed up the learning procedure.
arXiv Detail & Related papers (2020-12-28T19:21:14Z) - Distributed Q-Learning with State Tracking for Multi-agent Networked
Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network.
We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)