The Sample-Communication Complexity Trade-off in Federated Q-Learning
- URL: http://arxiv.org/abs/2408.16981v2
- Date: Tue, 29 Oct 2024 20:37:04 GMT
- Title: The Sample-Communication Complexity Trade-off in Federated Q-Learning
- Authors: Sudeep Salgia, Yuejie Chi,
- Abstract summary: We investigate the trade-off between sample and communication complexities for the widely used class of intermittent communication algorithms.
We propose a new algorithm, called Fed-DVR-Q, which is the first federated Q-learning algorithm to simultaneously achieve order-optimal sample and communication complexities.
- Score: 31.644851830271755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of federated Q-learning, where $M$ agents aim to collaboratively learn the optimal Q-function of an unknown infinite-horizon Markov decision process with finite state and action spaces. We investigate the trade-off between sample and communication complexities for the widely used class of intermittent communication algorithms. We first establish the converse result, where it is shown that a federated Q-learning algorithm that offers any speedup with respect to the number of agents in the per-agent sample complexity needs to incur a communication cost of at least an order of $\frac{1}{1-\gamma}$ up to logarithmic factors, where $\gamma$ is the discount factor. We also propose a new algorithm, called Fed-DVR-Q, which is the first federated Q-learning algorithm to simultaneously achieve order-optimal sample and communication complexities. Thus, together these results provide a complete characterization of the sample-communication complexity trade-off in federated Q-learning.
Related papers
- Federated Q-Learning with Reference-Advantage Decomposition: Almost Optimal Regret and Logarithmic Communication Cost [4.895986534376972]
We propose a novel model-free federated Q-learning algorithm, termed FedQ-Advantage.
Our algorithm not only requires a lower logarithmic communication cost but also achieves an almost optimal regret, reaching the information bound up to a logarithmic factor and near-linear regret speedup compared to its single-agent counterpart when the time horizon is sufficiently large.
arXiv Detail & Related papers (2024-05-29T06:26:52Z) - On the Necessity of Collaboration for Online Model Selection with Decentralized Data [53.244188985271606]
We consider online model selection with decentralized data over $M$ clients, and study the necessity of collaboration among clients.
Our results show (i) collaboration is unnecessary in the absence of computational constraints on clients; (ii) collaboration is necessary if the computational cost on each client is limited to $o(K)$, where $K$ is the number of candidate hypothesis spaces.
arXiv Detail & Related papers (2024-04-15T06:32:28Z) - Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA)
We propose a novel adaptive QA framework, that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z) - Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy
Optimization [21.30645601474163]
Original Q-learning suffers from performance and complexity challenges across very large networks.
New model-free ensemble reinforcement learning algorithm which adapts the classical Q-learning is proposed to handle these challenges.
Numerical results show that the proposed algorithm can achieve up to 55% less average policy error with up to 50% less runtime complexity.
arXiv Detail & Related papers (2024-02-08T08:08:23Z) - Federated Q-Learning: Linear Regret Speedup with Low Communication Cost [4.380110270510058]
We propose two federated Q-Learning algorithms termed as FedQ-Hoeffding and FedQ-Bernstein.
We show that the corresponding total regrets achieve a linear speedup compared with their single-agent counterparts when the time horizon is sufficiently large.
Those results rely on an event-triggered synchronization mechanism between the agents and the server.
arXiv Detail & Related papers (2023-12-22T19:14:09Z) - Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets)
Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z) - The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup
and Beyond [44.43850105124659]
We consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on local data alone.
We provide sample complexity guarantees for both the synchronous and asynchronous variants of federated Q-learning.
We propose a novel federated Q-learning algorithm with importance averaging, giving larger weights to more frequently visited state-action pairs.
arXiv Detail & Related papers (2023-05-18T04:18:59Z) - Cooperative Multi-Agent Reinforcement Learning: Asynchronous
Communication and Linear Function Approximation [77.09836892653176]
We study multi-agent reinforcement learning in the setting of episodic Markov decision processes.
We propose a provably efficient algorithm based on value that enable asynchronous communication.
We show that a minimal $Omega(dM)$ communication complexity is required to improve the performance through collaboration.
arXiv Detail & Related papers (2023-05-10T20:29:29Z) - Communication-Efficient Federated Bilevel Optimization with Local and
Global Lower Level Problems [118.00379425831566]
We propose a communication-efficient algorithm, named FedBiOAcc.
We prove that FedBiOAcc-Local converges at the same rate for this type of problems.
Empirical results show superior performance of our algorithms.
arXiv Detail & Related papers (2023-02-13T21:28:53Z) - INTERACT: Achieving Low Sample and Communication Complexities in
Decentralized Bilevel Learning over Networks [24.02913189682224]
Decentralized bilevel optimization problems have received increasing attention in the networking machine learning communities.
Low sample and communication complexities are two fundamental challenges that remain under-explored.
Our work is the first that both low sample and communication complexities for solving decentralized bilevel optimization problems over networks.
arXiv Detail & Related papers (2022-07-27T04:19:28Z) - Tightening the Dependence on Horizon in the Sample Complexity of
Q-Learning [59.71676469100807]
This work sharpens the sample complexity of synchronous Q-learning to an order of $frac|mathcalS|| (1-gamma)4varepsilon2$ for any $0varepsilon 1$.
Our finding unveils the effectiveness of vanilla Q-learning, which matches that of speedy Q-learning without requiring extra computation and storage.
arXiv Detail & Related papers (2021-02-12T14:22:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.