Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling
- URL: http://arxiv.org/abs/2206.10185v1
- Date: Tue, 21 Jun 2022 08:39:12 GMT
- Title: Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling
- Authors: Sajad Khodadadian, Pranay Sharma, Gauri Joshi, Siva Theja Maguluri
- Abstract summary: We consider a federated reinforcement learning framework where multiple agents collaboratively learn a global model.
We propose federated versions of on-policy TD, off-policy TD and Q-learning, and analyze their convergence.
We are the first to consider Markovian noise and multiple local updates, and prove a linear convergence speedup with respect to the number of agents.
- Score: 17.943014287720395
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since reinforcement learning algorithms are notoriously data-intensive, the
task of sampling observations from the environment is usually split across
multiple agents. However, transferring these observations from the agents to a
central location can be prohibitively expensive in terms of the communication
cost, and it can also compromise the privacy of each agent's local behavior
policy. In this paper, we consider a federated reinforcement learning framework
where multiple agents collaboratively learn a global model, without sharing
their individual data and policies. Each agent maintains a local copy of the
model and updates it using locally sampled data. Although having N agents
enables the sampling of N times more data, it is not clear whether this leads
to a proportional convergence speedup. We propose federated versions of on-policy
TD, off-policy TD and Q-learning, and analyze their convergence. For all these
algorithms, to the best of our knowledge, we are the first to consider
Markovian noise and multiple local updates, and prove a linear convergence
speedup with respect to the number of agents. To obtain these results, we show
that federated TD and Q-learning are special cases of a general framework for
federated stochastic approximation with Markovian noise, and we leverage this
framework to provide a unified convergence analysis that applies to all the
algorithms.
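To make the scheme concrete, below is a minimal, hypothetical sketch of one instance, federated TD(0) with linear function approximation: each of N agents runs K local TD updates along its own Markovian trajectory, and a central server then averages the local parameters. The synthetic MDP, feature map, step size, and synchronization period are illustrative assumptions, not the paper's exact construction or experimental setup.

```python
# Minimal sketch of federated TD(0) with linear function approximation:
# N agents run K local TD updates on their own Markovian trajectories,
# then a server averages the local parameters. All constants below
# (MDP, features, step size, local-step count) are illustrative.
import numpy as np

rng = np.random.default_rng(0)

S, d = 10, 4               # number of states, feature dimension
gamma, alpha = 0.9, 0.05   # discount factor, step size
N, K, rounds = 8, 20, 200  # agents, local steps per round, communication rounds

P = rng.dirichlet(np.ones(S), size=S)  # transition matrix of the evaluated policy
r = rng.uniform(0, 1, size=S)          # expected reward per state
Phi = rng.normal(size=(S, d))          # feature matrix; row s is phi(s)

def td_local(theta, s, steps):
    """Run `steps` TD(0) updates along one Markovian trajectory."""
    theta = theta.copy()
    for _ in range(steps):
        s_next = rng.choice(S, p=P[s])
        # TD error computed with the agent's current local parameters.
        delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
        theta += alpha * delta * Phi[s]
        s = s_next
    return theta, s

theta = np.zeros(d)                  # global model
states = rng.integers(0, S, size=N)  # each agent keeps its own Markov chain state
for _ in range(rounds):
    local_models = []
    for i in range(N):               # in a real system these run in parallel
        theta_i, states[i] = td_local(theta, states[i], K)
        local_models.append(theta_i)
    theta = np.mean(local_models, axis=0)  # server averages the N local models

print("learned value-function parameters:", np.round(theta, 3))
```

Off-policy TD and Q-learning fit the same communicate-then-average loop with a different local update rule, which is what the paper's unified federated stochastic approximation analysis exploits.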
Related papers
- Achieving Tighter Finite-Time Rates for Heterogeneous Federated Stochastic Approximation under Markovian Sampling [6.549288471493216]
We study a generic federated stochastic approximation problem involving $M$ agents.
The goal is for the agents to communicate intermittently via a server to find the root of the average of the agents' local operators.
We develop a novel algorithm called FedHSA, and prove that it guarantees convergence to the correct point.
arXiv Detail & Related papers (2025-04-15T22:13:55Z)
- Self-Localized Collaborative Perception [49.86110931859302]
We propose CoBEVGlue, a novel self-localized collaborative perception system.
At its core is a novel spatial alignment module, which provides the relative poses between agents.
CoBEVGlue achieves state-of-the-art detection performance under arbitrary localization noise and attacks.
arXiv Detail & Related papers (2024-06-18T15:26:54Z)
- FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation [7.052566906745796]
FedLPA is a layer-wise posterior aggregation method for federated learning.
We show that FedLPA significantly improves learning performance over state-of-the-art methods across several metrics.
arXiv Detail & Related papers (2023-09-30T10:51:27Z)
- Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection.
We find that the difference in logits between the local and global models increases as the model is continuously updated.
We propose a new algorithm, FedCSD, which performs class-prototype similarity distillation in a federated framework to align the local and global models.
arXiv Detail & Related papers (2023-08-20T04:41:01Z)
- On the Convergence of Heterogeneous Federated Learning with Arbitrary Adaptive Online Model Pruning [15.300983585090794]
We present a unifying framework for heterogeneous FL algorithms with arbitrary adaptive online model pruning.
In particular, we prove that under certain sufficient conditions, these algorithms converge to a stationary point of standard FL for general smooth cost functions.
We illuminate two key factors impacting convergence: pruning-induced noise and minimum coverage index.
arXiv Detail & Related papers (2022-01-27T20:43:38Z)
- Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming [41.30044824711509]
We focus on the case where the global reward is a sum of local rewards, the joint policy factorizes into the agents' marginals, and the state is fully observable.
We develop multi-agent extensions, whereby agents solve their local saddle point problems and then perform local weighted averaging.
We establish that the sample complexity to obtain near-globally optimal solutions matches tight dependencies on the cardinality of the state and action spaces.
arXiv Detail & Related papers (2021-10-22T03:48:41Z)
- Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning [22.310861786709538]
We propose a scalable algorithm for cooperative multi-agent reinforcement learning.
We show that our algorithm converges to the globally optimal policy with a dimension-free statistical and computational complexity.
arXiv Detail & Related papers (2021-09-23T23:38:15Z)
- Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution (CTDE) paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
- Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNNs).
Our approach enables multi-agent algorithms based on global state information to function by ensuring that this information is available at each robot.
We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
arXiv Detail & Related papers (2021-03-08T21:48:55Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local updates of the low-dimensional local parameters for every update of the shared representation (see the sketch after this list).
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- Multi-Agent Reinforcement Learning in Stochastic Networked Systems [30.78949372661673]
We study multi-agent reinforcement learning (MARL) in a network of agents.
The objective is to find localized policies that maximize the (discounted) global reward.
arXiv Detail & Related papers (2020-06-11T16:08:16Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
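As noted in the "Exploiting Shared Representations for Personalized Federated Learning" entry above, that algorithm alternates many cheap local updates of each client's low-dimensional head with averaged updates of a shared representation. The following is a minimal, hypothetical sketch of such an alternating scheme on a synthetic linear model; the data, dimensions, and step sizes are illustrative assumptions, not the paper's exact algorithm.

```python
# Hypothetical sketch of an alternating shared-representation scheme:
# each client runs several local updates of its low-dimensional head,
# then the server averages one gradient step on the shared representation.
# Model, data, and step sizes are illustrative, not the paper's setup.
import numpy as np

rng = np.random.default_rng(1)
d, k, n_clients, n = 20, 3, 5, 50          # ambient dim, head dim, clients, samples

B = 0.1 * rng.normal(size=(d, k))          # shared representation (d x k)
heads = [0.1 * rng.normal(size=k) for _ in range(n_clients)]

# Synthetic per-client regression data: y = X B* w_i + noise.
B_star = rng.normal(size=(d, k)) / np.sqrt(d)
data = []
for _ in range(n_clients):
    w_i = rng.normal(size=k)
    X = rng.normal(size=(n, d))
    data.append((X, X @ B_star @ w_i + 0.01 * rng.normal(size=n)))

lr_head, lr_rep, head_steps = 0.05, 0.02, 10
for _ in range(200):                       # communication rounds
    rep_grads = []
    for i, (X, y) in enumerate(data):
        # Many cheap local updates of the k-dimensional head w_i.
        for _ in range(head_steps):
            resid = X @ B @ heads[i] - y
            heads[i] -= lr_head * (X @ B).T @ resid / n
        # One gradient step for the shared representation B.
        resid = X @ B @ heads[i] - y
        rep_grads.append(np.outer(X.T @ resid / n, heads[i]))
    B -= lr_rep * np.mean(rep_grads, axis=0)  # server averages the rep. gradients

losses = [float(np.mean((X @ B @ heads[i] - y) ** 2)) for i, (X, y) in enumerate(data)]
print("per-client training MSE:", np.round(losses, 4))
```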