Federated Reinforcement Learning in Heterogeneous Environments
- URL: http://arxiv.org/abs/2507.14487v1
- Date: Sat, 19 Jul 2025 05:06:38 GMT
- Title: Federated Reinforcement Learning in Heterogeneous Environments
- Authors: Ukjo Hwang, Songnam Hong
- Abstract summary: We investigate a Federated Reinforcement Learning with Environment Heterogeneity (FRL-EH) framework, where local environments exhibit statistical heterogeneity. Within this framework, agents collaboratively learn a global policy by aggregating their collective experiences while preserving the privacy of their local trajectories. We present a novel global objective function that ensures robust performance across heterogeneous local environments and their plausible perturbations. We extend FedRQ to environments with continuous state space through the use of expectile loss, addressing the key challenge of minimizing a value function over a continuous subset of the state space.
- Score: 9.944647907864255
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We investigate a Federated Reinforcement Learning with Environment Heterogeneity (FRL-EH) framework, where local environments exhibit statistical heterogeneity. Within this framework, agents collaboratively learn a global policy by aggregating their collective experiences while preserving the privacy of their local trajectories. To better reflect real-world scenarios, we introduce a robust FRL-EH framework by presenting a novel global objective function. This function is specifically designed to optimize a global policy that ensures robust performance across heterogeneous local environments and their plausible perturbations. We propose a tabular FRL algorithm named FedRQ and theoretically prove its asymptotic convergence to an optimal policy for the global objective function. Furthermore, we extend FedRQ to environments with continuous state space through the use of expectile loss, addressing the key challenge of minimizing a value function over a continuous subset of the state space. This advancement facilitates the seamless integration of the principles of FedRQ with various Deep Neural Network (DNN)-based RL algorithms. Extensive empirical evaluations validate the effectiveness and robustness of our FRL algorithms across diverse heterogeneous environments, consistently achieving superior performance over the existing state-of-the-art FRL algorithms.
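The continuous-state extension hinges on the expectile loss: an asymmetric squared loss whose minimizer, at a small expectile level, approximates a minimum over nearby states rather than a mean. The snippet below is a minimal NumPy illustration of that property only; the function names, step size, and synthetic values are illustrative assumptions, not the authors' FedRQ implementation.

```python
import numpy as np

def expectile_loss(pred, target, tau):
    """Asymmetric squared loss. With tau < 0.5, over-estimation (pred > target)
    is penalized more heavily than under-estimation, so the minimizer sits in
    the lower tail of the target distribution, a smooth surrogate for a minimum."""
    diff = target - pred
    weight = np.where(diff > 0, tau, 1.0 - tau)
    return np.mean(weight * diff ** 2)

def fit_scalar(values, tau, lr=0.5, steps=500):
    """Gradient descent on the expectile loss for a single scalar prediction."""
    v = 0.0
    for _ in range(steps):
        diff = values - v
        weight = np.where(diff > 0, tau, 1.0 - tau)
        v -= lr * (-2.0 * np.mean(weight * diff))  # analytic d(loss)/dv
    return v

# Stand-in for value estimates over a neighborhood of a continuous state.
rng = np.random.default_rng(0)
values = rng.normal(loc=1.0, scale=0.5, size=1000)

v_mean = fit_scalar(values, tau=0.5)  # symmetric loss: recovers the mean
v_low = fit_scalar(values, tau=0.1)   # small expectile level: shifted toward the minimum
print("tau=0.5 fit:", round(v_mean, 3), "loss:", round(expectile_loss(v_mean, values, 0.5), 4))
print("tau=0.1 fit:", round(v_low, 3), "loss:", round(expectile_loss(v_low, values, 0.1), 4))
```

Lowering tau below 0.5 shifts the fitted value from the mean toward the lower tail, which is the mechanism the abstract invokes for minimizing a value function over a continuous subset of the state space without enumerating it.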
Related papers
- On Global Convergence Rates for Federated Policy Gradient under Heterogeneous Environment [14.366821866598803]
We introduce b-RS-FedPG, a novel policy gradient method that employs a carefully constructed softmax-inspired parameterization. We demonstrate explicit convergence rates for b-RS-FedPG toward near-optimal stationary policies.
arXiv Detail & Related papers (2025-05-29T14:08:35Z) - Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning [53.9544543607396]
We propose a novel framework that integrates reward rendering with Imitation from Observation (IfO). By instantiating F-distance in different ways, we derive two theoretical analyses and develop a practical algorithm called Accessible State Oriented Policy Regularization (ASOR). ASOR serves as a general add-on module that can be incorporated into various RL approaches, including offline RL and off-policy RL.
arXiv Detail & Related papers (2025-03-10T03:50:20Z) - Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments [17.995517050546244]
We explore a Federated Reinforcement Learning (FRL) problem where $N$ agents collaboratively learn a common policy without sharing their trajectory data.
We propose two algorithms: FedSVRPG-M and FedHAPG-M, which converge to a stationary point of the average performance function.
Our algorithms enjoy linear convergence speedups with respect to the number of agents, highlighting the benefit of collaboration among agents in finding a common policy.
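The entry above describes momentum-based aggregation of locally computed policy gradients. As a generic sketch of that pattern (not the FedSVRPG-M or FedHAPG-M updates), the toy NumPy loop below uses synthetic per-agent gradients to mimic environment heterogeneity; the dimensions, momentum coefficient, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, num_agents = 8, 5        # policy-parameter dimension and number of agents (illustrative)
theta = np.zeros(dim)         # shared policy parameters held by the server
momentum = np.zeros(dim)
beta, lr = 0.9, 0.05

def local_policy_gradient(theta, agent_id):
    """Placeholder for an agent's locally estimated policy gradient. A real agent
    would roll out its own environment and return a REINFORCE/actor-critic estimate;
    here we return a noisy pull toward an agent-specific optimum to mimic heterogeneity."""
    local_optimum = np.full(dim, agent_id * 0.1)
    return (local_optimum - theta) + 0.01 * rng.normal(size=dim)

for round_ in range(200):
    # Each agent computes a gradient on its own trajectories; no raw data is shared.
    grads = [local_policy_gradient(theta, k) for k in range(num_agents)]
    avg_grad = np.mean(grads, axis=0)
    # Server-side momentum on the aggregated direction.
    momentum = beta * momentum + (1.0 - beta) * avg_grad
    theta = theta + lr * momentum

print("learned parameters:", np.round(theta, 2))  # settles near the average of the local optima
```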
arXiv Detail & Related papers (2024-05-29T20:24:42Z) - Federated Offline Policy Optimization with Dual Regularization [12.320355780707168]
Federated Reinforcement Learning (FRL) has been deemed a promising solution for intelligent decision-making in the era of the Artificial Internet of Things.
Existing FRL approaches often entail repeated interactions with the environment during local updating, which can be prohibitively expensive or even infeasible in many real-world domains.
This paper proposes a novel offline federated policy optimization algorithm, named DRPO, which enables distributed agents to collaboratively learn a decision policy only from private and static data.
arXiv Detail & Related papers (2024-05-24T04:24:03Z) - Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning [46.28771270378047]
Federated reinforcement learning (RL) enables collaborative decision making of multiple distributed agents without sharing local data trajectories.
In this work, we consider a multi-task setting, in which each agent has its own private reward function corresponding to different tasks, while sharing the same transition kernel of the environment.
We learn a globally optimal policy that maximizes the sum of the discounted total rewards of all the agents in a decentralized manner.
arXiv Detail & Related papers (2023-11-01T00:15:18Z) - Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape [59.841889495864386]
In federated learning (FL), a cluster of local clients is coordinated by a global server.
Clients are prone to overfit to their own optima, which can deviate significantly from the global objective.
FedSMOO adopts a dynamic regularizer to steer the local optima toward the global objective.
Our theoretical analysis indicates that FedSMOO achieves a fast $\mathcal{O}(1/T)$ convergence rate with a low generalization bound.
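As a rough illustration of the two ingredients named above, a sharpness-aware (SAM-style) perturbed gradient step and a regularizer pulling local updates toward the global model, here is a toy NumPy sketch on a quadratic objective. The proximal term is a stand-in of our own and not FedSMOO's actual dynamic regularizer.

```python
import numpy as np

target = np.array([1.0, -2.0])  # toy local optimum standing in for a client's data

def grad(w):
    # Gradient of the toy local loss 0.5 * ||w - target||^2.
    return w - target

def sam_step(w, global_w, lr=0.1, rho=0.05, mu=0.1):
    """One sharpness-aware local step plus a simple proximal pull toward the global model."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascend to the worst-case point in a rho-ball
    g_sam = grad(w + eps)                        # descend using the gradient taken there
    return w - lr * (g_sam + mu * (w - global_w))

w_global = np.zeros(2)
w_local = w_global.copy()
for _ in range(200):
    w_local = sam_step(w_local, w_global)
print("local model after SAM-style steps:", np.round(w_local, 2))
```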
arXiv Detail & Related papers (2023-05-19T10:47:44Z) - Federated Learning as Variational Inference: A Scalable Expectation
Propagation Approach [66.9033666087719]
This paper extends the inference view and describes a variational inference formulation of federated learning.
We apply FedEP on standard federated learning benchmarks and find that it outperforms strong baselines in terms of both convergence speed and accuracy.
arXiv Detail & Related papers (2023-02-08T17:58:11Z) - Differentiated Federated Reinforcement Learning Based Traffic Offloading on Space-Air-Ground Integrated Networks [12.080548048901374]
This paper proposes the use of differentiated federated reinforcement learning (DFRL) to solve the traffic offloading problem in SAGIN.
Considering the differentiated characteristics of each region of SAGIN, DFRL models the traffic offloading policy optimization process.
The paper proposes a novel Differentiated Federated Soft Actor-Critic (DFSAC) algorithm to solve the problem.
arXiv Detail & Related papers (2022-12-05T07:40:29Z) - FedKL: Tackling Data Heterogeneity in Federated Reinforcement Learning
by Penalizing KL Divergence [0.0]
Federated Learning (FL) faces the communication bottleneck issue due to many rounds of model synchronization and aggregation.
Heterogeneous data further deteriorates the situation by causing slow convergence.
In this paper, we first define the type and level of data heterogeneity for policy gradient based FRL systems.
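The idea of penalizing KL divergence in policy-gradient FRL can be pictured as a local objective that trades expected advantage against a KL term keeping the local policy close to the current global policy. The snippet below is a generic single-state sketch with illustrative logits, advantages, and penalty weight; it is not FedKL's exact formulation.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def kl(p, q):
    return float(np.sum(p * np.log((p + 1e-12) / (q + 1e-12))))

global_logits = np.zeros(3)                 # current global policy (uniform here)
local_logits = np.array([2.0, -1.0, 0.5])   # a hypothetical local update
advantages = np.array([1.0, -0.5, 0.2])     # illustrative advantage estimates
beta = 0.5                                  # larger beta keeps local updates closer to the global policy

def penalized_objective(local_logits):
    """Expected advantage under the local policy minus a KL penalty toward the global policy."""
    pi_local, pi_global = softmax(local_logits), softmax(global_logits)
    return float(pi_local @ advantages) - beta * kl(pi_local, pi_global)

print("penalized local objective:", round(penalized_objective(local_logits), 3))
print("KL(local || global):", round(kl(softmax(local_logits), softmax(global_logits)), 3))
```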
arXiv Detail & Related papers (2022-04-18T01:46:59Z) - Policy Mirror Descent for Regularized Reinforcement Learning: A
Generalized Framework with Linear Convergence [60.20076757208645]
This paper proposes a general policy mirror descent (GPMD) algorithm for solving regularized RL.
We demonstrate that our algorithm converges linearly over an entire range of learning rates, in a dimension-free fashion, to the global solution.
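For context, the update that this framework generalizes is the multiplicative-weights step induced by the KL mirror map: the new policy re-weights the old one by exp(eta * Q) and renormalizes. The single-state toy below shows only that vanilla step with made-up action values, not GPMD's handling of general convex regularizers and Bregman divergences.

```python
import numpy as np

Q = np.array([1.0, 0.2, -0.5])  # illustrative action values for a single state
pi = np.ones(3) / 3             # start from the uniform policy
eta = 0.5                       # step size

for _ in range(20):
    pi = pi * np.exp(eta * Q)   # KL mirror-map (multiplicative-weights) update
    pi = pi / pi.sum()

print("policy after mirror descent steps:", np.round(pi, 3))  # mass concentrates on the best action
```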
arXiv Detail & Related papers (2021-05-24T02:21:34Z) - Deep Reinforcement Learning with Robust and Smooth Policy [90.78795857181727]
We propose to learn a smooth policy that behaves smoothly with respect to states.
We develop a new framework, Smooth Regularized Reinforcement Learning (SR$^2$L), where the policy is trained with smoothness-inducing regularization.
Such regularization effectively constrains the search space, and enforces smoothness in the learned policy.
arXiv Detail & Related papers (2020-03-21T00:10:29Z)
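Smoothness-inducing regularization of the kind described in the last entry can be pictured as penalizing how far the action distribution moves under small state perturbations. The sketch below approximates that penalty with random perturbations of a toy linear policy; the actual method solves an inner maximization over perturbations rather than sampling them, so this is only an illustration with assumed shapes and constants.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(3, 4))  # toy linear policy: logits = W @ state

def policy(state):
    logits = W @ state
    z = np.exp(logits - logits.max())
    return z / z.sum()

def smoothness_penalty(state, eps=0.05, n_samples=8):
    """Largest movement of the action distribution over sampled perturbations
    inside an eps-ball around the state (a crude proxy for the worst case)."""
    base = policy(state)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.normal(size=state.shape)
        delta = eps * delta / (np.linalg.norm(delta) + 1e-12)
        worst = max(worst, float(np.abs(base - policy(state + delta)).sum()))
    return worst

state = rng.normal(size=4)
print("smoothness penalty at a sample state:", round(smoothness_penalty(state), 4))
# During training, this penalty would be added to the RL objective with a weight.
```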
This list is automatically generated from the titles and abstracts of the papers on this site.