FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation
- URL: http://arxiv.org/abs/2502.00870v1
- Date: Sun, 02 Feb 2025 18:44:08 GMT
- Title: FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation
- Authors: Wenzheng Jiang, Ji Wang, Xiongtao Zhang, Weidong Bao, Cheston Tan, Flint Xiaofeng Fan
- Abstract summary: This paper investigates Federated Reinforcement Learning (FedRL) in black-box settings with heterogeneous agents.
FedHPD shows significant improvements across various reinforcement learning benchmark tasks.
- Score: 9.705155801292953
- Abstract: Federated Reinforcement Learning (FedRL) improves sample efficiency while preserving privacy; however, most existing studies assume homogeneous agents, limiting its applicability in real-world scenarios. This paper investigates FedRL in black-box settings with heterogeneous agents, where each agent employs distinct policy networks and training configurations without disclosing their internal details. Knowledge Distillation (KD) is a promising method for facilitating knowledge sharing among heterogeneous models, but it faces challenges related to the scarcity of public datasets and limitations in knowledge representation when applied to FedRL. To address these challenges, we propose Federated Heterogeneous Policy Distillation (FedHPD), which solves the problem of heterogeneous FedRL by utilizing action probability distributions as a medium for knowledge sharing. We provide a theoretical analysis of FedHPD's convergence under standard assumptions. Extensive experiments corroborate that FedHPD shows significant improvements across various reinforcement learning benchmark tasks, further validating our theoretical findings. Moreover, additional experiments demonstrate that FedHPD operates effectively without the need for an elaborate selection of public datasets.
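To make the knowledge-sharing step described above concrete, here is a minimal PyTorch sketch of distillation via action probability distributions computed on a shared public state set. The network sizes, the simple averaging rule on the server, the KL-based distillation objective, and names such as `PolicyNet` and `federated_distillation_round` are illustrative assumptions, not the paper's exact algorithm; local RL training between communication rounds is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyNet(nn.Module):
    """Each agent may use a different hidden size, modelling heterogeneous architectures."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, act_dim)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Action probability distribution over a discrete action space.
        return F.softmax(self.net(obs), dim=-1)

def federated_distillation_round(agents, optimizers, public_states, epochs: int = 1):
    """One communication round: agents share action probabilities on public states,
    the server averages them, and each agent distills toward the consensus."""
    with torch.no_grad():
        local_probs = [agent(public_states) for agent in agents]
        consensus = torch.stack(local_probs).mean(dim=0)  # assumed: simple mean aggregation

    for agent, opt in zip(agents, optimizers):
        for _ in range(epochs):
            probs = agent(public_states)
            # KL(consensus || agent) as an assumed distillation objective.
            loss = F.kl_div((probs + 1e-8).log(), consensus, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()

# Toy usage: three heterogeneous agents on a 4-dimensional observation, 2-action task.
obs_dim, act_dim = 4, 2
agents = [PolicyNet(obs_dim, act_dim, hidden=h) for h in (32, 64, 128)]
optimizers = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in agents]
public_states = torch.randn(256, obs_dim)  # stand-in for a public state set
federated_distillation_round(agents, optimizers, public_states)
```

Because only probability vectors cross the network, the agents' architectures, parameters, and training configurations stay private, which is what allows the black-box, heterogeneous setting the abstract describes.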
Related papers
- Exploratory Diffusion Policy for Unsupervised Reinforcement Learning [28.413426177336703]
Unsupervised reinforcement learning aims to pre-train agents by exploring states or skills in reward-free environments.
Existing methods often overlook the fitting ability of pre-trained policies and struggle to handle heterogeneous pre-training data.
We propose Exploratory Diffusion Policy (EDP), which leverages the strong expressive ability of diffusion models to fit the explored data.
arXiv Detail & Related papers (2025-02-11T05:48:51Z)
- Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques [65.55451717632317]
We study Preference-Based Multi-Agent Reinforcement Learning (PbMARL).
We identify the Nash equilibrium from a preference-only offline dataset in general-sum games.
Our findings underscore the multifaceted approach required for PbMARL.
arXiv Detail & Related papers (2024-09-01T13:14:41Z)
- Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z)
- Multiply Robust Federated Estimation of Targeted Average Treatment Effects [0.0]
We propose a novel approach to derive valid causal inferences for a target population using multi-site data.
Our methodology incorporates transfer learning to estimate ensemble weights to combine information from source sites.
arXiv Detail & Related papers (2023-09-22T03:15:08Z)
- Selective Knowledge Sharing for Privacy-Preserving Federated Distillation without A Good Teacher [52.2926020848095]
Federated learning is vulnerable to white-box attacks and struggles to adapt to heterogeneous clients.
This paper proposes a selective knowledge sharing mechanism for federated distillation (FD), termed Selective-FD.
arXiv Detail & Related papers (2023-04-04T12:04:19Z)
- Combating Exacerbated Heterogeneity for Robust Models in Federated Learning [91.88122934924435]
The combination of adversarial training and federated learning can lead to undesired robustness deterioration.
We propose a novel framework called Slack Federated Adversarial Training (SFAT).
We verify the rationality and effectiveness of SFAT on various benchmarked and real-world datasets.
arXiv Detail & Related papers (2023-03-01T06:16:15Z)
- FedHQL: Federated Heterogeneous Q-Learning [32.01715758422344]
Federated Reinforcement Learning (FedRL) encourages distributed agents to learn collectively from each other's experience to improve their performance without exchanging their raw trajectories.
In real-world applications, agents often disagree about the architecture and parameters, possibly because of disparate computational budgets.
We present the unique challenges this new setting poses and propose the Federated Heterogeneous Q-Learning (FedHQL) algorithm that principally addresses these challenges.
arXiv Detail & Related papers (2023-01-26T14:39:34Z)
- Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes [93.61202366677526]
We study offline reinforcement learning (RL) in the face of unmeasured confounders.
We propose various policy learning methods with the finite-sample suboptimality guarantee of finding the optimal in-class policy.
arXiv Detail & Related papers (2022-09-18T22:03:55Z)
- Reinforcement Learning with Heterogeneous Data: Estimation and Inference [84.72174994749305]
We introduce the K-Heterogeneous Markov Decision Process (K-Hetero MDP) to address sequential decision problems with population heterogeneity.
We propose the Auto-Clustered Policy Evaluation (ACPE) for estimating the value of a given policy, and the Auto-Clustered Policy Iteration (ACPI) for estimating the optimal policy in a given policy class.
We present simulations to support our theoretical findings, and we conduct an empirical study on the standard MIMIC-III dataset.
arXiv Detail & Related papers (2022-01-31T20:58:47Z)
- Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee [25.555844784263236]
We propose the first Federated Reinforcement Learning framework that tolerates fewer than half of the participating agents experiencing random system failures or acting as adversarial attackers.
All theoretical results are empirically verified on various RL benchmark tasks.
arXiv Detail & Related papers (2021-10-26T23:01:22Z)
- Deep Stable Learning for Out-Of-Distribution Generalization [27.437046504902938]
Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution.
Eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models.
We propose to address this problem by removing the dependencies between features via learning weights for training samples.
arXiv Detail & Related papers (2021-04-16T03:54:21Z)