FedHQL: Federated Heterogeneous Q-Learning
- URL: http://arxiv.org/abs/2301.11135v1
- Date: Thu, 26 Jan 2023 14:39:34 GMT
- Title: FedHQL: Federated Heterogeneous Q-Learning
- Authors: Flint Xiaofeng Fan, Yining Ma, Zhongxiang Dai, Cheston Tan, Bryan Kian Hsiang Low, Roger Wattenhofer
- Abstract summary: Federated Reinforcement Learning (FedRL) encourages distributed agents to learn collectively from each other's experience to improve their performance without exchanging their raw trajectories.
In real-world applications, agents often differ in their architectures and parameters, possibly because of disparate computational budgets.
We present the unique challenges this new setting poses and propose the Federated Heterogeneous Q-Learning (FedHQL) algorithm that principally addresses these challenges.
- Score: 32.01715758422344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Reinforcement Learning (FedRL) encourages distributed agents to
learn collectively from each other's experience to improve their performance
without exchanging their raw trajectories. The existing work on FedRL assumes
that all participating agents are homogeneous, which requires all agents to
share the same policy parameterization (e.g., network architectures and
training configurations). However, in real-world applications, agents often
differ in their architectures and parameters, possibly because of disparate
computational budgets. Because homogeneity is not given in
practice, we introduce the problem setting of Federated Reinforcement Learning
with Heterogeneous And bLack-box agEnts (FedRL-HALE). We present the unique
challenges this new setting poses and propose the Federated Heterogeneous
Q-Learning (FedHQL) algorithm that principally addresses these challenges. We
empirically demonstrate the efficacy of FedHQL in boosting the sample
efficiency of heterogeneous agents with distinct policy parameterization using
standard RL tasks.
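The abstract does not detail FedHQL's aggregation mechanism, but the black-box setting it introduces can be made concrete. Below is a minimal Python sketch, assuming a setting in which heterogeneous agents expose only Q-value queries, never parameters or trajectories; the names `BlackBoxAgent` and `federated_action` are illustrative, not from the paper:

```python
import numpy as np

class BlackBoxAgent:
    """Illustrative agent whose architecture and parameters stay private;
    peers interact with it only through Q-value queries."""

    def __init__(self, q_fn):
        self._q_fn = q_fn  # any callable: state -> vector of action values

    def q_values(self, state):
        # The only information ever exposed to other agents.
        return np.asarray(self._q_fn(state))


def federated_action(agents, state, epsilon=0.1):
    """Hypothetical decision step: poll all black-box peers at the current
    state and act epsilon-greedily on their averaged Q-estimates."""
    q_mean = np.mean([a.q_values(state) for a in agents], axis=0)
    if np.random.rand() < epsilon:
        return np.random.randint(len(q_mean))  # explore
    return int(np.argmax(q_mean))              # exploit peer consensus
```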
Related papers
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach [2.3020018305241337]
This paper is the first to propose considering robust reinforcement learning (RRL) problems within positional differential game theory.
Namely, we prove that under Isaacs's condition, the same Q-function can be utilized as an approximate solution of both minimax and maximin Bellman equations.
We present the Isaacs Deep Q-Network algorithms and demonstrate their superiority compared to other baseline RRL and Multi-Agent RL algorithms in various environments.
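The key identity behind this claim can be written down schematically (the notation below is illustrative, not taken from the paper): under Isaacs's condition, the minimax and maximin Bellman equations for the game value share a single fixed point, so one Q-function can approximate both.

```latex
% Schematic Bellman equations for a zero-sum game with minimizer u and
% maximizer v; under Isaacs's condition both sides have the same fixed point:
V(s) \;=\; \min_{u}\,\max_{v}\,\bigl[\,r(s,u,v) + \gamma\, V\!\bigl(f(s,u,v)\bigr)\bigr]
     \;=\; \max_{v}\,\min_{u}\,\bigl[\,r(s,u,v) + \gamma\, V\!\bigl(f(s,u,v)\bigr)\bigr]
```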
arXiv Detail & Related papers (2024-05-03T12:21:43Z)
- CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening [24.230715083647173]
Existing FedRL methods typically aggregate agents' learning by averaging their value functions to improve performance.
We introduce the Convergence-AwarE SAmpling with scReening (CAESAR) aggregation scheme to enhance the learning of individual agents across varied MDPs.
We empirically validate our hypothesis and demonstrate the effectiveness of CAESAR in enhancing the learning efficiency of agents.
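The summary leaves CAESAR's sampling and screening rules unspecified; the sketch below illustrates only the generic ingredient, averaging value estimates while screening out peers whose estimates look too dissimilar. The `tol` rule and all names are assumptions, not CAESAR's actual criteria:

```python
import numpy as np

def screened_average(v_own, v_peers, tol=0.5):
    """Average the agent's value table with peer tables that pass a
    similarity screen, filtering out updates from dissimilar MDPs."""
    kept = [v for v in v_peers
            if np.linalg.norm(v - v_own) / np.sqrt(v_own.size) < tol]
    return np.mean([v_own] + kept, axis=0)
```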
arXiv Detail & Related papers (2024-03-29T13:05:59Z)
- Causal Coordinated Concurrent Reinforcement Learning [8.654978787096807]
We propose a novel algorithmic framework for data sharing and coordinated exploration for the purpose of learning more data-efficient and better performing policies under a concurrent reinforcement learning setting.
Our algorithm leverages a causal inference method, the Additive Noise Model - Mixture Model (ANM-MM), to extract the model parameters governing individual differences via independence enforcement.
We propose a new data sharing scheme based on a similarity measure of the extracted model parameters and demonstrate superior learning speeds on a set of autoregressive, pendulum and cart-pole swing-up tasks.
arXiv Detail & Related papers (2024-01-31T17:20:28Z)
- Federated Natural Policy Gradient Methods for Multi-task Reinforcement Learning [49.65958529941962]
Federated reinforcement learning (RL) enables collaborative decision making of multiple distributed agents without sharing local data trajectories.
In this work, we consider a multi-task setting, in which each agent has its own private reward function corresponding to different tasks.
We learn a globally optimal policy that maximizes the sum of the discounted total rewards of all the agents in a decentralized manner.
arXiv Detail & Related papers (2023-11-01T00:15:18Z)
- Every Parameter Matters: Ensuring the Convergence of Federated Learning with Dynamic Heterogeneous Models Reduction [22.567754688492414]
Cross-device Federated Learning (FL) faces significant challenges where low-end clients that could potentially make unique contributions are excluded from training large models due to their resource bottlenecks.
Recent research efforts have focused on model-heterogeneous FL, by extracting reduced-size models from the global model and applying them to local clients accordingly.
This paper presents a unifying framework for heterogeneous FL algorithms with online model extraction and provides a general convergence analysis for the first time.
arXiv Detail & Related papers (2023-10-12T19:07:58Z)
- The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond [44.43850105124659]
We consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on local data alone.
We provide sample complexity guarantees for both the synchronous and asynchronous variants of federated Q-learning.
We propose a novel federated Q-learning algorithm with importance averaging, giving larger weights to more frequently visited state-action pairs.
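The importance-averaging step is concrete enough to sketch. Assuming each agent reports its local Q-table together with per-pair visit counts (the interface below is hypothetical, not the paper's), the server can weight each estimate by relative visitation frequency:

```python
import numpy as np

def importance_average(q_locals, visit_counts, eps=1e-8):
    """Aggregate local Q-estimates, weighting each agent's Q(s, a) by its
    share of the total visits to (s, a) across agents."""
    q = np.stack(q_locals)                     # (num_agents, |S|, |A|)
    n = np.stack(visit_counts).astype(float)   # same shape
    w = n / np.maximum(n.sum(axis=0, keepdims=True), eps)
    return (w * q).sum(axis=0)
```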
arXiv Detail & Related papers (2023-05-18T04:18:59Z)
- Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm.
We propose a data heterogeneous-robust FL approach, FedGSP, to address this challenge.
We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
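Ingredient (b) is simple enough to sketch directly; a minimal version of UCB action selection over a Q-ensemble follows (variable names are illustrative), with ingredient (a) being the analogous move of down-weighting Bellman targets where the ensemble disagrees:

```python
import numpy as np

def ucb_action(q_ensemble, state, lam=1.0):
    """Choose the action maximizing mean + lam * std of the ensemble's
    Q-estimates, so high-uncertainty actions are explored preferentially."""
    qs = np.stack([q(state) for q in q_ensemble])  # (ensemble, num_actions)
    return int(np.argmax(qs.mean(axis=0) + lam * qs.std(axis=0)))
```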
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
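The iteration model described above reduces to a short sketch (the `grad` interface and the participation rate are assumptions for illustration, not the paper's setup):

```python
import numpy as np

def dynamic_fl_round(w_global, agents, participation=0.3, lr=0.01):
    """One iteration: a random subset of agents runs a local update from
    the current global model, and the server averages the results."""
    k = max(1, int(participation * len(agents)))
    active = np.random.choice(len(agents), size=k, replace=False)
    updates = [w_global - lr * agents[i].grad(w_global) for i in active]
    return np.mean(updates, axis=0)
```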
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.