Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis
- URL: http://arxiv.org/abs/2503.17454v1
- Date: Fri, 21 Mar 2025 18:06:28 GMT
- Title: Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis
- Authors: Ali Beikmohammadi, Sarit Khirirat, Peter Richtárik, Sindri Magnússon
- Abstract summary: Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents. In real-world applications, each agent may experience slightly different transition dynamics, leading to inherent model mismatches. We show that even moderate levels of information sharing can significantly mitigate environment-specific errors.
- Score: 55.13545823385091
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents. However, many existing FedRL algorithms assume that all agents operate in identical environments, which is often unrealistic. In real-world applications -- such as multi-robot teams, crowdsourced systems, and large-scale sensor networks -- each agent may experience slightly different transition dynamics, leading to inherent model mismatches. In this paper, we first establish linear convergence guarantees for single-agent temporal difference learning (TD(0)) in policy evaluation and demonstrate that under a perturbed environment, the agent suffers a systematic bias that prevents accurate estimation of the true value function. This result holds under both i.i.d. and Markovian sampling regimes. We then extend our analysis to the federated TD(0) (FedTD(0)) setting, where multiple agents -- each interacting with its own perturbed environment -- periodically share value estimates to collaboratively approximate the true value function of a common underlying model. Our theoretical results indicate the impact of model mismatch, network connectivity, and mixing behavior on the convergence of FedTD(0). Empirical experiments corroborate our theoretical gains, highlighting that even moderate levels of information sharing can significantly mitigate environment-specific errors.
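As a concrete illustration of the setting described in the abstract, the sketch below runs tabular TD(0) policy evaluation and a FedTD(0)-style variant in which each agent samples from its own perturbed copy of a common transition model and agents periodically average their value estimates. This is a minimal sketch, not the authors' implementation; the state-space size, perturbation scheme, step size, number of agents, and averaging period are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, GAMMA, ALPHA = 10, 0.9, 0.05  # illustrative state count, discount, TD step size

def base_mrp():
    """A common 'true' Markov reward process (policy fixed): transitions P, rewards r."""
    P = rng.random((N_STATES, N_STATES))
    P /= P.sum(axis=1, keepdims=True)
    return P, rng.random(N_STATES)

def perturb(P, eps):
    """Each agent observes a slightly different (perturbed) transition model."""
    Q = P + eps * rng.random(P.shape)
    return Q / Q.sum(axis=1, keepdims=True)

def td0_step(V, s, P, r):
    """One tabular TD(0) update from state s under transition matrix P."""
    s_next = rng.choice(N_STATES, p=P[s])
    V[s] += ALPHA * (r[s] + GAMMA * V[s_next] - V[s])
    return s_next

P_true, r = base_mrp()
V_true = np.linalg.solve(np.eye(N_STATES) - GAMMA * P_true, r)  # exact value function of the common model

# FedTD(0)-style loop: K agents run TD(0) on their own perturbed dynamics and
# periodically average their value estimates (the communication rounds).
K, EPS, SYNC_EVERY, T = 5, 0.05, 20, 5000
P_agents = [perturb(P_true, EPS) for _ in range(K)]
V_agents = [np.zeros(N_STATES) for _ in range(K)]
states = [0] * K

for t in range(T):
    for k in range(K):
        states[k] = td0_step(V_agents[k], states[k], P_agents[k], r)
    if (t + 1) % SYNC_EVERY == 0:
        V_avg = np.mean(V_agents, axis=0)
        V_agents = [V_avg.copy() for _ in range(K)]

print("error of the shared estimate:", np.linalg.norm(np.mean(V_agents, axis=0) - V_true))
```

In this toy setup, each agent's local estimate is biased by its own perturbed dynamics, while a smaller `SYNC_EVERY` (more frequent sharing) keeps the averaged estimate closer to the value function of the common model, mirroring the paper's observation that even moderate information sharing mitigates environment-specific errors.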
Related papers
- Can We Validate Counterfactual Estimations in the Presence of General Network Interference? [6.092214762701847]
We introduce a new framework enabling cross-validation for counterfactual estimation. At its core is our distribution-preserving network bootstrap method. We extend recent causal message-passing developments by incorporating heterogeneous unit-level characteristics.
arXiv Detail & Related papers (2025-02-03T06:51:04Z)
- Single-Loop Federated Actor-Critic across Heterogeneous Environments [9.276123988094698]
We study Single-loop Federated Actor-Critic (SFAC), where agents perform actor-critic learning in a two-level federated manner. We show that the convergence error of SFAC converges to a near-stationary point, with the extent proportional to environment heterogeneity.
arXiv Detail & Related papers (2024-12-19T06:13:59Z)
- Reducing Spurious Correlation for Federated Domain Generalization [15.864230656989854]
In open-world scenarios, global models may struggle to predict well on entirely new domain data captured by certain media.
Existing methods still rely on strong statistical correlations between samples and labels to address this issue.
We introduce FedCD, an overall optimization framework at both the local and global levels.
arXiv Detail & Related papers (2024-07-27T05:06:31Z)
- FedGen: Generalizable Federated Learning for Sequential Data [8.784435748969806]
In many real-world distributed settings, spurious correlations exist due to biases and data sampling issues.
We present a generalizable federated learning framework called FedGen, which allows clients to identify and distinguish between spurious and invariant features.
We show that FedGen results in models that achieve significantly better generalization and can outperform the accuracy of current federated learning approaches by over 24%.
arXiv Detail & Related papers (2022-11-03T15:48:14Z)
- Assaying Out-Of-Distribution Generalization in Transfer Learning [103.57862972967273]
We take a unified view of previous work, highlighting message discrepancies that we address empirically.
We fine-tune over 31k networks from nine different architectures in the many- and few-shot settings.
arXiv Detail & Related papers (2022-07-19T12:52:33Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- FedRAD: Federated Robust Adaptive Distillation [7.775374800382709]
Collaborative learning frameworks that aggregate model updates are typically vulnerable to model poisoning attacks from adversarial clients.
We propose a novel robust aggregation method, Federated Robust Adaptive Distillation (FedRAD), to detect adversaries and robustly aggregate local models.
The results show that FedRAD outperforms all other aggregators in the presence of adversaries, as well as in heterogeneous data distributions.
arXiv Detail & Related papers (2021-12-02T16:50:57Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where, at every iteration, a random subset of available agents performs local updates based on their data; a minimal sketch of this partial-participation pattern follows this entry.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
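To make the update pattern in the Dynamic Federated Learning entry above concrete, here is a minimal partial-participation sketch (not that paper's algorithm): at every round a random subset of agents takes a local gradient step on its own least-squares data and the server averages the participating models. The problem, participation rate, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, N_AGENTS, FRAC, LR, ROUNDS = 5, 20, 0.3, 0.1, 300  # illustrative constants

# Each agent holds private linear-regression data; raw data never leaves the agent.
w_true = rng.normal(size=DIM)
data = []
for _ in range(N_AGENTS):
    X = rng.normal(size=(50, DIM))
    data.append((X, X @ w_true + 0.1 * rng.normal(size=50)))

w_global = np.zeros(DIM)
for _ in range(ROUNDS):
    # At every iteration only a random subset of agents performs local updates.
    active = rng.choice(N_AGENTS, size=max(1, int(FRAC * N_AGENTS)), replace=False)
    local_models = []
    for k in active:
        X, y = data[k]
        grad = X.T @ (X @ w_global - y) / len(y)   # local gradient on agent k's data
        local_models.append(w_global - LR * grad)
    w_global = np.mean(local_models, axis=0)       # server aggregates the participants

print("distance to the true minimizer:", np.linalg.norm(w_global - w_true))
```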