Parameter-Free Federated TD Learning with Markov Noise in Heterogeneous Environments
- URL: http://arxiv.org/abs/2510.07436v1
- Date: Wed, 08 Oct 2025 18:36:30 GMT
- Title: Parameter-Free Federated TD Learning with Markov Noise in Heterogeneous Environments
- Authors: Ankur Naskar, Gugan Thoppe, Utsav Negi, Vijay Gupta,
- Abstract summary: Federated learning (FL) can dramatically speed up reinforcement learning by distributing exploration and training across multiple agents.<n>Existing results on TD learning achieving this rate require the algorithm to depend on unknown problem parameters.<n>We propose a two-timescale Federated Temporal Difference (FTD) learning with Polyak-Ruppert averaging.
- Score: 3.4165401459803335
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Federated learning (FL) can dramatically speed up reinforcement learning by distributing exploration and training across multiple agents. It can guarantee an optimal convergence rate that scales linearly in the number of agents, i.e., a rate of $\tilde{O}(1/(NT)),$ where $T$ is the iteration index and $N$ is the number of agents. However, when the training samples arise from a Markov chain, existing results on TD learning achieving this rate require the algorithm to depend on unknown problem parameters. We close this gap by proposing a two-timescale Federated Temporal Difference (FTD) learning with Polyak-Ruppert averaging. Our method provably attains the optimal $\tilde{O}(1/NT)$ rate in both average-reward and discounted settings--offering a parameter-free FTD approach for Markovian data. Although our results are novel even in the single-agent setting, they apply to the more realistic and challenging scenario of FL with heterogeneous environments.
Related papers
- Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized server (DFL) eliminates reliance on client-client architecture.<n>Non-smooth regularization is often incorporated into machine learning tasks.<n>We propose a novel novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - From Continual Learning to SGD and Back: Better Rates for Continual Linear Models [50.11453013647086]
We analyze the forgetting, i.e., loss on previously seen tasks, after $k$ iterations.<n>We develop novel last-iterate upper bounds in the realizable least squares setup.<n>We prove for the first time that randomization alone, with no task repetition, can prevent catastrophic in sufficiently long task sequences.
arXiv Detail & Related papers (2025-04-06T18:39:45Z) - Efficient Federated Learning against Byzantine Attacks and Data Heterogeneity via Aggregating Normalized Gradients [27.433334322019675]
Federated Learning (FL) enables clients to collaboratively train models without sharing raw data.<n>FL is vulnerable to Byzantine attacks and data heterogeneity iterations, which can severely degrade performance.<n>We propose effective Federated Normalized Gradients Algorithm (NGA)<n> Experimental results on benchmark convergence over existing methods.
arXiv Detail & Related papers (2024-08-18T16:50:39Z) - One-Shot Averaging for Distributed TD($λ$) Under Markov Sampling [18.437456273777407]
We show that we can achieve a linear speedup for TD($lambda$), a family of popular methods for policy evaluation, in the sense that $N$ agents can evaluate a policy $N$ times faster.
Notably, this speedup is achieved by one shot averaging,'' a procedure where the agents run TD($lambda$) with Markov sampling independently and only average their results after the final step.
arXiv Detail & Related papers (2024-03-13T18:37:16Z) - On the Convergence of Federated Averaging under Partial Participation for Over-parameterized Neural Networks [13.2844023993979]
Federated learning (FL) is a widely employed distributed paradigm for collaboratively machine learning models from multiple clients without sharing local data.
In this paper, we show that FedAvg converges to a global minimum at a global rate at a global focus.
arXiv Detail & Related papers (2023-10-09T07:56:56Z) - Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs [60.40452803295326]
We propose a new reward-free algorithm for learning linear Markov decision processes (MDPs)
At the core of our algorithm is uncertainty-weighted value-targeted regression with exploration-driven pseudo-reward.
We show that our algorithm only needs to explore $tilde O( d2varepsilon-2)$ episodes to find an $varepsilon$-optimal policy.
arXiv Detail & Related papers (2023-03-17T17:53:28Z) - Federated Learning Using Variance Reduced Stochastic Gradient for
Probabilistically Activated Agents [0.0]
This paper proposes an algorithm for Federated Learning (FL) with a two-layer structure that achieves both variance reduction and a faster convergence rate to an optimal solution in the setting where each agent has an arbitrary probability of selection in each iteration.
arXiv Detail & Related papers (2022-10-25T22:04:49Z) - Offline Reinforcement Learning at Multiple Frequencies [62.08749079914275]
We study how well offline reinforcement learning algorithms can accommodate data with a mixture of frequencies during training.
We present a simple yet effective solution that enforces consistency in the rate of $Q$-value updates to stabilize learning.
arXiv Detail & Related papers (2022-07-26T17:54:49Z) - On Reward-Free RL with Kernel and Neural Function Approximations:
Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z) - CFedAvg: Achieving Efficient Communication and Fast Convergence in
Non-IID Federated Learning [8.702106020664612]
Federated learning (FL) is a prevailing distributed learning paradigm, where a large number of workers jointly learn a model without sharing their training data.
High communication costs could arise in FL due to deep-scale (deep) learning models and bandwidth-connected connections.
We introduce a distributed communication datasets called CFedAvg for FL with non-biased SNR-constrained compressors.
arXiv Detail & Related papers (2021-06-14T04:27:19Z) - Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal
Sample Complexity [67.02490430380415]
We show that model-based MARL achieves a sample complexity of $tilde O(|S||B|(gamma)-3epsilon-2)$ for finding the Nash equilibrium (NE) value up to some $epsilon$ error.
We also show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge.
arXiv Detail & Related papers (2020-07-15T03:25:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.