The Gradient Convergence Bound of Federated Multi-Agent Reinforcement
Learning with Efficient Communication
- URL: http://arxiv.org/abs/2103.13026v2
- Date: Mon, 29 May 2023 12:53:01 GMT
- Title: The Gradient Convergence Bound of Federated Multi-Agent Reinforcement
Learning with Efficient Communication
- Authors: Xing Xu and Rongpeng Li and Zhifeng Zhao and Honggang Zhang
- Abstract summary: The paper considers independent reinforcement learning (IRL) for multi-agent collaborative decision-making in the paradigm of federated learning (FL).
FL generates excessive communication overheads between agents and a remote central server.
This paper proposes two advanced optimization schemes to improve the system's utility value.
- Score: 20.891460617583302
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paper considers independent reinforcement learning (IRL) for multi-agent
collaborative decision-making in the paradigm of federated learning (FL).
However, FL generates excessive communication overheads between agents and a
remote central server, especially when it involves a large number of agents or
iterations. Besides, due to the heterogeneity of independent learning
environments, multiple agents may undergo asynchronous Markov decision
processes (MDPs), which will affect the training samples and the model's
convergence performance. On top of the variation-aware periodic averaging (VPA)
method and the policy-based deep reinforcement learning (DRL) algorithm (i.e.,
proximal policy optimization (PPO)), this paper proposes two advanced
optimization schemes oriented toward stochastic gradient descent (SGD): 1) a
decay-based scheme that gradually decays the weights of a model's local
gradients as successive local updates progress, and 2) a consensus-based
scheme that represents the agents as a graph and studies the impact of
exchanging a model's local gradients among nearby agents from an algebraic
connectivity perspective. This paper also provides novel convergence
guarantees for both
developed schemes, and demonstrates their superior effectiveness and efficiency
in improving the system's utility value through theoretical analyses and
simulation results.
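To make the two schemes concrete, below is a minimal numpy sketch of the underlying ideas only, not the paper's actual algorithm: a decay factor that shrinks the contribution of later local gradients within one averaging period, and a single consensus (gossip) round over a communication graph. The function names, the decay rule, and the step sizes (`beta`, `lr`, `eps`) are illustrative assumptions.

```python
import numpy as np

def decay_weight(tau, beta=0.9):
    """Decay-based scheme (sketch): down-weight the tau-th successive local
    gradient so later local updates contribute less before periodic averaging."""
    return beta ** tau

def local_period(theta, grads, lr=0.1, beta=0.9):
    """One agent's period of local SGD with decayed gradient weights."""
    for tau, g in enumerate(grads):
        theta = theta - lr * decay_weight(tau, beta) * g
    return theta

def consensus_step(models, adjacency, eps=0.2):
    """Consensus-based scheme (sketch): each agent mixes its model with its
    graph neighbors' models; how fast information spreads is governed by the
    Laplacian's algebraic connectivity (its second-smallest eigenvalue)."""
    models = np.asarray(models, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    # x <- (I - eps * L) x; stable for eps < 2 / lambda_max(L).
    return models - eps * laplacian @ models

# Toy usage: three agents on a line graph, scalar model parameters for brevity.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
models = [local_period(0.0, rng.normal(size=5)) for _ in range(3)]
print(consensus_step(models, adjacency))
```

In this sketch, a denser communication graph (larger algebraic connectivity) lets each agent's locally computed update reach the others in fewer exchange rounds, which is the trade-off the consensus-based scheme analyzes.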
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Multi-agent Off-policy Actor-Critic Reinforcement Learning for Partially Observable Environments [30.280532078714455]
This study proposes the use of a social learning method to estimate a global state within a multi-agent off-policy actor-critic algorithm for reinforcement learning.
We show that the difference between final outcomes, obtained when the global state is fully observed versus estimated through the social learning method, is $\varepsilon$-bounded when an appropriate number of iterations of social learning updates are implemented.
arXiv Detail & Related papers (2024-07-06T06:51:14Z) - Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment [12.122881147337505]
We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning.
In our approach, agents communicate only with the centralized planner to make decentralized decisions online.
We employ multi-step value convergence in multi-agent reinforcement learning to enhance training efficiency.
arXiv Detail & Related papers (2023-10-25T14:21:22Z) - Exact Subspace Diffusion for Decentralized Multitask Learning [17.592204922442832]
Distributed strategies for multitask learning induce relationships between agents in a more nuanced manner, and encourage collaboration without enforcing consensus.
We develop a generalization of the exact diffusion algorithm for subspace constrained multitask learning over networks, and derive an accurate expression for its mean-squared deviation.
We verify numerically the accuracy of the predicted performance expressions, as well as the improved performance of the proposed approach over alternatives based on approximate projections.
arXiv Detail & Related papers (2023-04-14T19:42:19Z) - IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint
Multi-Agent Trajectory Prediction [73.25645602768158]
IPCC-TP is a novel relevance-aware module based on the Incremental Pearson Correlation Coefficient that improves multi-agent interaction modeling.
Our module can be conveniently embedded into existing multi-agent prediction methods to extend original motion distribution decoders.
arXiv Detail & Related papers (2023-03-01T15:16:56Z) - Towards Global Optimality in Cooperative MARL with the Transformation
And Distillation Framework [26.612749327414335]
Decentralized execution is one core demand in cooperative multi-agent reinforcement learning (MARL).
In this paper, we theoretically analyze two common classes of algorithms with decentralized policies -- multi-agent policy gradient methods and value-decomposition methods.
We show that TAD-PPO can theoretically perform optimal policy learning in the finite multi-agent MDPs and shows significant outperformance on a large set of cooperative multi-agent tasks.
arXiv Detail & Related papers (2022-07-12T06:59:13Z) - Communication-Efficient Consensus Mechanism for Federated Reinforcement
Learning [20.891460617583302]
We show that FL can improve the policy performance of IRL in terms of training efficiency and stability.
To reach a good balance between improving the model's convergence performance and reducing the required communication and computation overheads, this paper proposes a system utility function.
arXiv Detail & Related papers (2022-01-30T04:04:24Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.