Approximate Global Convergence of Independent Learning in Multi-Agent Systems
- URL: http://arxiv.org/abs/2405.19811v1
- Date: Thu, 30 May 2024 08:20:34 GMT
- Title: Approximate Global Convergence of Independent Learning in Multi-Agent Systems
- Authors: Ruiyang Jin, Zaiwei Chen, Yiheng Lin, Jie Song, Adam Wierman
- Abstract summary: We study two representative algorithms, independent $Q$-learning and independent natural actor-critic, within value-based and policy-based frameworks.
The results imply a sample complexity of $\tilde{\mathcal{O}}(\epsilon^{-2})$ up to an error term that characterizes the fundamental limit of IL in achieving global convergence.
- Score: 19.958920582022664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Independent learning (IL), despite being a popular approach in practice to achieve scalability in large-scale multi-agent systems, usually lacks global convergence guarantees. In this paper, we study two representative algorithms, independent $Q$-learning and independent natural actor-critic, within value-based and policy-based frameworks, and provide the first finite-sample analysis for approximate global convergence. The results imply a sample complexity of $\tilde{\mathcal{O}}(\epsilon^{-2})$ up to an error term that captures the dependence among agents and characterizes the fundamental limit of IL in achieving global convergence. To establish the result, we develop a novel approach for analyzing IL by constructing a separable Markov decision process (MDP) for convergence analysis and then bounding the gap due to model difference between the separable MDP and the original one. Moreover, we conduct numerical experiments using a synthetic MDP and an electric vehicle charging example to verify our theoretical findings and to demonstrate the practical applicability of IL.
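To make the idea of independent $Q$-learning concrete, here is a minimal tabular sketch in which each agent updates a $Q$-table from its own local state, action, and reward only. It is not the authors' algorithm or experimental setup: the synthetic local transition kernels `P`, the rewards `R`, the `coupling` parameter that injects dependence among agents, and all hyperparameter values are hypothetical choices made for illustration.

```python
import numpy as np


def independent_q_learning(n_agents=2, n_states=3, n_actions=2, steps=5000,
                           alpha=0.1, gamma=0.9, eps=0.1, coupling=0.1, seed=0):
    """Tabular independent Q-learning on a hypothetical synthetic model.

    Each agent i keeps its own table Q[i] over (local state, local action)
    and never observes the other agents' states, actions, or rewards.
    """
    rng = np.random.default_rng(seed)
    # Assumed model: P[i, s, a] is a distribution over agent i's next local state.
    P = rng.dirichlet(np.ones(n_states), size=(n_agents, n_states, n_actions))
    # Assumed local rewards in [0, 1).
    R = rng.random((n_agents, n_states, n_actions))
    Q = np.zeros((n_agents, n_states, n_actions))
    s = rng.integers(n_states, size=n_agents)

    for _ in range(steps):
        # Epsilon-greedy action selection from each agent's own Q-table.
        greedy = Q[np.arange(n_agents), s].argmax(axis=1)
        explore = rng.random(n_agents) < eps
        a = np.where(explore, rng.integers(n_actions, size=n_agents), greedy)

        s_next = np.empty(n_agents, dtype=int)
        for i in range(n_agents):
            if rng.random() < coupling:
                # Dependence among agents (illustrative): with small probability,
                # agent i's next state copies its neighbor's current state.
                s_next[i] = s[(i + 1) % n_agents]
            else:
                s_next[i] = rng.choice(n_states, p=P[i, s[i], a[i]])
            # Standard Q-learning update using only local information.
            td_target = R[i, s[i], a[i]] + gamma * Q[i, s_next[i]].max()
            Q[i, s[i], a[i]] += alpha * (td_target - Q[i, s[i], a[i]])
        s = s_next
    return Q


if __name__ == "__main__":
    Q = independent_q_learning()
    print(Q.shape)  # one (n_states, n_actions) table per agent
```

Setting `coupling=0.0` makes the agents fully decoupled in this toy model, which loosely mirrors the separable MDP used as an analysis device in the abstract; larger values increase the dependence that each agent ignores.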
Related papers
- Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning [20.491176017183044]
This paper tackles the multi-objective reinforcement learning (MORL) problem.
It introduces an innovative actor-critic algorithm named MOAC which finds a policy by iteratively making trade-offs among conflicting reward signals.
arXiv Detail & Related papers (2024-05-05T23:52:57Z) - Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z) - Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for general-sum Markov Games (MGs) under general function approximation.
We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs.
We show that our algorithm achieves sublinear regret comparable to that of existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z) - Partially Observable Multi-Agent Reinforcement Learning with Information Sharing [33.145861021414184]
We study provable multi-agent reinforcement learning (RL) in the general framework of partially observable stochastic games (POSGs).
We advocate leveraging potential information-sharing among agents, a common practice in empirical multi-agent RL and a standard model for multi-agent control systems with communications.
arXiv Detail & Related papers (2023-08-16T23:42:03Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z) - Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z) - GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample-efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z) - Posterior-Aided Regularization for Likelihood-Free Inference [23.708122045184698]
Posterior-Aided Regularization (PAR) is applicable to learning the density estimator, regardless of the model structure.
We provide a unified estimation method for PAR that estimates both the reverse KL term and the mutual information term with a single neural network.
arXiv Detail & Related papers (2021-02-15T16:59:30Z) - Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide a convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.