Related papers: Approximate Global Convergence of Independent Learning in Multi-Agent Systems

Approximate Global Convergence of Independent Learning in Multi-Agent Systems

URL: http://arxiv.org/abs/2405.19811v1
Date: Thu, 30 May 2024 08:20:34 GMT
Title: Approximate Global Convergence of Independent Learning in Multi-Agent Systems
Authors: Ruiyang Jin, Zaiwei Chen, Yiheng Lin, Jie Song, Adam Wierman,
Abstract summary: We study two representative algorithms, independent $Q$-learning and independent natural actor-critic, within value-based and policy-based frameworks. The results imply a sample complexity of $tildemathcalO(epsilon-2)$ up to an error term that characterizes the fundamental limit of IL in achieving global convergence.
Score: 19.958920582022664
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Independent learning (IL), despite being a popular approach in practice to achieve scalability in large-scale multi-agent systems, usually lacks global convergence guarantees. In this paper, we study two representative algorithms, independent $Q$-learning and independent natural actor-critic, within value-based and policy-based frameworks, and provide the first finite-sample analysis for approximate global convergence. The results imply a sample complexity of $\tilde{\mathcal{O}}(\epsilon^{-2})$ up to an error term that captures the dependence among agents and characterizes the fundamental limit of IL in achieving global convergence. To establish the result, we develop a novel approach for analyzing IL by constructing a separable Markov decision process (MDP) for convergence analysis and then bounding the gap due to model difference between the separable MDP and the original one. Moreover, we conduct numerical experiments using a synthetic MDP and an electric vehicle charging example to verify our theoretical findings and to demonstrate the practical applicability of IL.

Related papers

Cross-Fitting-Free Debiased Machine Learning with Multiway Dependence [0.0]
This paper develops a theory for two-step debiased machine learning (DML) estimators in generalised method of moments (GMM) models with general multiway clustered dependence, without relying on cross-fitting.<n>We show that valid inference can be achieved without sample splitting by combining Neyman-orthogonal moment conditions with a localisation-based empirical approach, allowing for an arbitrary number of clustering dimensions.
arXiv Detail & Related papers (2026-02-11T20:09:23Z)
Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning [50.81240969750462]
We present first finite-sample analysis of policy evaluation in robust average Markov Decision Processes (PMDs)<n>We show that the robust Bellman operator is a contraction under a carefully constructed semi-norm, and developing a framework with controlled bias.<n>Our method achieves the order-optimal sample complexity of $tildemathcalO(epsilon-2)$ for robust policy evaluation and robust average reward estimation.
arXiv Detail & Related papers (2025-02-24T03:55:09Z)
Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning [20.491176017183044]
This paper tackles the multi-objective reinforcement learning (MORL) problem. It introduces an innovative actor-critic algorithm named MOAC which finds a policy by iteratively making trade-offs among conflicting reward signals.
arXiv Detail & Related papers (2024-05-05T23:52:57Z)
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation. We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for the general-sum Markov Games (MGs) under the general function approximation. We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs. We show that our algorithm provides comparable sublinear regret to the existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z)
Partially Observable Multi-Agent Reinforcement Learning with Information Sharing [33.145861021414184]
We study provable multi-agent reinforcement learning (RL) in the general framework of partially observable games (POSGs) We advocate leveraging the potential emph information-sharing among agents, a common practice in empirical multi-agent RL, and a standard model for multi-agent control systems with communications.
arXiv Detail & Related papers (2023-08-16T23:42:03Z)
On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees. We study this question in a general framework for interactive decision making with multiple agents. We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment. Standard sampling-based algorithms can be used to search for solutions in the robots' joint space. We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods. We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z)
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making. We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation. We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z)
Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach [128.62787284435007]
We propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture. We prove that MF-PPO attains the globally optimal policy at a sublinear rate of convergence. In particular, we show that the inductive bias introduced by the permutation-invariant neural architecture enables MF-PPO to outperform existing competitors.
arXiv Detail & Related papers (2021-05-18T04:35:41Z)
Posterior-Aided Regularization for Likelihood-Free Inference [23.708122045184698]
Posterior-Aided Regularization (PAR) is applicable to learning the density estimator, regardless of the model structure. We provide a unified estimation method of PAR to estimate both reverse KL term and mutual information term with a single neural network.
arXiv Detail & Related papers (2021-02-15T16:59:30Z)
Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide convergence guarantee for the general multi-step MAML algorithm. In particular, our results suggest that an inner-stage step needs to be chosen inversely proportional to $N$ of inner-stage steps in order for $N$ MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.