Partially Observable Multi-agent RL with (Quasi-)Efficiency: The
Blessing of Information Sharing
- URL: http://arxiv.org/abs/2308.08705v2
- Date: Thu, 29 Feb 2024 04:25:14 GMT
- Title: Partially Observable Multi-agent RL with (Quasi-)Efficiency: The
Blessing of Information Sharing
- Authors: Xiangyu Liu, Kaiqing Zhang
- Abstract summary: We study provable multi-agent reinforcement learning (MARL) in the general framework of partially observable stochastic games (POSGs).
We advocate leveraging the potential \emph{information-sharing} among agents, a common practice in empirical MARL, and a standard model for multi-agent control systems with communications.
- Score: 39.15744391171533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study provable multi-agent reinforcement learning (MARL) in the general
framework of partially observable stochastic games (POSGs). To circumvent the
known hardness results and the use of computationally intractable oracles, we
advocate leveraging the potential \emph{information-sharing} among agents, a
common practice in empirical MARL, and a standard model for multi-agent control
systems with communications. We first establish several computation complexity
results to justify the necessity of information-sharing, as well as the
observability assumption that has enabled quasi-efficient single-agent RL with
partial observations, for computational efficiency in solving POSGs. We then
propose to further \emph{approximate} the shared common information to
construct an \emph{approximate model} of the POSG, in which planning an approximate
equilibrium (in terms of solving the original POSG) can be quasi-efficient,
i.e., of quasi-polynomial-time, under the aforementioned assumptions.
Furthermore, we develop a partially observable MARL algorithm that is both
statistically and computationally quasi-efficient. We hope our study may open
up the possibilities of leveraging and even designing different
\emph{information structures}, for developing both sample- and
computation-efficient partially observable MARL.
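To make the common-information approximation above concrete, here is a minimal Python sketch (our own illustration; the class and names are hypothetical, not from the paper) that compresses the shared history into a fixed-length window, the kind of finite approximate information state on which an approximate model can be built and planned over:

```python
from collections import deque

# Minimal sketch (hypothetical, not the paper's construction): approximate the
# shared common information by retaining only the most recent L joint
# observations, yielding a compact "approximate information state" for the
# planner instead of the full, ever-growing shared history.

class TruncatedCommonInformation:
    def __init__(self, window_length):
        self.buffer = deque(maxlen=window_length)  # fixed-size shared window

    def update(self, joint_observation):
        """Record the newly shared joint observation (one entry per agent)."""
        self.buffer.append(tuple(joint_observation))

    def state(self):
        """Compressed common information the approximate model conditions on."""
        return tuple(self.buffer)

# Usage: two agents sharing observations at every step.
ci = TruncatedCommonInformation(window_length=3)
for joint_obs in [("o1", "o2"), ("o3", "o4"), ("o5", "o6"), ("o7", "o8")]:
    ci.update(joint_obs)
print(ci.state())  # only the last 3 joint observations are retained
```

Window truncation is just one simple compression scheme; any map from the shared history to a compact statistic plays the same role in this sketch.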
Related papers
- Approximate Global Convergence of Independent Learning in Multi-Agent Systems [19.958920582022664]
We study two representative algorithms, independent $Q$-learning and independent natural actor-critic, within value-based and policy-based frameworks.
The results imply a sample complexity of $\tilde{\mathcal{O}}(\epsilon^{-2})$ up to an error term that characterizes the fundamental limit of independent learning (IL) in achieving global convergence.
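As a rough illustration of the independent-learning setup in this entry (a sketch under our own simplifications, not the paper's code), each agent runs a standard tabular $Q$-learning update on its own table and treats the other agents as part of the environment:

```python
import numpy as np

# Hypothetical sketch (not the paper's code): independent tabular Q-learning.
# Each agent keeps its own Q-table and updates it from its own experience,
# treating the other agents as part of the (nonstationary) environment.

def independent_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-learning step on a single agent's table Q[s, a]."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

n_states, n_actions, n_agents = 5, 3, 2
Qs = [np.zeros((n_states, n_actions)) for _ in range(n_agents)]

# Each agent i updates from its own local transition (s, a_i, r, s').
transitions = [(0, 1, 1.0, 2), (0, 2, 0.5, 3)]
for i, (s, a, r, s_next) in enumerate(transitions):
    independent_q_update(Qs[i], s, a, r, s_next)
print(Qs[0][0, 1])  # agent 0's updated value for (s=0, a=1)
```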
arXiv Detail & Related papers (2024-05-30T08:20:34Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for the general-sum Markov Games (MGs) under the general function approximation.
We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs.
We show that our algorithm provides comparable sublinear regret to the existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity of multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
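A toy sketch of the factorization idea (our simplification; the overlap test and greedy grouping are hypothetical stand-ins, not the paper's construction): robots whose workspaces never interact can be planned separately, shrinking the search from the joint space to a product of smaller ones.

```python
# Hypothetical illustration of factorized planning: group robots by workspace
# overlap; each group can then be planned independently by a standard
# sampling-based algorithm over a much smaller joint space.

def workspaces_overlap(a, b):
    """Axis-aligned bounding boxes (xmin, xmax, ymin, ymax) overlap test."""
    return not (a[1] < b[0] or b[1] < a[0] or a[3] < b[2] or b[3] < a[2])

def factorize(robots):
    """Greedily group robots whose workspaces overlap (a full implementation
    would also merge transitively overlapping groups)."""
    groups = []
    for name, bounds in robots.items():
        for group in groups:
            if any(workspaces_overlap(bounds, robots[other]) for other in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

robots = {"r1": (0, 1, 0, 1), "r2": (5, 6, 5, 6), "r3": (0.5, 1.5, 0.5, 1.5)}
print(factorize(robots))  # r1 and r3 overlap -> planned jointly; r2 alone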
arXiv Detail & Related papers (2023-04-01T15:50:18Z)
- A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning [132.45959478064736]
We propose a general framework that unifies model-based and model-free reinforcement learning.
We propose a novel estimation function with decomposable structural properties for optimization-based exploration.
Under our framework, a new sample-efficient algorithm namely OPtimization-based ExploRation with Approximation (OPERA) is proposed.
arXiv Detail & Related papers (2022-09-30T17:59:16Z)
- Centralized Model and Exploration Policy for Multi-Agent RL [13.661446184763117]
Reinforcement learning in partially observable, fully cooperative multi-agent settings (Dec-POMDPs) can be used to address many real-world challenges.
Current RL algorithms for Dec-POMDPs suffer from poor sample complexity.
We propose a model-based algorithm, MARCO, and evaluate it in three cooperative communication tasks, where it improves sample efficiency by up to 20x.
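A generic model-based sketch in the spirit of this entry (our illustration; this is not MARCO itself): fit a centralized dynamics model from pooled multi-agent experience, then roll out inside the learned model rather than the real environment to reduce the number of real samples needed.

```python
import random
from collections import Counter, defaultdict

# Hypothetical sketch of a generic model-based loop (not MARCO's code).

def fit_tabular_model(transitions):
    """Estimate P(s' | s, joint_action) from empirical transition counts."""
    counts = defaultdict(Counter)
    for s, joint_a, s_next in transitions:
        counts[(s, joint_a)][s_next] += 1
    return {sa: {s2: n / sum(c.values()) for s2, n in c.items()}
            for sa, c in counts.items()}

def model_rollout_step(model, s, joint_a, rng=random):
    """Sample a next state from the learned model (a 'virtual' env step)."""
    states, probs = zip(*model[(s, joint_a)].items())
    return rng.choices(states, weights=probs)[0]

data = [(0, ("a", "b"), 1), (0, ("a", "b"), 1), (0, ("a", "b"), 2)]
model = fit_tabular_model(data)
print(model_rollout_step(model, 0, ("a", "b")))  # simulated next state
```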
arXiv Detail & Related papers (2021-07-14T00:34:08Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.