Sample-Efficient Multi-Agent RL: An Optimization Perspective
- URL: http://arxiv.org/abs/2310.06243v1
- Date: Tue, 10 Oct 2023 01:39:04 GMT
- Title: Sample-Efficient Multi-Agent RL: An Optimization Perspective
- Authors: Nuoya Xiong, Zhihan Liu, Zhaoran Wang, Zhuoran Yang
- Abstract summary: We study multi-agent reinforcement learning (MARL) for general-sum Markov Games (MGs) under general function approximation.
We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs.
We show that our algorithm achieves sublinear regret comparable to existing works.
- Score: 103.35353196535544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study multi-agent reinforcement learning (MARL) for general-sum
Markov Games (MGs) under general function approximation. In order to identify
minimal assumptions for sample-efficient learning, we introduce a novel
complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for
general-sum MGs. Using this measure, we propose the first unified algorithmic
framework that ensures sample efficiency in learning Nash Equilibrium, Coarse
Correlated Equilibrium, and Correlated Equilibrium for both model-based and
model-free MARL problems with low MADC. We also show that our algorithm
achieves sublinear regret comparable to existing works. Moreover, our
algorithm combines an equilibrium-solving oracle with a single objective
optimization subprocedure that solves for the regularized payoff of each
deterministic joint policy, which avoids solving constrained optimization
problems within data-dependent constraints (Jin et al. 2020; Wang et al. 2023)
or executing sampling procedures with complex multi-objective optimization
problems (Foster et al. 2023), thus being more amenable to empirical
implementation.
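
To make the division of labor described in the abstract more concrete, the following minimal Python sketch is a hypothetical illustration (not the paper's algorithm): a single-objective routine estimates a regularized payoff for every deterministic joint policy, and an equilibrium-solving oracle is then applied to the resulting payoff table. The function names, the variance-based regularizer, and the crude coarse-correlated-equilibrium stand-in are all assumptions made for illustration.

```python
# Hypothetical sketch, not the paper's algorithm: a single-objective subprocedure
# estimates a regularized payoff for each deterministic joint policy, and an
# equilibrium-solving oracle is applied to the resulting payoff table.
import itertools
import numpy as np

rng = np.random.default_rng(0)

def regularized_payoff(joint_policy, data, reg=0.1):
    """Single-objective surrogate: mean per-agent return of one deterministic joint
    policy, penalized by its empirical variance (an illustrative regularizer)."""
    returns = data[joint_policy]                     # shape: (episodes, num_agents)
    return returns.mean(axis=0) - reg * returns.var(axis=0)

def cce_oracle(payoff_table):
    """Toy equilibrium-solving oracle: uniform distribution over joint policies whose
    estimated payoff vector is not strictly dominated (a crude CCE stand-in)."""
    policies = list(payoff_table)
    values = np.array([payoff_table[p] for p in policies])
    undominated = [p for p, v in zip(policies, values)
                   if not any(np.all(w > v) for w in values)]
    return {p: 1.0 / len(undominated) for p in undominated}

# Two agents with two actions each; fake episodic return data for every joint policy.
actions = (0, 1)
data = {jp: rng.normal(size=(20, 2)) for jp in itertools.product(actions, actions)}

payoff_table = {jp: regularized_payoff(jp, data) for jp in data}
print(cce_oracle(payoff_table))
```

The point of the sketch is only the two-stage structure: each inner problem is a plain single-objective optimization over data for one deterministic joint policy, so no constrained or multi-objective subroutine is needed before the oracle is invoked.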
Related papers
- Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling [0.0]
This study proposes a different approach that integrates gradient-based updates through continuous relaxation, combined with Quasi-Quantum Annealing (QQA).
Numerical experiments demonstrate that our method is a competitive general-purpose solver, achieving performance comparable to iSCO and learning-based solvers.
arXiv Detail & Related papers (2024-09-02T12:55:27Z)
- Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models [17.19004913553654]
Multi-objective Bayesian optimization (MOBO) has shown promising performance on various expensive multi-objective optimization problems (EMOPs).
We propose a novel Composite Diffusion Model based Pareto Set Learning algorithm, namely CDM-PSL, for expensive MOBO.
Our proposed algorithm attains superior performance compared with various state-of-the-art MOBO algorithms.
arXiv Detail & Related papers (2024-05-14T14:55:57Z)
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibrium.
We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Efficient Alternating Minimization Solvers for Wyner Multi-View Unsupervised Learning [0.0]
We propose two novel formulations that enable the development of computationally efficient solvers based on the alternating minimization principle.
The proposed solvers offer computational efficiency, theoretical convergence guarantees, local-minima complexity that scales with the number of views, and exceptional accuracy compared with state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-28T10:17:51Z)
- Saddle Point Optimization with Approximate Minimization Oracle and its Application to Robust Berthing Control [7.347989843033034]
We propose an approach to saddle point optimization relying only on oracles that solve minimization problems approximately.
We analyze its convergence property on a strongly convex--concave problem and show its linear convergence toward the global min-max saddle point.
An implementation of the developed approach using the (1+1)-CMA-ES as the minimization oracle, namely Adversarial-CMA-ES, is shown to outperform several existing approaches on test problems.
arXiv Detail & Related papers (2021-05-25T00:08:47Z)
- Compositionality of Linearly Solvable Optimal Control in Networked Multi-Agent Systems [27.544923751902807]
We discuss the methodology of generalizing the optimal control law from learned component tasks to unlearned composite tasks in Multi-Agent Systems (MASs).
The proposed approach achieves both the compositionality and optimality of control actions simultaneously within the cooperative MAS framework in both discrete- and continuous-time in a sample-efficient manner.
arXiv Detail & Related papers (2020-09-28T20:21:48Z)
- Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide convergence guarantees for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)
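
As a toy numerical illustration of the inverse scaling highlighted in the entry above (the constant 0.5 is an arbitrary choice, not a value from the paper), choosing the inner-stage step size proportional to 1/N keeps the total inner-loop adaptation bounded as the number of inner steps grows:

```python
# Toy illustration only: an inner-stage step size chosen inversely proportional to the
# number N of inner steps keeps the cumulative adaptation alpha_inner * N constant.
for N in (1, 5, 10, 50):
    alpha_inner = 0.5 / N  # the 0.5 is an arbitrary illustrative constant
    print(f"N={N:>2}  alpha_inner={alpha_inner:.3f}  alpha_inner*N={alpha_inner * N:.2f}")
```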