Anytime-Constrained Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2410.23637v1
- Date: Thu, 31 Oct 2024 05:07:01 GMT
- Title: Anytime-Constrained Multi-Agent Reinforcement Learning
- Authors: Jeremy McMahan, Xiaojin Zhu,
- Abstract summary: We introduce anytime constraints to the multi-agent setting with the corresponding solution being anytime-constrained equilibrium (ACE)
We present a comprehensive theory of anytime-constrained Markov games, which includes a computational characterization of feasible policies.
We also first theory of efficient computation for action-constrained Markov games, which may be of independent interest.
- Score: 6.981971551979697
- License:
- Abstract: We introduce anytime constraints to the multi-agent setting with the corresponding solution concept being anytime-constrained equilibrium (ACE). Then, we present a comprehensive theory of anytime-constrained Markov games, which includes (1) a computational characterization of feasible policies, (2) a fixed-parameter tractable algorithm for computing ACE, and (3) a polynomial-time algorithm for approximately computing feasible ACE. Since computing a feasible policy is NP-hard even for two-player zero-sum games, our approximation guarantees are the best possible under worst-case analysis. We also develop the first theory of efficient computation for action-constrained Markov games, which may be of independent interest.
Related papers
- Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time [1.223779595809275]
Our algorithm efficiently computes near-optimal deterministic policies for constrained reinforcement learning problems.
Our work answers three open questions spanning two long-standing lines of research.
arXiv Detail & Related papers (2024-05-23T05:27:51Z) - Independent Learning in Constrained Markov Potential Games [19.083595175045073]
Constrained Markov games offer a formal framework for modeling multi-agent reinforcement learning problems.
We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria.
arXiv Detail & Related papers (2024-02-27T20:57:35Z) - Hardness of Independent Learning and Sparse Equilibrium Computation in
Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z) - Efficiently Computing Nash Equilibria in Adversarial Team Markov Games [19.717850955051837]
We introduce a class of games in which a team identically players is competing against an adversarial player.
This setting allows for a unifying treatment of zero-sum Markov games potential games.
Our main contribution is the first algorithm for computing stationary $epsilon$-approximate Nash equilibria in adversarial team Markov games.
arXiv Detail & Related papers (2022-08-03T16:41:01Z) - Provably Efficient Fictitious Play Policy Optimization for Zero-Sum
Markov Games with Structured Transitions [145.54544979467872]
We propose and analyze new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions.
We prove tight $widetildemathcalO(sqrtK)$ regret bounds after $K$ episodes in a two-agent competitive game scenario.
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization.
arXiv Detail & Related papers (2022-07-25T18:29:16Z) - The Complexity of Markov Equilibrium in Stochastic Games [44.77547027158141]
We show that computing approximate stationary Markov coarse correlated equilibria (CCE) in general-sum games is computationally intractable.
Our results stand in sharp contrast to normal-form games where exact CCEs are efficiently computable.
arXiv Detail & Related papers (2022-04-08T10:51:01Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the lower bound of $mathcal O (1/epsilon)$ for all complexity optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Efficient semidefinite-programming-based inference for binary and
multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Public Bayesian Persuasion: Being Almost Optimal and Almost Persuasive [57.47546090379434]
We study the public persuasion problem in the general setting with: (i) arbitrary state spaces; (ii) arbitrary action spaces; (iii) arbitrary sender's utility functions.
We provide a quasi-polynomial time bi-criteria approximation algorithm for arbitrary public persuasion problems that, in specific settings, yields a QPTAS.
arXiv Detail & Related papers (2020-02-12T18:59:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.