Anytime-Constrained Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2410.23637v1
- Date: Thu, 31 Oct 2024 05:07:01 GMT
- Title: Anytime-Constrained Multi-Agent Reinforcement Learning
- Authors: Jeremy McMahan, Xiaojin Zhu
- Abstract summary: We introduce anytime constraints to the multi-agent setting with the corresponding solution concept being anytime-constrained equilibrium (ACE).
We present a comprehensive theory of anytime-constrained Markov games, which includes a computational characterization of feasible policies.
We also develop the first theory of efficient computation for action-constrained Markov games, which may be of independent interest.
- Score: 6.981971551979697
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce anytime constraints to the multi-agent setting with the corresponding solution concept being anytime-constrained equilibrium (ACE). Then, we present a comprehensive theory of anytime-constrained Markov games, which includes (1) a computational characterization of feasible policies, (2) a fixed-parameter tractable algorithm for computing ACE, and (3) a polynomial-time algorithm for approximately computing feasible ACE. Since computing a feasible policy is NP-hard even for two-player zero-sum games, our approximation guarantees are the best possible under worst-case analysis. We also develop the first theory of efficient computation for action-constrained Markov games, which may be of independent interest.
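For intuition, here is a minimal formalization of an anytime constraint, adapted from the authors' single-agent formulation; the per-step cost $c_i$, budget $B_i$, and horizon $H$ are notation assumed here for illustration. A joint policy $\pi$ is anytime-feasible if every agent's accumulated cost respects its budget at every step almost surely:

$$\Pr_{\pi}\left(\sum_{t=1}^{k} c_i(s_t, a_t) \le B_i \ \text{ for all } k \in [H] \text{ and all agents } i\right) = 1.$$

Roughly, an ACE is then an equilibrium within this feasible set: no agent can profitably deviate to another policy that keeps the joint play anytime-feasible.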
 
      
Related papers
- Efficiently Solving Turn-Taking Stochastic Games with Extensive-Form Correlation [52.16923999754027]
 We give an algorithm for computing a Stackelberg extensive-form correlated equilibrium.
We also give an efficient algorithm for approximately computing an optimal extensive-form correlated equilibrium.
Our algorithm for approximately optimal EFCE is, to our knowledge, the first that achieves 3 desiderata simultaneously.
 arXiv  Detail & Related papers  (2024-12-22T09:12:05Z)
- Barriers to Welfare Maximization with No-Regret Learning [68.66209476382213]
 We prove lower bounds for computing a near-optimal $T$-sparse CCE.
In particular, we show that the inapproximability of maximum clique precludes attaining any non-trivial sparsity in time.
 arXiv  Detail & Related papers  (2024-11-04T00:34:56Z)
- Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time [1.223779595809275]
 Our algorithm efficiently computes near-optimal deterministic policies for constrained reinforcement learning problems.
Our work answers three open questions spanning two long-standing lines of research.
 arXiv  Detail & Related papers  (2024-05-23T05:27:51Z)
- Independent Learning in Constrained Markov Potential Games [19.083595175045073]
 Constrained Markov games offer a formal framework for modeling multi-agent reinforcement learning problems.
We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria.
 arXiv  Detail & Related papers  (2024-02-27T20:57:35Z)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
 We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm.
 arXiv  Detail & Related papers  (2023-03-22T03:28:12Z)
- PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium [58.26573117273626]
We consider the mixed Nash equilibrium problem with nonconvex-nonconcave objective functions in two-player zero-sum continuous games.
We present novel insights into particle-based algorithms for continuous distribution strategies.
 arXiv  Detail & Related papers  (2023-03-02T05:08:15Z)
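For reference, the mixed Nash equilibrium problem in a two-player zero-sum continuous game with payoff $f$ can be written as the distributional saddle-point problem (standard formulation; the notation is ours, not the paper's)

$$\min_{p \in \mathcal{P}(\mathcal{X})} \max_{q \in \mathcal{P}(\mathcal{Y})} \ \mathbb{E}_{x \sim p,\, y \sim q}\left[f(x, y)\right],$$

where $\mathcal{P}(\mathcal{X})$ and $\mathcal{P}(\mathcal{Y})$ are the sets of probability distributions over the players' continuous strategy spaces; particle-based methods approximate $p$ and $q$ by finite collections of particles.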
- Efficiently Computing Nash Equilibria in Adversarial Team Markov Games [19.717850955051837]
We introduce a class of games in which a team of identically interested players competes against an adversarial player.
This setting allows for a unifying treatment of zero-sum Markov games and Markov potential games, as formalized below.
Our main contribution is the first algorithm for computing stationary $\epsilon$-approximate Nash equilibria in adversarial team Markov games.
 arXiv  Detail & Related papers  (2022-08-03T16:41:01Z)
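Concretely, in an adversarial team Markov game the $n$ team members receive an identical reward that the adversary zero-sums against (a standard way to write this structure; the notation is assumed):

$$r_1 = r_2 = \cdots = r_n = -\,r_{\mathrm{adv}},$$

so play is zero-sum between the team as a whole and the adversary, while the identically interested team members among themselves form a common-payoff (potential) game, which is the sense in which the setting unifies the two classes.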
- Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions [145.54544979467872]
 We propose and analyze new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions.
We prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario.
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization.
 arXiv  Detail & Related papers  (2022-07-25T18:29:16Z)
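As background, fictitious play has each agent best-respond to the empirical average of the other agents' past policies (generic form, not the paper's exact update):

$$\bar{\pi}_{-i}^{t} = \frac{1}{t} \sum_{s=1}^{t} \pi_{-i}^{s}, \qquad \pi_i^{t+1} \in \mathrm{BR}_i\left(\bar{\pi}_{-i}^{t}\right);$$

the paper pairs this scheme with UCB-type optimism so that both agents can optimize their policies simultaneously under unknown transitions.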
- Efficient Algorithms for Planning with Participation Constraints [74.74967476995572]
We consider the problem of planning with participation constraints introduced in [Zhang et al., 2022].
In this problem, a principal chooses actions in a decision process, resulting in separate utilities for the principal and the agent.
We provide the first exact algorithm for this problem in finite-horizon settings, where previously only an additive $\varepsilon$-approximation was known.
 arXiv  Detail & Related papers  (2022-05-16T15:47:41Z)
- The Complexity of Markov Equilibrium in Stochastic Games [44.77547027158141]
 We show that computing approximate stationary Markov coarse correlated equilibria (CCE) in general-sum games is computationally intractable.
Our results stand in sharp contrast to normal-form games where exact CCEs are efficiently computable.
 arXiv  Detail & Related papers  (2022-04-08T10:51:01Z)
- Optimal Correlated Equilibria in General-Sum Extensive-Form Games: Fixed-Parameter Algorithms, Hardness, and Two-Sided Column-Generation [78.48747645545944]
 We study the problem of finding optimal correlated equilibria of various sorts in extensive-form games.
We introduce a new algorithm for computing optimal equilibria for all three notions.
 arXiv  Detail & Related papers  (2022-03-14T15:21:18Z)
- Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for optimization subject to convex constraints.
 arXiv  Detail & Related papers  (2021-10-20T02:57:21Z)
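For context, primal-dual CMDP methods of this family work from a Lagrangian of the form (a generic sketch; the paper's regularized formulation adds the three ingredients listed above)

$$\max_{\pi} \min_{\lambda \ge 0} \ V_r(\pi) - \lambda^{\top}\left(V_c(\pi) - b\right),$$

where $V_r(\pi)$ is the expected accumulated discounted reward, $V_c(\pi)$ the vector of expected discounted costs, and $b$ the constraint thresholds, so the inner minimization prices constraint violations.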
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
 We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently with the same solver.
 arXiv  Detail & Related papers  (2020-12-04T15:36:29Z)
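As a reminder of the underlying machinery, MAP inference in a pairwise binary MRF with coupling matrix $A$ can be posed as $\max_{x \in \{-1,+1\}^n} x^{\top} A x$ and relaxed to the standard semidefinite program (generic MAXCUT-style form; the paper's compact multi-class relaxation builds on this idea):

$$\max_{X \succeq 0,\ \operatorname{diag}(X) = \mathbf{1}} \ \operatorname{tr}(A X),$$

where rounding the solution $X$ recovers an approximate labeling.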
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
 Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
 arXiv  Detail & Related papers  (2020-07-14T12:38:21Z)
- Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints [38.2783003051101]
 This paper considers the peak Constrained Markov Decision Process (PCMDP), where the agent chooses the policy to maximize total reward in the finite horizon as well as satisfy constraints at each epoch with probability 1.
We propose a model-free algorithm that converts the PCMDP into an unconstrained problem, to which a Q-learning based approach is applied.
 arXiv  Detail & Related papers  (2020-03-11T23:23:29Z)
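To make the reduction concrete, here is a minimal tabular sketch in the same spirit, not the paper's algorithm: state-action pairs that would violate the peak constraint receive a large negative surrogate reward, so ordinary unconstrained Q-learning learns to avoid them. The toy MDP, `cost`, `threshold`, and all parameters below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the paper's algorithm): enforce a peak (per-step)
# constraint in tabular Q-learning by penalizing violating state-action
# pairs, turning the constrained problem into an unconstrained one.
n_states, n_actions = 5, 2
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transition kernel
R = rng.random((n_states, n_actions))                             # reward table
cost = rng.random((n_states, n_actions))                          # per-step cost
threshold = 0.8                                                   # peak budget

PENALTY = -1e6   # surrogate for -inf: violating actions are never greedy
alpha, gamma, eps = 0.1, 0.95, 0.1
Q = np.zeros((n_states, n_actions))

s = 0
for _ in range(50_000):
    # epsilon-greedy action selection; the penalty masks violating actions
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    r = PENALTY if cost[s, a] > threshold else R[s, a]
    s_next = rng.choice(n_states, p=P[s, a])
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

# The greedy policy after training avoids peak-violating actions wherever possible.
policy = Q.argmax(axis=1)
```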
- Public Bayesian Persuasion: Being Almost Optimal and Almost Persuasive [57.47546090379434]
 We study the public persuasion problem in the general setting with: (i) arbitrary state spaces; (ii) arbitrary action spaces; (iii) arbitrary sender's utility functions.
We provide a quasi-polynomial time bi-criteria approximation algorithm for arbitrary public persuasion problems that, in specific settings, yields a QPTAS.
 arXiv  Detail & Related papers  (2020-02-12T18:59:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.