Anytime-Constrained Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2410.23637v1
- Date: Thu, 31 Oct 2024 05:07:01 GMT
- Title: Anytime-Constrained Multi-Agent Reinforcement Learning
- Authors: Jeremy McMahan, Xiaojin Zhu
- Abstract summary: We introduce anytime constraints to the multi-agent setting with the corresponding solution concept being anytime-constrained equilibrium (ACE).
We present a comprehensive theory of anytime-constrained Markov games, which includes a computational characterization of feasible policies.
We also develop the first theory of efficient computation for action-constrained Markov games, which may be of independent interest.
- Score: 6.981971551979697
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce anytime constraints to the multi-agent setting with the corresponding solution concept being anytime-constrained equilibrium (ACE). Then, we present a comprehensive theory of anytime-constrained Markov games, which includes (1) a computational characterization of feasible policies, (2) a fixed-parameter tractable algorithm for computing ACE, and (3) a polynomial-time algorithm for approximately computing feasible ACE. Since computing a feasible policy is NP-hard even for two-player zero-sum games, our approximation guarantees are the best possible under worst-case analysis. We also develop the first theory of efficient computation for action-constrained Markov games, which may be of independent interest.
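For intuition, here is a minimal formalization of an anytime constraint, adapted from the authors' single-agent formulation; the per-step cost $c_i$, budget $B_i$, and horizon $H$ are notation assumed here for illustration. A joint policy $\pi$ is anytime-feasible if every agent's accumulated cost respects its budget at every step almost surely:

$$\Pr_{\pi}\left(\sum_{t=1}^{k} c_i(s_t, a_t) \le B_i \ \text{ for all } k \in [H] \text{ and all agents } i\right) = 1.$$

Roughly, an ACE is then an equilibrium within this feasible set: no agent can profitably deviate to another policy that keeps the joint play anytime-feasible.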
 
      
Related papers
- Efficiently Solving Turn-Taking Stochastic Games with Extensive-Form Correlation [52.16923999754027]
 We give an algorithm for computing a Stackelberg extensive-form correlated equilibrium.
We also give an efficient algorithm for approximately computing an optimal extensive-form correlated equilibrium.
Our algorithm for approximately optimal EFCE is, to our knowledge, the first that achieves 3 desiderata simultaneously.
 arXiv  Detail & Related papers  (2024-12-22T09:12:05Z)
- Barriers to Welfare Maximization with No-Regret Learning [68.66209476382213]
 We prove lower bounds for computing a near-optimal $T$-sparse CCE.
In particular, we show that the inapproximability of maximum clique precludes attaining any non-trivial sparsity in time.
 arXiv  Detail & Related papers  (2024-11-04T00:34:56Z)
- Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time [1.223779595809275]
 Our algorithm efficiently computes near-optimal deterministic policies for constrained reinforcement learning problems.
Our work answers three open questions spanning two long-standing lines of research.
 arXiv  Detail & Related papers  (2024-05-23T05:27:51Z)
- Independent Learning in Constrained Markov Potential Games [19.083595175045073]
 Constrained Markov games offer a formal framework for modeling multi-agent reinforcement learning problems.
We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria.
 arXiv  Detail & Related papers  (2024-02-27T20:57:35Z)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
 We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm.
 arXiv  Detail & Related papers  (2023-03-22T03:28:12Z)
- PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium [58.26573117273626]
We consider the mixed Nash equilibrium problem with nonconvex-nonconcave objective functions in two-player zero-sum continuous games.
We present novel insights into particle-based algorithms for continuous distribution strategies.
 arXiv  Detail & Related papers  (2023-03-02T05:08:15Z)
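For reference, the mixed Nash equilibrium problem in a two-player zero-sum continuous game with payoff $f$ can be written as the distributional saddle-point problem (standard formulation; the notation is ours, not the paper's)

$$\min_{p \in \mathcal{P}(\mathcal{X})} \max_{q \in \mathcal{P}(\mathcal{Y})} \ \mathbb{E}_{x \sim p,\, y \sim q}\left[f(x, y)\right],$$

where $\mathcal{P}(\mathcal{X})$ and $\mathcal{P}(\mathcal{Y})$ are the sets of probability distributions over the players' continuous strategy spaces; particle-based methods approximate $p$ and $q$ by finite collections of particles.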
- Efficiently Computing Nash Equilibria in Adversarial Team Markov Games [19.717850955051837]
We introduce a class of games in which a team of identically interested players competes against an adversarial player.
This setting allows for a unifying treatment of zero-sum Markov games and Markov potential games, as formalized below.
Our main contribution is the first algorithm for computing stationary $\epsilon$-approximate Nash equilibria in adversarial team Markov games.
 arXiv  Detail & Related papers  (2022-08-03T16:41:01Z)
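Concretely, in an adversarial team Markov game the $n$ team members receive an identical reward that the adversary zero-sums against (a standard way to write this structure; the notation is assumed):

$$r_1 = r_2 = \cdots = r_n = -\,r_{\mathrm{adv}},$$

so play is zero-sum between the team as a whole and the adversary, while the identically interested team members among themselves form a common-payoff (potential) game, which is the sense in which the setting unifies the two classes.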
- Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions [145.54544979467872]
 We propose and analyze new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions.
We prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario.
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization.
 arXiv  Detail & Related papers  (2022-07-25T18:29:16Z)
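As background, fictitious play has each agent best-respond to the empirical average of the other agents' past policies (generic form, not the paper's exact update):

$$\bar{\pi}_{-i}^{t} = \frac{1}{t} \sum_{s=1}^{t} \pi_{-i}^{s}, \qquad \pi_i^{t+1} \in \mathrm{BR}_i\left(\bar{\pi}_{-i}^{t}\right);$$

the paper pairs this scheme with UCB-type optimism so that both agents can optimize their policies simultaneously under unknown transitions.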
- Efficient Algorithms for Planning with Participation Constraints [74.74967476995572]
We consider the problem of planning with participation constraints introduced in [Zhang et al., 2022].
In this problem, a principal chooses actions in a decision process, resulting in separate utilities for the principal and the agent.
We provide the first exact algorithm for this problem in finite-horizon settings, where previously only an additive $\varepsilon$-approximation was known.
 arXiv  Detail & Related papers  (2022-05-16T15:47:41Z)
- The Complexity of Markov Equilibrium in Stochastic Games [44.77547027158141]
 We show that computing approximate stationary Markov coarse correlated equilibria (CCE) in general-sum games is computationally intractable.
Our results stand in sharp contrast to normal-form games where exact CCEs are efficiently computable.
 arXiv  Detail & Related papers  (2022-04-08T10:51:01Z)
- Optimal Correlated Equilibria in General-Sum Extensive-Form Games: Fixed-Parameter Algorithms, Hardness, and Two-Sided Column-Generation [78.48747645545944]
 We study the problem of finding optimal correlated equilibria of various sorts in extensive-form games.
We introduce a new algorithm for computing optimal equilibria for all three notions.
 arXiv  Detail & Related papers  (2022-03-14T15:21:18Z)
- Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for optimization subject to convex constraints.
 arXiv  Detail & Related papers  (2021-10-20T02:57:21Z)
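For context, primal-dual CMDP methods of this family work from a Lagrangian of the form (a generic sketch; the paper's regularized formulation adds the three ingredients listed above)

$$\max_{\pi} \min_{\lambda \ge 0} \ V_r(\pi) - \lambda^{\top}\left(V_c(\pi) - b\right),$$

where $V_r(\pi)$ is the expected accumulated discounted reward, $V_c(\pi)$ the vector of expected discounted costs, and $b$ the constraint thresholds, so the inner minimization prices constraint violations.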
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
 We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently with the same solver.
 arXiv  Detail & Related papers  (2020-12-04T15:36:29Z)
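As a reminder of the underlying machinery, MAP inference in a pairwise binary MRF with coupling matrix $A$ can be posed as $\max_{x \in \{-1,+1\}^n} x^{\top} A x$ and relaxed to the standard semidefinite program (generic MAXCUT-style form; the paper's compact multi-class relaxation builds on this idea):

$$\max_{X \succeq 0,\ \operatorname{diag}(X) = \mathbf{1}} \ \operatorname{tr}(A X),$$

where rounding the solution $X$ recovers an approximate labeling.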
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
 Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
 arXiv  Detail & Related papers  (2020-07-14T12:38:21Z)
- Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints [38.2783003051101]
 This paper considers the peak Constrained Markov Decision Process (PCMDP), where the agent chooses the policy to maximize total reward in the finite horizon as well as satisfy constraints at each epoch with probability 1.
We propose a model-free algorithm that converts the PCMDP into an unconstrained problem, to which a Q-learning based approach is applied.
 arXiv  Detail & Related papers  (2020-03-11T23:23:29Z)
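To make the reduction concrete, here is a minimal tabular sketch in the same spirit, not the paper's algorithm: state-action pairs that would violate the peak constraint receive a large negative surrogate reward, so ordinary unconstrained Q-learning learns to avoid them. The toy MDP, `cost`, `threshold`, and all parameters below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the paper's algorithm): enforce a peak (per-step)
# constraint in tabular Q-learning by penalizing violating state-action
# pairs, turning the constrained problem into an unconstrained one.
n_states, n_actions = 5, 2
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transition kernel
R = rng.random((n_states, n_actions))                             # reward table
cost = rng.random((n_states, n_actions))                          # per-step cost
threshold = 0.8                                                   # peak budget

PENALTY = -1e6   # surrogate for -inf: violating actions are never greedy
alpha, gamma, eps = 0.1, 0.95, 0.1
Q = np.zeros((n_states, n_actions))

s = 0
for _ in range(50_000):
    # epsilon-greedy action selection; the penalty masks violating actions
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    r = PENALTY if cost[s, a] > threshold else R[s, a]
    s_next = rng.choice(n_states, p=P[s, a])
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

# The greedy policy after training avoids peak-violating actions wherever possible.
policy = Q.argmax(axis=1)
```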
- Public Bayesian Persuasion: Being Almost Optimal and Almost Persuasive [57.47546090379434]
 We study the public persuasion problem in the general setting with: (i) arbitrary state spaces; (ii) arbitrary action spaces; (iii) arbitrary sender's utility functions.
We provide a quasi-polynomial time bi-criteria approximation algorithm for arbitrary public persuasion problems that, in specific settings, yields a QPTAS.
 arXiv  Detail & Related papers  (2020-02-12T18:59:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.