Natural Strategic Ability in Stochastic Multi-Agent Systems
- URL: http://arxiv.org/abs/2401.12170v1
- Date: Mon, 22 Jan 2024 18:04:26 GMT
- Title: Natural Strategic Ability in Stochastic Multi-Agent Systems
- Authors: Raphaël Berthon, Joost-Pieter Katoen, Munyque Mittelmann, Aniello Murano
- Abstract summary: We consider the probabilistic temporal logics PATL and PATL* under natural strategies.
We show that, in MAS, NatPATL model-checking is NP-complete when the active coalition is restricted to deterministic strategies.
- Score: 6.685412769221565
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Strategies synthesized using formal methods can be complex and often require
infinite memory, which does not correspond to the expected behavior when trying
to model Multi-Agent Systems (MAS). To capture such behaviors, natural
strategies are a recently proposed framework that strikes a balance between the
agents' ability to strategize with memory and the model-checking complexity,
but it has so far been restricted to fully deterministic settings. For the
first time, we consider the probabilistic temporal logics PATL and PATL* under
natural strategies (NatPATL and NatPATL*, resp.). As our main result, we show
that, in stochastic MAS, NatPATL model-checking is NP-complete when the active
coalition is restricted to deterministic strategies. We also give a 2NEXPTIME
complexity result for NatPATL* with the same restriction. In the unrestricted
case, we give an EXPSPACE complexity result for NatPATL and a 3EXPSPACE
complexity result for NatPATL*.
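For readers unfamiliar with natural strategies, the sketch below is a minimal, hedged illustration (not code from the paper): it assumes the standard presentation in which a deterministic, memoryless natural strategy is an ordered list of guarded actions, the agent plays the action attached to the first guard satisfied by the current state, and the strategy's complexity is the size of its description. The class, helper names, and toy domain are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical state/action aliases for the toy example below.
State = Dict[str, bool]
Action = str

class NaturalStrategy:
    """Ordered list of guarded actions: (guard, action, guard size)."""

    def __init__(self, rules: List[Tuple[Callable[[State], bool], Action, int]]):
        self.rules = rules

    def act(self, state: State) -> Action:
        # The first rule whose guard holds in the current state fires.
        for guard, action, _size in self.rules:
            if guard(state):
                return action
        raise ValueError("no guard matched; the last guard is typically 'True'")

    def complexity(self) -> int:
        # Natural strategies are ranked by the size of their description,
        # here approximated as the total size of the guard formulas.
        return sum(size for _guard, _action, size in self.rules)

# Hypothetical usage: open the door when holding the key, otherwise keep moving.
strategy = NaturalStrategy([
    (lambda s: s.get("has_key", False) and s.get("at_door", False), "open", 2),
    (lambda s: True, "move", 1),
])
print(strategy.act({"has_key": True, "at_door": True}))  # -> open
print(strategy.complexity())                             # -> 3
```

In natural strategy logics, coalition operators are typically annotated with a bound on this complexity measure; that bound is what keeps the coalition's strategies simple while model checking stays in the classes reported above.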
Related papers
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models.
In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL.
We find that REBEL provides a unified approach to language modeling and image generation with performance similar to or stronger than PPO and DPO.
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Towards Instance-Optimality in Online PAC Reinforcement Learning [28.156332484814616]
We propose the first instance-dependent lower bound on the sample complexity required for the PAC identification of a near-optimal policy.
We demonstrate that the sample complexity of the PEDEL algorithm of Wagenmaker and Jamieson (2022) closely approaches this lower bound.
arXiv Detail & Related papers (2023-10-31T19:26:36Z)
- The Alternating-Time $\mu$-Calculus With Disjunctive Explicit Strategies [1.7725414095035827]
We study the strategic abilities of coalitions of agents in concurrent game structures.
Key ingredients of the logic are path quantifiers specifying that some coalition of agents has a joint strategy to enforce a given goal.
We extend ATLES with fixpoint operators and strategy disjunction, arriving at the alternating-time $\mu$-calculus with explicit strategies.
arXiv Detail & Related papers (2023-05-30T07:16:59Z)
- PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity scaling polynomially in the relevant problem parameters.
arXiv Detail & Related papers (2022-07-12T17:57:17Z)
- Semantic Probabilistic Layers for Neuro-Symbolic Learning [83.25785999205932]
We design a predictive layer for structured-output prediction (SOP).
It can be plugged into any neural network, guaranteeing that its predictions are consistent with a set of predefined symbolic constraints.
Our Semantic Probabilistic Layer (SPL) can model intricate correlations and hard constraints over a structured output space.
arXiv Detail & Related papers (2022-06-01T12:02:38Z)
- Computing Complexity-aware Plans Using Kolmogorov Complexity [0.9137554315375922]
We introduce complexity-aware planning for finite-horizon deterministic finite automata with rewards as outputs.
We present two algorithms obtaining low-complexity policies, where the first algorithm obtains a low-complexity optimal policy.
We evaluate the algorithms on a simple navigation task for a mobile robot, where our algorithms yield low-complexity policies that concur with intuition.
arXiv Detail & Related papers (2021-09-21T16:25:04Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- On the Sample Complexity of Reinforcement Learning with Policy Space Generalization [21.879621917722613]
We study the optimal sample complexity in large-scale Reinforcement Learning (RL) problems with policy space generalization.
Existing results show that without a generalization model, the sample complexity of an RL algorithm will inevitably depend on the cardinalities of state space and action space.
This paper proposes a new notion of eluder dimension for the policy space, which characterizes the intrinsic complexity of policy learning.
arXiv Detail & Related papers (2020-08-17T14:26:18Z)
- Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs [13.922754427601491]
We characterize the relationship between safety constraints and the number of samples needed to ensure a desired level of accuracy.
Our main finding is that, compared to the best known bounds of the unconstrained regime, the sample complexity of constrained RL algorithms is increased by a factor that is logarithmic in the number of constraints.
arXiv Detail & Related papers (2020-08-01T18:17:08Z)
- Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity [67.02490430380415]
We show that model-based MARL achieves a sample complexity of $\tilde O(|S||A||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error.
We also show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge.
arXiv Detail & Related papers (2020-07-15T03:25:24Z)
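As a rough, back-of-the-envelope illustration (not taken from the paper), the snippet below plugs hypothetical problem sizes into the $\tilde O(|S||A||B|(1-\gamma)^{-3}\epsilon^{-2})$ bound above, dropping constants and logarithmic factors; the function name and example numbers are assumptions made for illustration only.

```python
def marl_sample_bound(n_states: int, n_actions_a: int, n_actions_b: int,
                      gamma: float, eps: float) -> float:
    """Order-of-magnitude estimate of |S||A||B| * (1-gamma)^-3 * eps^-2."""
    return n_states * n_actions_a * n_actions_b / ((1 - gamma) ** 3 * eps ** 2)

# Hypothetical sizes: 100 states, 5 actions per player, gamma=0.9, eps=0.1.
print(f"{marl_sample_bound(100, 5, 5, gamma=0.9, eps=0.1):.1e}")  # ~2.5e+08 samples
```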
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.