Related papers: The Alternating-Time \mu-Calculus With Disjunctive Explicit Strategies

The Alternating-Time \mu-Calculus With Disjunctive Explicit Strategies

URL: http://arxiv.org/abs/2305.18795v1
Date: Tue, 30 May 2023 07:16:59 GMT
Title: The Alternating-Time \mu-Calculus With Disjunctive Explicit Strategies
Authors: Merlin Humml, Lutz Schr\"oder, Dirk Pattinson
Abstract summary: We study the strategic abilities of coalitions of agents in concurrent game structures. Key ingredient of the logic are path quantifiers specifying that some coalition of agents has a joint strategy to enforce a given goal. We extend ATLES with fixpoint operators and strategy disjunction, arriving at the alternating-time $mu$-calculus with explicit strategies.
Score: 1.7725414095035827
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Alternating-time temporal logic (ATL) and its extensions, including the alternating-time $\mu$-calculus (AMC), serve the specification of the strategic abilities of coalitions of agents in concurrent game structures. The key ingredient of the logic are path quantifiers specifying that some coalition of agents has a joint strategy to enforce a given goal. This basic setup has been extended to let some of the agents (revocably) commit to using certain named strategies, as in ATL with explicit strategies (ATLES). In the present work, we extend ATLES with fixpoint operators and strategy disjunction, arriving at the alternating-time $\mu$-calculus with disjunctive explicit strategies (AMCDES), which allows for a more flexible formulation of temporal properties (e.g. fairness) and, through strategy disjunction, a form of controlled nondeterminism in commitments. Our main result is an ExpTime upper bound for satisfiability checking (which is thus ExpTime-complete). We also prove upper bounds QP (quasipolynomial time) and NP $\cap$ coNP for model checking under fixed interpretations of explicit strategies, and NP under open interpretation. Our key technical tool is a treatment of the AMCDES within the generic framework of coalgebraic logic, which in particular reduces the analysis of most reasoning tasks to the treatment of a very simple one-step logic featuring only propositional operators and next-step operators without nesting; we give a new model construction principle for this one-step logic that relies on a set-valued variant of first-order resolution.

Related papers

PixelThink: Towards Efficient Chain-of-Pixel Reasoning [70.32510083790069]
PixelThink is a simple yet effective scheme that integrates externally estimated task difficulty and internally measured model uncertainty.<n>It learns to compress reasoning length in accordance with scene complexity and predictive confidence.<n> Experimental results demonstrate that the proposed approach improves both reasoning efficiency and overall segmentation performance.
arXiv Detail & Related papers (2025-05-29T17:55:49Z)
T$^2$: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering [49.5489716597489]
We present T$2$: Think-to-Think, a novel framework that dynamically adapts reasoning depth based on question complexity.<n>T$2$ works through four key steps: decomposing questions into structural elements, generating similar examples with candidate reasoning strategies, evaluating these strategies against multiple criteria, and applying the most appropriate strategy to the original question.
arXiv Detail & Related papers (2025-05-23T03:18:02Z)
Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution [59.39066657300045]
Chain-of-Thought (CoT) prompting enhances the reasoning of large language models (LLMs) by decomposing problems into sequential steps. We propose Syzygy of Thoughts (SoT)-a novel framework that extends CoT by introducing auxiliary, interrelated reasoning paths. SoT captures deeper logical dependencies, enabling more robust and structured problem-solving.
arXiv Detail & Related papers (2025-04-13T13:35:41Z)
PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving [89.60370366013142]
We propose PlanGEN, a model-agnostic and easily scalable agent framework with three key components: constraint, verification, and selection agents. Specifically, our approach proposes constraint-guided iterative verification to enhance performance of inference-time algorithms.
arXiv Detail & Related papers (2025-02-22T06:21:56Z)
LLM-Generated Heuristics for AI Planning: Do We Even Need Domain-Independence Anymore? [87.71321254733384]
Large language models (LLMs) can generate planning approaches tailored to specific planning problems. LLMs can achieve state-of-the-art performance on some standard IPC domains. We discuss whether these results signify a paradigm shift and how they can complement existing planning approaches.
arXiv Detail & Related papers (2025-01-30T22:21:12Z)
Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-ite convergence guarantees under (weak) gradient domination assumptions. We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z)
Break the Chain: Large Language Models Can be Shortcut Reasoners [18.047917626825548]
Chain-of-Thought (CoT) reasoning utilize complex modules but are hampered by high token consumption, limited applicability, and challenges in thinking. This paper conducts a critical evaluation of CoT prompting, extending beyond arithmetic to include complex logical and commonsense reasoning tasks. We propose the integration of human-likes and shortcuts into language models (LMs) through "break the chain" strategies.
arXiv Detail & Related papers (2024-06-04T14:02:53Z)
Hyper Strategy Logic [4.726777092009553]
Strategy logic (SL) is a powerful temporal logic that enables strategic reasoning in multi-agent systems. We present Hyper Strategy Logic (HyperSL), a strategy logic where the outcome of multiple strategy profiles can be compared w.r.t. a hyperproperty. We show that HyperSL can capture important properties that cannot be expressed in SL, including non-interference, quantitative Nash equilibria, optimal adversarial planning, and reasoning under imperfect information.
arXiv Detail & Related papers (2024-03-20T16:47:53Z)
Natural Strategic Ability in Stochastic Multi-Agent Systems [6.685412769221565]
We consider the probabilistic temporal logics PATL and PATL* under natural strategies. We show that, in MAS, NatPATL model-checking is NPcomplete when the active coalition is restricted to deterministic strategies.
arXiv Detail & Related papers (2024-01-22T18:04:26Z)
Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes [5.471640959988549]
We first introduce an optimal control theory for partially observable Markov decision processes. We provide a structured methodology for synthesizing policies that maximize a cumulative reward. We then build on this approach to design an optimal control framework for logically constrained multi-agent settings.
arXiv Detail & Related papers (2023-05-24T05:15:36Z)
Policy Optimization with Linear Temporal Logic Constraints [37.27882290236194]
We study the problem of policy optimization with linear temporal logic constraints. We develop a model-based approach that enjoys a sample complexity analysis for guaranteeing both task satisfaction and cost optimality.
arXiv Detail & Related papers (2022-06-20T02:58:02Z)
Revisiting GANs by Best-Response Constraint: Perspective, Methodology, and Application [49.66088514485446]
Best-Response Constraint (BRC) is a general learning framework to explicitly formulate the potential dependency of the generator on the discriminator. We show that even with different motivations and formulations, a variety of existing GANs ALL can be uniformly improved by our flexible BRC methodology.
arXiv Detail & Related papers (2022-05-20T12:42:41Z)
Understanding Robust Generalization in Learning Regular Languages [85.95124524975202]
We study robust generalization in the context of using recurrent neural networks to learn regular languages. We propose a compositional strategy to address this. We theoretically prove that the compositional strategy generalizes significantly better than the end-to-end strategy.
arXiv Detail & Related papers (2022-02-20T02:50:09Z)
Universal Online Convex Optimization Meets Second-order Bounds [74.0120666722487]
We propose a simple strategy for universal online convex optimization. The key idea is to construct a set of experts to process the original online functions, and deploy a meta-algorithm over the linearized losses. In this way, we can plug in off-the-shelf online solvers as black-box experts to deliver problem-dependent regret bounds.
arXiv Detail & Related papers (2021-05-08T11:43:49Z)
Multi-Agent Reinforcement Learning with Temporal Logic Specifications [65.79056365594654]
We study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment. We develop the first multi-agent reinforcement learning technique for temporal logic specifications. We provide correctness and convergence guarantees for our main algorithm.
arXiv Detail & Related papers (2021-02-01T01:13:03Z)
Goal Kernel Planning: Linearly-Solvable Non-Markovian Policies for Logical Tasks with Goal-Conditioned Options [54.40780660868349]
We introduce a compositional framework called Linearly-Solvable Goal Kernel Dynamic Programming (LS-GKDP)<n>LS-GKDP combines the Linearly-Solvable Markov Decision Process (LMDP) formalism with the Options Framework of Reinforcement Learning.<n>We show how an LMDP with a goal kernel enables the efficient optimization of meta-policies in a lower-dimensional subspace defined by the task grounding.
arXiv Detail & Related papers (2020-07-06T05:13:20Z)
Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs) The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.