Guessing Winning Policies in LTL Synthesis by Semantic Learning
- URL: http://arxiv.org/abs/2305.15109v1
- Date: Wed, 24 May 2023 12:57:53 GMT
- Title: Guessing Winning Policies in LTL Synthesis by Semantic Learning
- Authors: Jan Kretinsky, Tobias Meggendorfer, Maximilian Prokop, Sabine Rieder
- Abstract summary: We provide a learning-based technique for guessing a winning strategy in a parity game originating from an LTL synthesis problem.
Not only can the guessed strategy be applied as best-effort in cases where the game's huge size prohibits rigorous approaches, but it can also increase the scalability of rigorous synthesis in several ways.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide a learning-based technique for guessing a winning strategy in a
parity game originating from an LTL synthesis problem. A cheaply obtained guess
can be useful in several applications. Not only can the guessed strategy be
applied as best-effort in cases where the game's huge size prohibits rigorous
approaches, but it can also increase the scalability of rigorous LTL synthesis
in several ways. Firstly, checking whether a guessed strategy is winning is
easier than constructing one. Secondly, even if the guess is wrong in some
places, it can be fixed by strategy iteration faster than constructing one from
scratch. Thirdly, the guess can be used in on-the-fly approaches to prioritize
exploration in the most fruitful directions.
In contrast to previous works, we (i) reflect the highly structured logical
information in the game's states, the so-called semantic labelling, coming from
the recent LTL-to-automata translations, and (ii) learn to reflect it properly
by learning from previously solved games, bringing the solving process closer
to human-like reasoning.
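The guess-check-repair pipeline the abstract describes can be pictured concretely. Below is a minimal sketch, not the paper's actual implementation: it assumes a max-parity game (player Even wins a play iff the highest priority seen infinitely often is even) stored as a networkx digraph, with a hypothetical `score` function standing in for the learned semantic model.

```python
# A minimal sketch of the guess-then-check idea. Assumptions: max-parity
# winning condition; `score` is a stand-in for the learned model that reads
# the semantic labelling of states.
import networkx as nx

def guess_strategy(game, owner, score):
    """For each Even-owned vertex, pick the successor the scorer ranks best."""
    return {v: max(game.successors(v), key=lambda w: score(v, w))
            for v in game.nodes if owner[v] == 0}

def strategy_is_winning(game, owner, priority, strategy):
    """Checking a guess is cheaper than solving: fix Even's moves and verify
    that the one-player game left to the opponent has no cycle whose maximal
    priority is odd (checked globally here, for simplicity)."""
    induced = nx.DiGraph()
    for v in game.nodes:
        succs = [strategy[v]] if owner[v] == 0 else game.successors(v)
        induced.add_edges_from((v, w) for w in succs)
    for p in sorted({priority[v] for v in induced if priority[v] % 2 == 1}):
        # A cycle with maximal priority p exists iff, among vertices of
        # priority <= p, some vertex of priority p sits in an SCC with a cycle.
        sub = induced.subgraph(v for v in induced if priority[v] <= p)
        for scc in nx.strongly_connected_components(sub):
            has_cycle = len(scc) > 1 or any(sub.has_edge(v, v) for v in scc)
            if has_cycle and any(priority[v] == p for v in scc):
                return False  # guess loses somewhere
    return True
```

If the check fails, the cheap guess is still a useful starting point: strategy iteration only needs to repair the places where the guess loses, rather than constructing a strategy from scratch.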
Related papers
- LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
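A minimal sketch of the neurosymbolic split this describes: a language model translates natural language into first-order logic, and an external prover settles entailment. Here the FOL is hand-written where LINC would have the model produce it, and Z3 stands in for the prover; both are assumptions, not the paper's actual pipeline.

```python
# LM-translates / prover-decides division of labour, with Z3 as a stand-in
# prover and hand-written FOL where a language model would generate it.
from z3 import (BoolSort, Const, DeclareSort, ForAll, Function,
                Implies, Not, Solver, unsat)

Obj = DeclareSort('Obj')
Human = Function('Human', Obj, BoolSort())
Mortal = Function('Mortal', Obj, BoolSort())
socrates = Const('socrates', Obj)
x = Const('x', Obj)

premises = [ForAll([x], Implies(Human(x), Mortal(x))),  # "All humans are mortal."
            Human(socrates)]                            # "Socrates is human."
conclusion = Mortal(socrates)                           # "Socrates is mortal."

s = Solver()
s.add(premises)
s.add(Not(conclusion))  # entailment holds iff premises + not(conclusion) is UNSAT
print("entailed" if s.check() == unsat else "not entailed")
```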
arXiv Detail & Related papers (2023-10-23T17:58:40Z)
- On Optimal Strategies for Wordle and General Guessing Games
We develop a method for finding optimal strategies for guessing games while avoiding an exhaustive search.
This work is developed to apply to any guessing game, but we use Wordle as an example to present concrete results.
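For reference, a common way to avoid exhaustive search in such games is to score guesses by the entropy of the feedback they induce over the remaining candidate secrets. The sketch below illustrates that greedy heuristic; it is a standard pruning baseline, not the paper's exact optimal-strategy method.

```python
# Greedy entropy heuristic for guessing games, shown for Wordle-style feedback.
from collections import Counter
from math import log2

def feedback(guess, secret):
    """2 = letter in the right spot, 1 = in the word elsewhere, 0 = absent.
    (Simplified: ignores duplicate-letter edge cases for brevity.)"""
    return tuple(2 if g == s else (1 if g in secret else 0)
                 for g, s in zip(guess, secret))

def best_guess(candidates, allowed_guesses):
    """Pick the guess whose feedback splits the candidate secrets most evenly."""
    def entropy(guess):
        counts = Counter(feedback(guess, s) for s in candidates)
        n = len(candidates)
        return -sum(c / n * log2(c / n) for c in counts.values())
    return max(allowed_guesses, key=entropy)

def prune(candidates, guess, observed):
    """Keep only the secrets consistent with the feedback actually seen."""
    return [s for s in candidates if feedback(guess, s) == observed]
```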
arXiv Detail & Related papers (2023-05-16T02:30:10Z)
- Generalised agent for solving higher board states of tic tac toe using Reinforcement Learning
This study aims to provide a generalized algorithm for higher board states of tic tac toe that makes precise moves in a short time.
The idea is to pose the tic tac toe game as a well-posed learning problem.
The study and its results are promising, giving a high win to draw ratio with each epoch of training.
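Posing the game as a well-posed learning problem means fixing states, actions, rewards, and an update rule. A minimal tabular Q-learning sketch of that framing follows; the hyperparameters and state encoding are illustrative, not the study's actual setup.

```python
# Tic-tac-toe posed as a tabular Q-learning problem; the state is assumed
# to be the board serialized to a hashable value.
import random
from collections import defaultdict

Q = defaultdict(float)                 # (state, action) -> estimated value
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def choose(state, actions):
    """Epsilon-greedy move selection over the legal actions."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """One Q-learning step: move the estimate toward reward plus the
    discounted value of the best follow-up action (0 if the game ended)."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```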
arXiv Detail & Related papers (2022-12-23T10:58:27Z)
- Near-Optimal No-Regret Learning for General Convex Games
We show that near-optimal regret can be obtained for general convex and compact strategy sets.
Our dynamics are an instantiation of optimistic follow-the-regularized-leader over an appropriately lifted space.
Even in those special cases where prior results apply, our algorithm improves over the state-of-the-art regret.
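For reference, the generic optimistic follow-the-regularized-leader update is given below; the paper applies it over a lifted space, while this is the textbook form, with regularizer $R$, step size $\eta$, and a loss prediction $m_t$, commonly the most recent loss:

$$x_t \;=\; \arg\min_{x \in \mathcal{X}} \left\{ \eta \left\langle \sum_{s=1}^{t-1} \ell_s + m_t,\; x \right\rangle + R(x) \right\}, \qquad m_t = \ell_{t-1}.$$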
arXiv Detail & Related papers (2022-06-17T12:58:58Z)
- Strategizing against Learners in Bayesian Games
We study repeated two-player games where one of the players, the learner, employs a no-regret learning strategy.
We consider general Bayesian games, where the payoffs of both players could depend on the type.
arXiv Detail & Related papers (2022-05-17T18:10:25Z)
- The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
We consider the best-of-both-worlds problem for learning an episodic Markov Decision Process through $T$ episodes.
Recent work by [Jin and Luo, 2020] achieves this goal when the fixed transition is known.
In this work, we resolve this open problem by using the same Follow-the-Regularized-Leader (FTRL) framework together with a set of new techniques.
arXiv Detail & Related papers (2021-06-08T05:46:35Z)
- First-Order Problem Solving through Neural MCTS based Reinforcement Learning
Many problems can be described using interpreted FOL statements and can be mapped into a semantic game.
We propose a general framework, Persephone, to map the FOL description of a problem to a semantic game.
Our goal for Persephone is to make it tabula-rasa, mapping a problem stated in interpreted FOL to a solution without human intervention.
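The mapping from interpreted FOL to a game is the classical semantic (Hintikka) game: a Verifier picks witnesses for existentials and disjuncts, a Falsifier picks for universals and conjuncts, and the formula is true iff the Verifier can always win. A minimal sketch over a finite domain follows; the tuple encoding of formulas is hypothetical, and negation is simplified rather than role-swapping.

```python
# Semantic-game evaluation of interpreted FOL over a finite domain.
def verifier_wins(formula, env, domain):
    op = formula[0]
    if op == 'atom':        # ('atom', predicate_on_env)
        return formula[1](env)
    if op == 'not':         # simplified; proper game semantics swaps the roles
        return not verifier_wins(formula[1], env, domain)
    if op == 'and':         # Falsifier picks the conjunct to attack
        return all(verifier_wins(f, env, domain) for f in formula[1:])
    if op == 'or':          # Verifier picks the disjunct to defend
        return any(verifier_wins(f, env, domain) for f in formula[1:])
    if op == 'forall':      # ('forall', var, body): Falsifier picks the value
        return all(verifier_wins(formula[2], {**env, formula[1]: d}, domain)
                   for d in domain)
    if op == 'exists':      # Verifier picks the witness
        return any(verifier_wins(formula[2], {**env, formula[1]: d}, domain)
                   for d in domain)
    raise ValueError(f"unknown connective: {op}")

# Example: "for every x there is a y with y = x + 1 (mod 5)" over {0,...,4}.
phi = ('forall', 'x',
       ('exists', 'y', ('atom', lambda e: e['y'] == (e['x'] + 1) % 5)))
print(verifier_wins(phi, {}, range(5)))  # True
```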
arXiv Detail & Related papers (2021-01-11T19:54:06Z)
- Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games
Solution concepts of traditional game theory assume entirely rational players; therefore, their ability to exploit subrational opponents is limited.
This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games.
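For context, "quantal" standardly refers to the logit quantal response model: instead of best-responding, the subrational opponent mixes actions with probabilities exponentially weighted by payoff,

$$\mathrm{QBR}_\lambda(\pi)(a) \;=\; \frac{\exp\big(\lambda\, u(a,\pi)\big)}{\sum_{a'} \exp\big(\lambda\, u(a',\pi)\big)},$$

where $u(a,\pi)$ is the payoff of action $a$ against the other player's strategy $\pi$, and $\lambda \ge 0$ interpolates between uniform play ($\lambda = 0$) and exact best response ($\lambda \to \infty$).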
arXiv Detail & Related papers (2020-09-30T09:14:56Z)
- Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games
We consider the problem of online learning and its application to solving minimax games.
For the online learning problem, we analyze Follow the Perturbed Leader, which relies on a perturbed oracle that computes the best response.
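Concretely, Follow the Perturbed Leader plays the best response, delegated to an oracle, against the cumulative loss shifted by a random perturbation. In its standard textbook form (not necessarily the paper's exact variant):

$$x_t \;=\; \arg\min_{x \in \mathcal{X}} \;\Big\langle \sum_{s=1}^{t-1} \ell_s - \sigma_t,\; x \Big\rangle,$$

where $\sigma_t$ is the random perturbation and the $\arg\min$ is exactly the best-response oracle call.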
arXiv Detail & Related papers (2020-06-13T02:55:41Z)
- Optimally Deceiving a Learning Leader in Stackelberg Games
Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs under various scenarios of learning interaction between leader and follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z)
- Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures
We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods.
arXiv Detail & Related papers (2020-05-15T09:54:09Z)