Guessing Winning Policies in LTL Synthesis by Semantic Learning
- URL: http://arxiv.org/abs/2305.15109v1
- Date: Wed, 24 May 2023 12:57:53 GMT
- Title: Guessing Winning Policies in LTL Synthesis by Semantic Learning
- Authors: Jan Kretinsky, Tobias Meggendorfer, Maximilian Prokop, Sabine Rieder
- Abstract summary: We provide a learning-based technique for guessing a winning strategy in a parity game originating from an LTL synthesis problem.
Not only can the guessed strategy be applied as best-effort in cases where the game's huge size prohibits rigorous approaches, but it can also increase the scalability of rigorous synthesis in several ways.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide a learning-based technique for guessing a winning strategy in a
parity game originating from an LTL synthesis problem. A cheaply obtained guess
can be useful in several applications. Not only can the guessed strategy be
applied as best-effort in cases where the game's huge size prohibits rigorous
approaches, but it can also increase the scalability of rigorous LTL synthesis
in several ways. Firstly, checking whether a guessed strategy is winning is
easier than constructing one. Secondly, even if the guess is wrong in some
places, it can be fixed by strategy iteration faster than constructing one from
scratch. Thirdly, the guess can be used in on-the-fly approaches to prioritize
exploration in the most fruitful directions.
In contrast to previous works, we (i) reflect the highly structured logical
information in the game's states, the so-called semantic labelling, coming from
the recent LTL-to-automata translations, and (ii) learn to reflect it properly
by learning from previously solved games, bringing the solving process closer
to human-like reasoning.
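The guess-check-repair pipeline the abstract describes can be pictured concretely. Below is a minimal sketch, not the paper's actual implementation: it assumes a max-parity game (player Even wins a play iff the highest priority seen infinitely often is even) stored as a networkx digraph, with a hypothetical `score` function standing in for the learned semantic model.

```python
# A minimal sketch of the guess-then-check idea. Assumptions: max-parity
# winning condition; `score` is a stand-in for the learned model that reads
# the semantic labelling of states.
import networkx as nx

def guess_strategy(game, owner, score):
    """For each Even-owned vertex, pick the successor the scorer ranks best."""
    return {v: max(game.successors(v), key=lambda w: score(v, w))
            for v in game.nodes if owner[v] == 0}

def strategy_is_winning(game, owner, priority, strategy):
    """Checking a guess is cheaper than solving: fix Even's moves and verify
    that the one-player game left to the opponent has no cycle whose maximal
    priority is odd (checked globally here, for simplicity)."""
    induced = nx.DiGraph()
    for v in game.nodes:
        succs = [strategy[v]] if owner[v] == 0 else game.successors(v)
        induced.add_edges_from((v, w) for w in succs)
    for p in sorted({priority[v] for v in induced if priority[v] % 2 == 1}):
        # A cycle with maximal priority p exists iff, among vertices of
        # priority <= p, some vertex of priority p sits in an SCC with a cycle.
        sub = induced.subgraph(v for v in induced if priority[v] <= p)
        for scc in nx.strongly_connected_components(sub):
            has_cycle = len(scc) > 1 or any(sub.has_edge(v, v) for v in scc)
            if has_cycle and any(priority[v] == p for v in scc):
                return False  # guess loses somewhere
    return True
```

If the check fails, the cheap guess is still a useful starting point: strategy iteration only needs to repair the places where the guess loses, rather than constructing a strategy from scratch.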
Related papers
- LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
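A minimal sketch of the neurosymbolic split this describes: a language model translates natural language into first-order logic, and an external prover settles entailment. Here the FOL is hand-written where LINC would have the model produce it, and Z3 stands in for the prover; both are assumptions, not the paper's actual pipeline.

```python
# LM-translates / prover-decides division of labour, with Z3 as a stand-in
# prover and hand-written FOL where a language model would generate it.
from z3 import (BoolSort, Const, DeclareSort, ForAll, Function,
                Implies, Not, Solver, unsat)

Obj = DeclareSort('Obj')
Human = Function('Human', Obj, BoolSort())
Mortal = Function('Mortal', Obj, BoolSort())
socrates = Const('socrates', Obj)
x = Const('x', Obj)

premises = [ForAll([x], Implies(Human(x), Mortal(x))),  # "All humans are mortal."
            Human(socrates)]                            # "Socrates is human."
conclusion = Mortal(socrates)                           # "Socrates is mortal."

s = Solver()
s.add(premises)
s.add(Not(conclusion))  # entailment holds iff premises + not(conclusion) is UNSAT
print("entailed" if s.check() == unsat else "not entailed")
```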
arXiv Detail & Related papers (2023-10-23T17:58:40Z)
- On Optimal Strategies for Wordle and General Guessing Games
We develop a method for finding optimal strategies for guessing games while avoiding an exhaustive search.
This work is developed to apply to any guessing game, but we use Wordle as an example to present concrete results.
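For reference, a common way to avoid exhaustive search in such games is to score guesses by the entropy of the feedback they induce over the remaining candidate secrets. The sketch below illustrates that greedy heuristic; it is a standard pruning baseline, not the paper's exact optimal-strategy method.

```python
# Greedy entropy heuristic for guessing games, shown for Wordle-style feedback.
from collections import Counter
from math import log2

def feedback(guess, secret):
    """2 = letter in the right spot, 1 = in the word elsewhere, 0 = absent.
    (Simplified: ignores duplicate-letter edge cases for brevity.)"""
    return tuple(2 if g == s else (1 if g in secret else 0)
                 for g, s in zip(guess, secret))

def best_guess(candidates, allowed_guesses):
    """Pick the guess whose feedback splits the candidate secrets most evenly."""
    def entropy(guess):
        counts = Counter(feedback(guess, s) for s in candidates)
        n = len(candidates)
        return -sum(c / n * log2(c / n) for c in counts.values())
    return max(allowed_guesses, key=entropy)

def prune(candidates, guess, observed):
    """Keep only the secrets consistent with the feedback actually seen."""
    return [s for s in candidates if feedback(guess, s) == observed]
```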
arXiv Detail & Related papers (2023-05-16T02:30:10Z)
- Generalised agent for solving higher board states of tic tac toe using Reinforcement Learning
This study aims to provide a generalized algorithm for higher board states of tic tac toe that makes precise moves in a short time.
The idea is to pose the tic tac toe game as a well-posed learning problem.
The study and its results are promising, giving a high win to draw ratio with each epoch of training.
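Posing the game as a well-posed learning problem means fixing states, actions, rewards, and an update rule. A minimal tabular Q-learning sketch of that framing follows; the hyperparameters and state encoding are illustrative, not the study's actual setup.

```python
# Tic-tac-toe posed as a tabular Q-learning problem; the state is assumed
# to be the board serialized to a hashable value.
import random
from collections import defaultdict

Q = defaultdict(float)                 # (state, action) -> estimated value
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def choose(state, actions):
    """Epsilon-greedy move selection over the legal actions."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """One Q-learning step: move the estimate toward reward plus the
    discounted value of the best follow-up action (0 if the game ended)."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```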
arXiv Detail & Related papers (2022-12-23T10:58:27Z)
- Near-Optimal No-Regret Learning for General Convex Games
We show that near-optimal regret can be obtained for general convex and compact strategy sets.
Our dynamics are an instantiation of optimistic follow-the-regularized-leader over an appropriately lifted space.
Even in those special cases where prior results apply, our algorithm improves over the state-of-the-art regret.
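For reference, the generic optimistic follow-the-regularized-leader update is given below; the paper applies it over a lifted space, while this is the textbook form, with regularizer $R$, step size $\eta$, and a loss prediction $m_t$, commonly the most recent loss:

$$x_t \;=\; \arg\min_{x \in \mathcal{X}} \left\{ \eta \left\langle \sum_{s=1}^{t-1} \ell_s + m_t,\; x \right\rangle + R(x) \right\}, \qquad m_t = \ell_{t-1}.$$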
arXiv Detail & Related papers (2022-06-17T12:58:58Z)
- Strategizing against Learners in Bayesian Games
We study repeated two-player games where one of the players, the learner, employs a no-regret learning strategy.
We consider general Bayesian games, where the payoffs of both players could depend on the type.
arXiv Detail & Related papers (2022-05-17T18:10:25Z)
- The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
We consider the best-of-both-worlds problem for learning an episodic Markov Decision Process through $T$ episodes.
Recent work by [Jin and Luo, 2020] achieves this goal when the fixed transition is known.
In this work, we resolve this open problem by using the same Follow-the-Regularized-Leader (FTRL) framework together with a set of new techniques.
arXiv Detail & Related papers (2021-06-08T05:46:35Z)
- First-Order Problem Solving through Neural MCTS based Reinforcement Learning
Many problems can be described using interpreted FOL statements and can be mapped into a semantic game.
We propose a general framework, Persephone, to map the FOL description of a problem to a semantic game.
Our goal for Persephone is to make it tabula-rasa, mapping a problem stated in interpreted FOL to a solution without human intervention.
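The mapping from interpreted FOL to a game is the classical semantic (Hintikka) game: a Verifier picks witnesses for existentials and disjuncts, a Falsifier picks for universals and conjuncts, and the formula is true iff the Verifier can always win. A minimal sketch over a finite domain follows; the tuple encoding of formulas is hypothetical, and negation is simplified rather than role-swapping.

```python
# Semantic-game evaluation of interpreted FOL over a finite domain.
def verifier_wins(formula, env, domain):
    op = formula[0]
    if op == 'atom':        # ('atom', predicate_on_env)
        return formula[1](env)
    if op == 'not':         # simplified; proper game semantics swaps the roles
        return not verifier_wins(formula[1], env, domain)
    if op == 'and':         # Falsifier picks the conjunct to attack
        return all(verifier_wins(f, env, domain) for f in formula[1:])
    if op == 'or':          # Verifier picks the disjunct to defend
        return any(verifier_wins(f, env, domain) for f in formula[1:])
    if op == 'forall':      # ('forall', var, body): Falsifier picks the value
        return all(verifier_wins(formula[2], {**env, formula[1]: d}, domain)
                   for d in domain)
    if op == 'exists':      # Verifier picks the witness
        return any(verifier_wins(formula[2], {**env, formula[1]: d}, domain)
                   for d in domain)
    raise ValueError(f"unknown connective: {op}")

# Example: "for every x there is a y with y = x + 1 (mod 5)" over {0,...,4}.
phi = ('forall', 'x',
       ('exists', 'y', ('atom', lambda e: e['y'] == (e['x'] + 1) % 5)))
print(verifier_wins(phi, {}, range(5)))  # True
```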
arXiv Detail & Related papers (2021-01-11T19:54:06Z)
- Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games
Solution concepts of traditional game theory assume entirely rational players; therefore, their ability to exploit subrational opponents is limited.
This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games.
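For context, "quantal" standardly refers to the logit quantal response model: instead of best-responding, the subrational opponent mixes actions with probabilities exponentially weighted by payoff,

$$\mathrm{QBR}_\lambda(\pi)(a) \;=\; \frac{\exp\big(\lambda\, u(a,\pi)\big)}{\sum_{a'} \exp\big(\lambda\, u(a',\pi)\big)},$$

where $u(a,\pi)$ is the payoff of action $a$ against the other player's strategy $\pi$, and $\lambda \ge 0$ interpolates between uniform play ($\lambda = 0$) and exact best response ($\lambda \to \infty$).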
arXiv Detail & Related papers (2020-09-30T09:14:56Z)
- Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games
We consider the problem of online learning and its application to solving minimax games.
For the online learning problem, we analyze Follow the Perturbed Leader, which relies on a perturbed oracle that computes the best response.
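Concretely, Follow the Perturbed Leader plays the best response, delegated to an oracle, against the cumulative loss shifted by a random perturbation. In its standard textbook form (not necessarily the paper's exact variant):

$$x_t \;=\; \arg\min_{x \in \mathcal{X}} \;\Big\langle \sum_{s=1}^{t-1} \ell_s - \sigma_t,\; x \Big\rangle,$$

where $\sigma_t$ is the random perturbation and the $\arg\min$ is exactly the best-response oracle call.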
arXiv Detail & Related papers (2020-06-13T02:55:41Z)
- Optimally Deceiving a Learning Leader in Stackelberg Games
Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs under various scenarios of learning interaction between leader and follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z)
- Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures
We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods.
arXiv Detail & Related papers (2020-05-15T09:54:09Z)