Alternating Good-for-MDP Automata
- URL: http://arxiv.org/abs/2205.03243v1
- Date: Fri, 6 May 2022 14:01:47 GMT
- Title: Alternating Good-for-MDP Automata
- Authors: Ernst Moritz Hahn and Mateo Perez and Sven Schewe and Fabio Somenzi
and Ashutosh Trivedi and Dominik Wojtczak
- Abstract summary: We show that it is possible to repair the failed translations from deterministic Rabin automata by using good-for-MDPs (GFM) Büchi automata instead.
A translation to nondeterministic Rabin or Büchi automata comes at an exponential cost, even without requiring the target automaton to be good-for-MDPs.
The surprising answer is that we have to pay significantly less when we instead expand the good-for-MDP property to alternating automata.
- Score: 4.429642479975602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When omega-regular objectives were first proposed in model-free reinforcement
learning (RL) for controlling MDPs, deterministic Rabin automata were used in
an attempt to provide a direct translation from their transitions to scalar
values. While these translations failed, it has turned out that it is possible
to repair them by using good-for-MDPs (GFM) Büchi automata instead. These are
nondeterministic Büchi automata with a restricted type of nondeterminism,
albeit not as restricted as in good-for-games automata. Indeed, deterministic
Rabin automata have a pretty straightforward translation to such GFM automata,
which is bi-linear in the number of states and pairs. Interestingly, the same
cannot be said for deterministic Streett automata: a translation to
nondeterministic Rabin or Büchi automata comes at an exponential cost, even
without requiring the target automaton to be good-for-MDPs. Do we have to pay
more than that to obtain a good-for-MDP automaton? The surprising answer is
that we have to pay significantly less when we instead expand the good-for-MDP
property to alternating automata: like the nondeterministic GFM automata
obtained from deterministic Rabin automata, the alternating good-for-MDP
automata we produce from deterministic Streett automata are bi-linear in the
size of the deterministic automaton and its index, and can therefore be
exponentially more succinct than minimal nondeterministic Büchi automata.
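Since the abstract centers on nondeterministic Büchi automata, here is a small self-contained sketch, purely illustrative and not the paper's construction, of how Büchi acceptance can be checked on ultimately periodic ("lasso") words of the form prefix·cycle^ω:

```python
class BuchiAutomaton:
    """Minimal nondeterministic Buchi automaton (illustrative sketch only).
    delta maps (state, letter) to a set of successor states; a run is
    accepting iff it visits an accepting state infinitely often."""

    def __init__(self, delta, initial, accepting):
        self.delta = delta
        self.initial = initial
        self.accepting = set(accepting)

    def _run(self, states, word):
        # Set of states reachable from `states` by reading `word`.
        for letter in word:
            states = {q for p in states
                      for q in self.delta.get((p, letter), ())}
        return states

    def _cycle_relation(self, cycle):
        # Triples (p, q, f): reading `cycle` once from p can reach q;
        # f records whether an accepting state lies on that path.
        states = {p for (p, _) in self.delta} | {
            q for succ in self.delta.values() for q in succ}
        rel = set()
        for p in states:
            frontier = {(p, p in self.accepting)}
            for letter in cycle:
                frontier = {(q, f or q in self.accepting)
                            for (s, f) in frontier
                            for q in self.delta.get((s, letter), ())}
            rel |= {(p, q, f) for (q, f) in frontier}
        return rel

    def accepts_lasso(self, prefix, cycle):
        """Decide acceptance of prefix . cycle^omega (cycle nonempty)."""
        rel = self._cycle_relation(cycle)
        closure = set(rel)  # paths reading cycle^k for k >= 1
        while True:
            new = {(p, r, f1 or f2)
                   for (p, q1, f1) in closure
                   for (q2, r, f2) in rel if q1 == q2} - closure
            if not new:
                break
            closure |= new
        start = self._run({self.initial}, prefix)
        reachable = start | {q for p in start
                             for (p2, q, _) in closure if p2 == p}
        # Accepted iff some reachable state has an accepting self-cycle.
        return any((q, q, True) in closure for q in reachable)
```

For example, the automaton with transitions `(0,'a')→{0,1}` and `(1,'b')→{1}` and accepting state 1 accepts a·b^ω but rejects a^ω, since only state 1 can be visited infinitely often.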
Related papers
- Simulating Weighted Automata over Sequences and Trees with Transformers [5.078561931628571]
We show that transformers can simulate weighted finite automata (WFAs), a class of models which subsumes DFAs, as well as weighted tree automata (WTAs).
We prove these claims formally and provide upper bounds on the sizes of the transformer models needed as a function of the number of states of the target automata.
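For context, a WFA scores a word as a product of per-letter transition matrices, alpha^T A_{w1} ... A_{wn} beta. This is the standard textbook semantics, not anything specific to that paper; the alphabet and weights below are made up for illustration:

```python
import numpy as np

# Generic WFA semantics: score(w1...wn) = alpha^T A_{w1} ... A_{wn} beta.
alpha = np.array([1.0, 0.0])        # initial weight vector
beta = np.array([0.0, 1.0])         # final weight vector
A = {
    "a": np.array([[0.0, 1.0],
                   [0.0, 0.0]]),    # 'a' moves state 0 -> state 1
    "b": np.array([[1.0, 0.0],
                   [0.0, 1.0]]),    # 'b' keeps the state (identity)
}

def wfa_score(word):
    v = alpha
    for letter in word:
        v = v @ A[letter]           # propagate weights through one letter
    return float(v @ beta)
```

With these weights, `wfa_score("ab")` is 1.0 (the run reaches the final state), while `wfa_score("b")` is 0.0.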
arXiv Detail & Related papers (2024-03-12T21:54:34Z) - AutoMix: Automatically Mixing Language Models [62.51238143437967]
Large language models (LLMs) are now available from cloud API providers in various sizes and configurations.
We present Automix, an approach that strategically routes queries to larger LMs, based on the approximate correctness of outputs from a smaller LM.
arXiv Detail & Related papers (2023-10-19T17:57:39Z) - Deciding Differential Privacy of Online Algorithms with Multiple Variables [0.41248472494152805]
This paper generalizes an automaton model called DiP automata to describe such algorithms.
Our PSPACE algorithm also computes a value for $\mathfrak{D}$ when the given automaton is differentially private.
arXiv Detail & Related papers (2023-09-12T22:03:01Z) - Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval [129.25914272977542]
RetoMaton is a weighted finite automaton built on top of the datastore.
Traversing this automaton at inference time, in parallel to the LM inference, reduces its perplexity.
arXiv Detail & Related papers (2022-01-28T21:38:56Z) - Symbolic Register Automata for Complex Event Recognition and Forecasting [70.7321040534471]
Symbolic Register Automata (SRA) is a combination of symbolic and register automata.
We show how SRA can be used in Complex Event Recognition in order to detect patterns upon streams of events.
We also show how the behavior of SRA, as they consume streams of events, can be given a probabilistic description.
arXiv Detail & Related papers (2021-10-08T11:04:51Z) - On (co-lex) Ordering Automata [2.800608984818919]
We show that a canonical, minimum-width, partially-ordered automaton accepting a language L can be exhibited.
Using H we prove that the width of the language can be effectively computed from the minimum automaton recognizing the language.
arXiv Detail & Related papers (2021-06-04T07:41:58Z) - Induction and Exploitation of Subgoal Automata for Reinforcement Learning [75.55324974788475]
We present ISA, an approach for learning and exploiting subgoals in episodic reinforcement learning (RL) tasks.
ISA interleaves reinforcement learning with the induction of a subgoal automaton, an automaton whose edges are labeled by the task's subgoals.
A subgoal automaton also consists of two special states: a state indicating the successful completion of the task, and a state indicating that the task has finished without succeeding.
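The structure described above can be sketched as a tiny state machine. This is a hypothetical illustration of the idea, not ISA's actual implementation; the state and subgoal names are invented:

```python
class SubgoalAutomaton:
    """Illustrative subgoal automaton: edges are labeled by subgoals,
    plus two special states for success and failure (as described above)."""

    ACCEPT = "accept"   # task completed successfully
    REJECT = "reject"   # task finished without succeeding

    def __init__(self, initial, edges):
        # edges: dict mapping (state, subgoal_label) -> next_state
        self.state = initial
        self.edges = edges

    def observe(self, subgoal):
        """Advance on an observed subgoal; unmatched subgoals self-loop."""
        self.state = self.edges.get((self.state, subgoal), self.state)
        return self.state

    @property
    def done(self):
        return self.state in (self.ACCEPT, self.REJECT)
```

For example, an automaton with edges `(start, got_key) -> has_key` and `(has_key, opened_door) -> accept` only reaches the accepting state when the subgoals occur in the right order.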
arXiv Detail & Related papers (2020-09-08T16:42:55Z) - AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction [75.16836697734995]
We propose a two-stage algorithm called Automatic Feature Interaction Selection (AutoFIS)
AutoFIS can automatically identify important feature interactions for factorization models with computational cost just equivalent to training the target model to convergence.
AutoFIS has been deployed onto the training platform of Huawei App Store recommendation service.
arXiv Detail & Related papers (2020-03-25T06:53:54Z) - Learning Autoencoders with Relational Regularization [89.53065887608088]
A new framework is proposed for learning autoencoders of data distributions.
We minimize the discrepancy between the model and target distributions with a relational regularization.
We implement the framework with two scalable algorithms, making it applicable for both probabilistic and deterministic autoencoders.
arXiv Detail & Related papers (2020-02-07T17:27:30Z) - Reward Shaping for Reinforcement Learning with Omega-Regular Objectives [0.0]
We exploit good-for-MDPs automata for model free reinforcement learning.
The drawback of this translation is that the rewards are, on average, reaped very late.
We devise a new reward shaping approach that overcomes this issue.
arXiv Detail & Related papers (2020-01-16T18:22:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.