Automata Learning of Preferences over Temporal Logic Formulas from Pairwise Comparisons
- URL: http://arxiv.org/abs/2505.18030v1
- Date: Fri, 23 May 2025 15:35:39 GMT
- Title: Automata Learning of Preferences over Temporal Logic Formulas from Pairwise Comparisons
- Authors: Hazhar Rahmani, Jie Fu
- Abstract summary: This paper considers a class of preference inference problems where the user's unknown preference is represented by a preorder. We first show that a preference relation over temporal goals can be modeled by a Preference Deterministic Finite Automaton (PDFA). We develop an algorithm that guarantees to learn, given a characteristic sample, the minimal PDFA equivalent to the true PDFA from which the sample is drawn.
- Score: 28.920090391513
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many preference elicitation algorithms consider preferences over propositional logic formulas or items with different attributes. In sequential decision making, a user's preference can be a preorder over possible outcomes, each of which is a temporal sequence of events. This paper considers a class of preference inference problems where the user's unknown preference is represented by a preorder over regular languages (sets of temporal sequences), referred to as temporal goals. Given a finite set of pairwise comparisons between finite words, the objective is to learn both the set of temporal goals and the preorder over these goals. We first show that a preference relation over temporal goals can be modeled by a Preference Deterministic Finite Automaton (PDFA), which is a deterministic finite automaton augmented with a preorder over acceptance conditions. The problem of preference inference thus reduces to learning the PDFA. This problem is shown to be computationally challenging: deciding whether there exists a PDFA of size smaller than a given integer $k$ that is consistent with the sample is NP-complete. We formalize the properties of characteristic samples and develop an algorithm that is guaranteed to learn, given a characteristic sample, the minimal PDFA equivalent to the true PDFA from which the sample is drawn. We present the method through a running example and provide a detailed analysis using a robotic motion planning problem.
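The abstract's central object is the PDFA: a DFA whose acceptance conditions are ordered by a preorder, so that comparing two finite words reduces to comparing the classes they reach. The Python sketch below is a minimal illustration of this modeling idea under assumed names (the `PDFA` class and the `prefers` and `consistent` helpers are hypothetical, not the paper's implementation); it shows how a candidate PDFA can be checked for consistency against a sample of pairwise comparisons.

```python
# Minimal sketch of a Preference DFA (PDFA): a DFA whose acceptance classes
# are related by a preorder. Names and encoding are illustrative assumptions,
# not the paper's implementation.

class PDFA:
    def __init__(self, states, alphabet, delta, start, state_class, preorder):
        self.states = states            # finite set of states
        self.alphabet = alphabet        # input alphabet
        self.delta = delta              # dict: (state, symbol) -> state
        self.start = start              # initial state
        self.state_class = state_class  # dict: state -> acceptance class
        self.preorder = preorder        # set of (c1, c2) pairs meaning c1 <= c2

    def run(self, word):
        """Return the acceptance class reached by a finite word."""
        q = self.start
        for a in word:
            q = self.delta[(q, a)]
        return self.state_class[q]

    def prefers(self, w1, w2):
        """True if the class reached by w1 is at least as preferred as w2's."""
        return (self.run(w2), self.run(w1)) in self.preorder


def consistent(pdfa, comparisons):
    """Check a candidate PDFA against a sample of pairwise comparisons.
    Each comparison is a pair (better_word, worse_word)."""
    return all(pdfa.prefers(better, worse) for better, worse in comparisons)


# Toy automaton over {a, b}: words ending in 'a' are preferred to words
# ending in 'b'; the empty word sits in its own, incomparable class.
delta = {("q0", "a"): "qa", ("q0", "b"): "qb",
         ("qa", "a"): "qa", ("qa", "b"): "qb",
         ("qb", "a"): "qa", ("qb", "b"): "qb"}
pdfa = PDFA(
    states={"q0", "qa", "qb"}, alphabet={"a", "b"}, delta=delta, start="q0",
    state_class={"q0": "C0", "qa": "C_a", "qb": "C_b"},
    preorder={("C0", "C0"), ("C_a", "C_a"), ("C_b", "C_b"), ("C_b", "C_a")},
)
print(consistent(pdfa, [("aba", "aab"), ("ba", "bb")]))  # True
```

In this toy example every word ending in 'a' is preferred to every word ending in 'b', so both sample comparisons are consistent with the automaton; learning, as in the paper, goes in the other direction, from such comparisons to a minimal PDFA.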
Related papers
- Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection [37.54564513506548]
Generative Language Models rely on autoregressive decoding to produce the output sequence token by token. We evaluate a comprehensive collection of decoding-free candidate selection approaches on a broad set of tasks.
arXiv Detail & Related papers (2025-01-28T23:21:28Z) - FastGAS: Fast Graph-based Annotation Selection for In-Context Learning [53.17606395275021]
In-context learning (ICL) empowers large language models (LLMs) to tackle new tasks by using a series of training instances as prompts.
Existing methods have proposed to select a subset of unlabeled examples for annotation.
We propose a graph-based selection method, FastGAS, designed to efficiently identify high-quality instances.
arXiv Detail & Related papers (2024-06-06T04:05:54Z) - Preference-Based Planning in Stochastic Environments: From Partially-Ordered Temporal Goals to Most Preferred Policies [25.731912021122287]
We consider systems modeled as Markov decision processes, given a partially ordered preference over a set of temporally extended goals.
To plan with the partially ordered preference, we introduce order theory to map a preference over temporal goals to a preference over policies for the MDP.
A most preferred policy under an ordering induces a nondominated probability distribution over the finite paths in the MDP.
arXiv Detail & Related papers (2024-03-27T02:46:09Z) - Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach [10.293894471295205]
We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments. We present novel and simple algorithms for both learning goals.
arXiv Detail & Related papers (2023-07-13T05:05:30Z) - Probabilistic Planning with Prioritized Preferences over Temporal Logic Objectives [26.180359884973566]
We study temporal planning in probabilistic environments, modeled as labeled Markov decision processes (MDPs).
This paper introduces a new specification language, termed prioritized qualitative choice linear temporal logic on finite traces.
We formulate and solve a problem of computing an optimal policy that minimizes the expected score of dissatisfaction given user preferences.
arXiv Detail & Related papers (2023-04-23T13:03:27Z) - Probabilistic Planning with Partially Ordered Preferences over Temporal Goals [22.77805882908817]
We study planning in Markov decision processes (MDPs) with preferences over temporally extended goals.
We introduce a variant of deterministic finite automaton, referred to as a preference DFA, for specifying the user's preferences over temporally extended goals.
We prove that a weak-stochastic nondominated policy given the preference specification is optimal in the constructed multi-objective MDP.
arXiv Detail & Related papers (2022-09-25T17:13:24Z) - Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z) - Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov decision processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Non-Adaptive Adaptive Sampling on Turnstile Streams [57.619901304728366]
We give the first relative-error algorithms for column subset selection, subspace approximation, projective clustering, and volume maximization on turnstile streams that use space sublinear in $n$.
Our adaptive sampling procedure has a number of applications to various data summarization problems that either improve state-of-the-art or have only been previously studied in the more relaxed row-arrival model.
arXiv Detail & Related papers (2020-04-23T05:00:21Z) - Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equal workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities; a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2020-02-26T16:19:09Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
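The graph-based least-squares entry above ranks objects from noisy pairwise comparisons, which connects naturally to the pairwise-comparison samples used in the main paper. The sketch below is a generic illustration of that least-squares idea, not the authors' algorithm: it assumes each comparison comes with a noisy quality gap and solves the resulting over-determined linear system with NumPy (the function name and data format are illustrative assumptions).

```python
import numpy as np


def least_squares_ranking(n, comparisons):
    """Estimate item qualities from noisy pairwise comparisons.

    comparisons: list of (i, j, y) meaning item i beat item j with an
    observed (noisy) quality gap y. Solves min_q sum (q_i - q_j - y)^2,
    a generic graph least-squares sketch rather than the cited paper's
    exact algorithm. Qualities are identifiable only up to an additive
    constant, so the result is centered to mean zero.
    """
    # Incidence-style design matrix: one row per comparison.
    A = np.zeros((len(comparisons), n))
    y = np.zeros(len(comparisons))
    for r, (i, j, gap) in enumerate(comparisons):
        A[r, i] = 1.0
        A[r, j] = -1.0
        y[r] = gap
    # lstsq handles the rank deficiency (columns of A sum to zero).
    q, *_ = np.linalg.lstsq(A, y, rcond=None)
    return q - q.mean()


# Toy usage: 3 items, item 0 strongest, item 2 weakest.
comps = [(0, 1, 1.1), (1, 2, 0.9), (0, 2, 2.0)]
print(least_squares_ranking(3, comps))  # roughly [ 1.0, 0.0, -1.0 ]
```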
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.