Probabilistic Planning with Preferences over Temporal Goals
- URL: http://arxiv.org/abs/2103.14489v1
- Date: Fri, 26 Mar 2021 14:26:40 GMT
- Title: Probabilistic Planning with Preferences over Temporal Goals
- Authors: Jie Fu
- Abstract summary: We present a formal language for specifying qualitative preferences over temporal goals and a preference-based planning method in stochastic systems.
Using automata-theoretic modeling, the proposed specification allows us to express preferences over different sets of outcomes, where each outcome describes a set of temporal sequences of subgoals.
We define the value of preference satisfaction given a stochastic process over possible outcomes and develop an algorithm for time-constrained probabilistic planning in labeled Markov decision processes.
- Score: 21.35365462532568
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a formal language for specifying qualitative preferences over
temporal goals and a preference-based planning method in stochastic systems.
Using automata-theoretic modeling, the proposed specification allows us to
express preferences over different sets of outcomes, where each outcome
describes a set of temporal sequences of subgoals. We define the value of
preference satisfaction given a stochastic process over possible outcomes and
develop an algorithm for time-constrained probabilistic planning in labeled
Markov decision processes where an agent aims to maximally satisfy its
preference formula within a pre-defined finite time duration. We present
experimental results using a stochastic gridworld example and discuss possible
extensions of the proposed preference model.
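The time-constrained planning problem described in the abstract can be illustrated with a minimal sketch: finite-horizon value iteration on the product of a small labeled MDP and a DFA that tracks subgoal progress, maximizing the probability of satisfying "g1 then g2" within T steps. The model, labels, and DFA below are toy assumptions for illustration, not taken from the paper.

```python
# Toy labeled MDP: P[s][a] maps successor state -> probability
P = {
    0: {'a': {0: 0.2, 1: 0.8}, 'b': {0: 1.0}},
    1: {'a': {1: 0.3, 2: 0.7}, 'b': {0: 1.0}},
    2: {'a': {2: 1.0},         'b': {2: 1.0}},
}
LABEL = {0: None, 1: 'g1', 2: 'g2'}   # atomic proposition observed in each state

# DFA for "eventually g1, then eventually g2"; state 2 is accepting
DFA = {(0, 'g1'): 1, (1, 'g2'): 2}

def step(q, s):
    """Advance the DFA on the label of MDP state s (self-loop if undefined)."""
    return DFA.get((q, LABEL[s]), q)

def plan(T):
    """Backward induction: V[s, q] = max probability of acceptance within T steps."""
    V = {(s, q): 1.0 if q == 2 else 0.0 for s in P for q in range(3)}
    for _ in range(T):
        V = {(s, q): 1.0 if q == 2 else
                     max(sum(pr * V[s2, step(q, s2)]
                             for s2, pr in P[s][a].items())
                         for a in P[s])
             for s in P for q in range(3)}
    return V

print(round(plan(2)[0, 0], 3))   # → 0.56: two steps from (s0, q0)
```

The product construction reduces the temporal goal to a reachability objective on accepting product states; the pre-defined finite time duration from the abstract corresponds to the horizon T in the backward induction.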
Related papers
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
- Preference-Based Planning in Stochastic Environments: From Partially-Ordered Temporal Goals to Most Preferred Policies [25.731912021122287]
We consider systems modeled as Markov decision processes, given a partially ordered preference over a set of temporally extended goals.
To plan with the partially ordered preference, we introduce order theory to map a preference over temporal goals to a preference over policies for the MDP.
A most preferred policy under an ordering induces a nondominated probability distribution over the finite paths in the MDP.
arXiv Detail & Related papers (2024-03-27T02:46:09Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
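The likelihood-ratio construction above can be sketched for a Bernoulli mean: track the running likelihood ratio of a predictable plug-in estimator against each candidate parameter, and keep the candidates whose ratio stays below 1/alpha. The grid, the Laplace smoothing rule, and alpha are illustrative choices here, not the paper's exact construction.

```python
import math

def lr_confidence_set(xs, grid, alpha=0.05):
    """Return {p in grid : plug-in likelihood / likelihood under p < 1/alpha}."""
    log_num = 0.0                       # log-likelihood of the plug-in sequence
    log_den = {p: 0.0 for p in grid}    # log-likelihood under each candidate p
    ones = 0
    for i, x in enumerate(xs):
        q = (ones + 1) / (i + 2)        # Laplace-smoothed mean of past data only
        log_num += math.log(q if x else 1 - q)
        for p in grid:
            log_den[p] += math.log(p if x else 1 - p)
        ones += x
    # Ville's inequality: the true p is excluded with probability at most alpha
    return [p for p in grid if log_num - log_den[p] < math.log(1 / alpha)]

xs = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1] * 10      # 100 draws, empirical mean 0.7
grid = [i / 20 for i in range(1, 20)]
cs = lr_confidence_set(xs, grid)
print(cs)   # candidates near 0.7 survive; extreme values are rejected
```

Because the plug-in estimator uses only past observations, the ratio is a nonnegative supermartingale under the true parameter, which is what makes the set valid at every time simultaneously.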
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z)
- Probabilistic Planning with Prioritized Preferences over Temporal Logic Objectives [26.180359884973566]
We study temporal planning in probabilistic environments, modeled as labeled Markov decision processes (MDPs).
This paper introduces a new specification language, termed prioritized qualitative choice linear temporal logic on finite traces.
We formulate and solve a problem of computing an optimal policy that minimizes the expected score of dissatisfaction given user preferences.
arXiv Detail & Related papers (2023-04-23T13:03:27Z)
- Probabilistic Planning with Partially Ordered Preferences over Temporal Goals [22.77805882908817]
We study planning in Markov decision processes (MDPs) with preferences over temporally extended goals.
We introduce a variant of deterministic finite automaton, referred to as a preference DFA, for specifying the user's preferences over temporally extended goals.
We prove that a weak-stochastic nondominated policy given the preference specification is optimal in the constructed multi-objective MDP.
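The idea of a preference DFA can be illustrated with a toy sketch: run a DFA over a finite trace of subgoal labels, and rank the outcome by the DFA state it ends in. The transitions and ranks below are illustrative assumptions in the spirit of that construction, not the paper's definition.

```python
# Toy "preference DFA": transitions on subgoal labels; unlisted pairs self-loop
PREF_DFA = {
    (0, 'g1'): 1,   # achieved g1 first
    (0, 'g2'): 2,   # achieved g2 first (less preferred)
    (1, 'g2'): 3,   # g1 then g2: best outcome
}
RANK = {3: 0, 1: 1, 2: 2, 0: 3}   # preference rank of each final DFA state (0 = best)

def rank_of(trace):
    """Run the DFA on a finite label trace and return the outcome's rank."""
    q = 0
    for sym in trace:
        q = PREF_DFA.get((q, sym), q)
    return RANK[q]

def prefer(trace_a, trace_b):
    """True iff trace_a is strictly preferred to trace_b."""
    return rank_of(trace_a) < rank_of(trace_b)

print(prefer(['g1', 'g2'], ['g2']))   # → True: g1-then-g2 beats g2 alone
```

Composing such a DFA with an MDP turns each rank into one objective of a multi-objective MDP, over which nondominated (Pareto) policies can then be computed.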
arXiv Detail & Related papers (2022-09-25T17:13:24Z)
- Probabilistic Conformal Prediction Using Conditional Random Samples [73.26753677005331]
PCP is a predictive inference algorithm that estimates a target variable by a discontinuous predictive set.
It is efficient and compatible with either explicit or implicit conditional generative models.
arXiv Detail & Related papers (2022-06-14T03:58:03Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
- Sequential Learning-based IaaS Composition [0.11470070927586014]
Decision variables are included in the temporal conditional preference networks (TempCP-net).
The global preference ranking of a set of requests is computed using a k-d tree indexing based temporal similarity measure approach.
We design an on-policy based sequential selection learning approach that uses the length of a request to accept or reject requests in a composition.
arXiv Detail & Related papers (2021-02-24T23:16:01Z)
- Adaptive Sequential Design for a Single Time-Series [2.578242050187029]
We learn an optimal, unknown choice of the controlled components of a design in order to optimize the expected outcome.
We adapt the randomization mechanism for future time-point experiments based on the data collected on the individual over time.
arXiv Detail & Related papers (2021-01-29T22:51:45Z)
- Stochastic batch size for adaptive regularization in deep network optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization applicable to machine learning problems in deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
arXiv Detail & Related papers (2020-04-14T07:54:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.