Qualitative Analysis of $ω$-Regular Objectives on Robust MDPs
- URL: http://arxiv.org/abs/2505.04539v1
- Date: Wed, 07 May 2025 16:15:40 GMT
- Title: Qualitative Analysis of $ω$-Regular Objectives on Robust MDPs
- Authors: Ali Asadi, Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Mehrdad Karrabi, Ali Shafiee,
- Abstract summary: We study the qualitative problem for reachability and parity objectives on Robust Markov Decision Processes.<n>We first present efficient algorithms with oracle access to uncertainty sets that solve these problems.<n>We then report experimental results demonstrating the effectiveness of our oracle-based approach.
- Score: 5.129009542652635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust Markov Decision Processes (RMDPs) generalize classical MDPs that consider uncertainties in transition probabilities by defining a set of possible transition functions. An objective is a set of runs (or infinite trajectories) of the RMDP, and the value for an objective is the maximal probability that the agent can guarantee against the adversarial environment. We consider (a) reachability objectives, where given a target set of states, the goal is to eventually arrive at one of them; and (b) parity objectives, which are a canonical representation for $\omega$-regular objectives. The qualitative analysis problem asks whether the objective can be ensured with probability 1. In this work, we study the qualitative problem for reachability and parity objectives on RMDPs without making any assumption over the structures of the RMDPs, e.g., unichain or aperiodic. Our contributions are twofold. We first present efficient algorithms with oracle access to uncertainty sets that solve qualitative problems of reachability and parity objectives. We then report experimental results demonstrating the effectiveness of our oracle-based approach on classical RMDP examples from the literature scaling up to thousands of states.
Related papers
- Probabilistic Active Goal Recognition [12.583886275683302]
We adopt a probabilistic framework for Active Goal Recognition.<n>We propose an integrated solution that combines a joint belief update mechanism with a Monte Carlo Tree Search algorithm.<n>We show that our joint belief update significantly outperforms passive goal recognition.
arXiv Detail & Related papers (2025-07-29T14:22:29Z) - Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time [52.230936493691985]
We propose SITAlign, an inference framework that addresses the multifaceted nature of alignment by maximizing a primary objective while satisfying threshold-based constraints on secondary criteria.<n>We provide theoretical insights by deriving sub-optimality bounds of our satisficing based inference alignment approach.
arXiv Detail & Related papers (2025-05-29T17:56:05Z) - Towards Measuring Goal-Directedness in AI Systems [0.0]
A key prerequisite for AI systems pursuing unintended goals is whether they will behave in a coherent and goal-directed manner.
We propose a new family of definitions of the goal-directedness of a policy that analyze whether it is well-modeled as near-optimal for many reward functions.
Our contribution is a definition of goal-directedness that is simpler and more easily computable in order to approach the question of whether AI systems could pursue dangerous goals.
arXiv Detail & Related papers (2024-10-07T01:34:42Z) - Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives [16.101435842520473]
This paper studies the challenging yet important problem in POMDPs known as the (indefinite-horizon) Maximal Reachability Probability Problem.
Inspired by the success of point-based methods developed for discounted problems, we study their extensions to MRPP.
We present a novel algorithm that leverages the strengths of these techniques for efficient exploration of the belief space.
arXiv Detail & Related papers (2024-06-05T02:33:50Z) - Learning Algorithms for Verification of Markov Decision Processes [20.5951492453299]
We present a general framework for applying learning algorithms to the verification of Markov decision processes (MDPs)<n>The presented framework focuses on probabilistic reachability, which is a core problem in verification.
arXiv Detail & Related papers (2024-03-14T08:54:19Z) - Provable Offline Preference-Based Reinforcement Learning [95.00042541409901]
We investigate the problem of offline Preference-based Reinforcement Learning (PbRL) with human feedback.
We consider the general reward setting where the reward can be defined over the whole trajectory.
We introduce a new single-policy concentrability coefficient, which can be upper bounded by the per-trajectory concentrability.
arXiv Detail & Related papers (2023-05-24T07:11:26Z) - Discrete Factorial Representations as an Abstraction for Goal
Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally prove the expected return on out-of-distribution goals, while still allowing for specifying goals with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z) - Alleviating Search Bias in Bayesian Evolutionary Optimization with Many
Heterogeneous Objectives [9.139734850798124]
We deal with multi-objective optimization problems with heterogeneous objectives (HE-MOPs)
A new acquisition function that mitigates search bias towards the fast objectives is suggested.
We demonstrate the effectiveness of the proposed algorithm by testing it on widely used multi-/many-objective benchmark problems.
arXiv Detail & Related papers (2022-08-25T17:07:40Z) - Generative multitask learning mitigates target-causing confounding [61.21582323566118]
We propose a simple and scalable approach to causal representation learning for multitask learning.
The improvement comes from mitigating unobserved confounders that cause the targets, but not the input.
Our results on the Attributes of People and Taskonomy datasets reflect the conceptual improvement in robustness to prior probability shift.
arXiv Detail & Related papers (2022-02-08T20:42:14Z) - Adversarial Intrinsic Motivation for Reinforcement Learning [60.322878138199364]
We investigate whether the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution can be utilized effectively for reinforcement learning tasks.
Our approach, termed Adversarial Intrinsic Motivation (AIM), estimates this Wasserstein-1 distance through its dual objective and uses it to compute a supplemental reward function.
arXiv Detail & Related papers (2021-05-27T17:51:34Z) - Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.