Robust Counterfactual Inference in Markov Decision Processes
- URL: http://arxiv.org/abs/2502.13731v1
- Date: Wed, 19 Feb 2025 13:56:20 GMT
- Title: Robust Counterfactual Inference in Markov Decision Processes
- Authors: Jessica Lally, Milad Kazemi, Nicola Paoletti,
- Abstract summary: Current approaches assume a specific causal model to make counterfactuals identifiable.
We propose a novel non-parametric approach that computes tight bounds on counterfactual transition probabilities.
- Score: 1.5197843979051473
- License:
- Abstract: This paper addresses a key limitation in existing counterfactual inference methods for Markov Decision Processes (MDPs). Current approaches assume a specific causal model to make counterfactuals identifiable. However, there are usually many causal models that align with the observational and interventional distributions of an MDP, each yielding different counterfactual distributions, so fixing a particular causal model limits the validity (and usefulness) of counterfactual inference. We propose a novel non-parametric approach that computes tight bounds on counterfactual transition probabilities across all compatible causal models. Unlike previous methods that require solving prohibitively large optimisation problems (with variables that grow exponentially in the size of the MDP), our approach provides closed-form expressions for these bounds, making computation highly efficient and scalable for non-trivial MDPs. Once such an interval counterfactual MDP is constructed, our method identifies robust counterfactual policies that optimise the worst-case reward w.r.t. the uncertain interval MDP probabilities. We evaluate our method on various case studies, demonstrating improved robustness over existing methods.
Related papers
- Solving Robust Markov Decision Processes: Generic, Reliable, Efficient [3.789219860006095]
Markov decision processes (MDP) are a well-established model for sequential decision-making in the presence of probabilities.
We provide a framework for solving RMDPs that is generic, reliable, and efficient.
Our prototype implementation outperforms existing tools by several orders of magnitude.
arXiv Detail & Related papers (2024-12-13T14:55:48Z) - Provably Efficient UCB-type Algorithms For Learning Predictive State
Representations [55.00359893021461]
The sequential decision-making problem is statistically learnable if it admits a low-rank structure modeled by predictive state representations (PSRs)
This paper proposes the first known UCB-type approach for PSRs, featuring a novel bonus term that upper bounds the total variation distance between the estimated and true models.
In contrast to existing approaches for PSRs, our UCB-type algorithms enjoy computational tractability, last-iterate guaranteed near-optimal policy, and guaranteed model accuracy.
arXiv Detail & Related papers (2023-07-01T18:35:21Z) - Robust Markov Decision Processes without Model Estimation [32.16801929347098]
There are two major barriers to applying robust MDPs in practice.
First, most works study robust MDPs in a model-based regime.
Second, prior work typically assumes a strong oracle to obtain the optimal solution.
arXiv Detail & Related papers (2023-02-02T17:29:10Z) - Robust Anytime Learning of Markov Decision Processes [8.799182983019557]
In data-driven applications, deriving precise probabilities from limited data introduces statistical errors.
Uncertain MDPs (uMDPs) do not require precise probabilities but instead use so-called uncertainty sets in the transitions.
We propose a robust anytime-learning approach that combines a dedicated Bayesian inference scheme with the computation of robust policies.
arXiv Detail & Related papers (2022-05-31T14:29:55Z) - Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z) - Adversarial Robustness Verification and Attack Synthesis in Stochastic
Systems [8.833548357664606]
We develop a formal framework for adversarial robustness in systems defined as discrete time Markov chains (DTMCs)
We outline a class of threat models under which adversaries can perturb system transitions, constrained by an $varepsilon$ ball around the original transition probabilities.
arXiv Detail & Related papers (2021-10-05T15:52:47Z) - Identification of Unexpected Decisions in Partially Observable
Monte-Carlo Planning: a Rule-Based Approach [78.05638156687343]
We propose a methodology for analyzing POMCP policies by inspecting their traces.
The proposed method explores local properties of policy behavior to identify unexpected decisions.
We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to mobile robot navigation.
arXiv Detail & Related papers (2020-12-23T15:09:28Z) - Efficient semidefinite-programming-based inference for binary and
multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Polynomial-Time Exact MAP Inference on Discrete Models with Global
Dependencies [83.05591911173332]
junction tree algorithm is the most general solution for exact MAP inference with run-time guarantees.
We propose a new graph transformation technique via node cloning which ensures a run-time for solving our target problem independently of the form of a corresponding clique tree.
arXiv Detail & Related papers (2019-12-27T13:30:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.