Will My Robot Achieve My Goals? Predicting the Probability that an MDP Policy Reaches a User-Specified Behavior Target
- URL: http://arxiv.org/abs/2211.16462v2
- Date: Tue, 2 Apr 2024 21:15:23 GMT
- Title: Will My Robot Achieve My Goals? Predicting the Probability that an MDP Policy Reaches a User-Specified Behavior Target
- Authors: Alexander Guyer, Thomas G. Dietterich
- Abstract summary: As an autonomous system performs a task, it should maintain a calibrated estimate of the probability that it will achieve the user's goal.
This paper considers settings where the user's goal is specified as a target interval for a real-valued performance summary.
We compute the probability estimates by inverting conformal prediction.
- Score: 56.99669411766284
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As an autonomous system performs a task, it should maintain a calibrated estimate of the probability that it will achieve the user's goal. If that probability falls below some desired level, it should alert the user so that appropriate interventions can be made. This paper considers settings where the user's goal is specified as a target interval for a real-valued performance summary, such as the cumulative reward, measured at a fixed horizon $H$. At each time $t \in \{0, \ldots, H-1\}$, our method produces a calibrated estimate of the probability that the final cumulative reward will fall within a user-specified target interval $[y^-,y^+].$ Using this estimate, the autonomous system can raise an alarm if the probability drops below a specified threshold. We compute the probability estimates by inverting conformal prediction. Our starting point is the Conformalized Quantile Regression (CQR) method of Romano et al., which applies split-conformal prediction to the results of quantile regression. CQR is not invertible, but by using the conditional cumulative distribution function (CDF) as the non-conformity measure, we show how to obtain an invertible modification that we call Probability-space Conformalized Quantile Regression (PCQR). Like CQR, PCQR produces well-calibrated conditional prediction intervals with finite-sample marginal guarantees. By inverting PCQR, we obtain guarantees for the probability that the cumulative reward of an autonomous system will fall below a threshold sampled from the marginal distribution of the response variable (i.e., a calibrated CDF estimate) that we employ to predict coverage probabilities for user-specified target intervals. Experiments on two domains confirm that these probabilities are well-calibrated.
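To make the workflow concrete, the sketch below illustrates the kind of computation the abstract describes: a conditional-CDF estimate is recalibrated on a held-out calibration split, and the calibrated CDF is then used to read off the probability that the final return lands in a user-specified target interval $[y^-,y^+]$. The `cond_cdf` model, the variable names, and the synthetic data are hypothetical stand-ins chosen for illustration; this is a sketch of the general recalibration idea, not the authors' PCQR implementation.

```python
# Minimal sketch (not the paper's PCQR code): recalibrate a conditional-CDF
# estimate with a split calibration set, then report the probability that the
# response falls in a user-specified target interval [y_lo, y_hi].
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def cond_cdf(x, y):
    # Hypothetical (deliberately miscalibrated) model of P(Y <= y | x):
    # it assumes sd = 1.5, while the data below are generated with sd = 1.0.
    return norm.cdf(y, loc=x, scale=1.5)

# Held-out calibration pairs (state summary, final cumulative reward).
x_cal = rng.uniform(-2.0, 2.0, size=2000)
y_cal = x_cal + rng.normal(0.0, 1.0, size=2000)

# Probability-integral-transform (PIT) values of the calibration responses.
pit = np.sort(cond_cdf(x_cal, y_cal))

def calibrated_cdf(x, y):
    # Recalibrate by mapping the raw CDF value through the empirical CDF of
    # the calibration PIT values (with the usual +1 split-conformal smoothing).
    u = cond_cdf(x, y)
    return (np.searchsorted(pit, u, side="right") + 1) / (len(pit) + 1)

def target_interval_probability(x, y_lo, y_hi):
    # Calibrated estimate of P(y_lo <= Y <= y_hi | x); an autonomous system
    # would raise an alarm whenever this drops below a chosen threshold.
    return calibrated_cdf(x, y_hi) - calibrated_cdf(x, y_lo)

print(target_interval_probability(x=0.5, y_lo=-0.5, y_hi=1.5))
```

The empirical-CDF mapping of the PIT values is what supplies the calibration: on the calibration distribution, the recalibrated CDF values are approximately uniform, so thresholding the estimated interval probability behaves like thresholding a well-calibrated probability.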
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs (a generic sketch of conformalized quantile regression appears after this list).
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
- Verifiably Robust Conformal Prediction [1.391198481393699]
This paper introduces VRCP (Verifiably Robust Conformal Prediction), a new framework that leverages neural network verification methods to recover coverage guarantees under adversarial attacks.
Our method is the first to support perturbations bounded by arbitrary norms including $\ell_1$, $\ell_2$, and $\ell_\infty$, as well as regression tasks.
In every case, VRCP achieves above-nominal coverage and yields significantly more efficient and informative prediction regions than the state of the art.
arXiv Detail & Related papers (2024-05-29T09:50:43Z)
- Equal Opportunity of Coverage in Fair Regression [50.76908018786335]
We study fair machine learning (ML) under predictive uncertainty to enable reliable and trustworthy decision-making.
We propose Equal Opportunity of Coverage (EOC), which aims to achieve two properties: (1) coverage rates for different groups with similar outcomes are close, and (2) the coverage rate for the entire population remains at a predetermined level.
arXiv Detail & Related papers (2023-11-03T21:19:59Z)
- PAC Prediction Sets Under Label Shift [52.30074177997787]
Prediction sets capture uncertainty by predicting sets of labels rather than individual labels.
We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.
We evaluate our approach on five datasets.
arXiv Detail & Related papers (2023-10-19T17:57:57Z)
- Integrating Uncertainty Awareness into Conformalized Quantile Regression [12.875863572064986]
We propose a new variant of the Conformalized Quantile Regression (CQR) methodology to adjust quantile regressors differentially across the feature space.
Compared to CQR, our methods enjoy the same distribution-free theoretical coverage guarantees, while demonstrating stronger conditional coverage properties in simulated settings and real-world data sets alike.
arXiv Detail & Related papers (2023-06-14T18:28:53Z)
- Post-selection Inference for Conformal Prediction: Trading off Coverage for Precision [0.0]
Traditionally, conformal prediction inference requires a data-independent specification of the miscoverage level.
We develop simultaneous conformal inference to account for data-dependent miscoverage levels.
arXiv Detail & Related papers (2023-04-12T20:56:43Z)
- Conformal Prediction Intervals for Markov Decision Process Trajectories [10.68332392039368]
This paper provides conformal prediction intervals over the future behavior of an autonomous system executing a fixed control policy on a Markov Decision Process (MDP).
The method is illustrated on MDPs for invasive species management and StarCraft2 battles.
arXiv Detail & Related papers (2022-06-10T03:43:53Z)
- Conditionally Calibrated Predictive Distributions by Probability-Probability Map: Application to Galaxy Redshift Estimation and Probabilistic Forecasting [4.186140302617659]
Uncertainty quantification is crucial for assessing the predictive ability of AI algorithms.
We propose Cal-PIT, a method that addresses both predictive distribution (PD) diagnostics and recalibration.
We benchmark our corrected prediction bands against oracle bands and state-of-the-art predictive inference algorithms.
arXiv Detail & Related papers (2022-05-29T03:52:44Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
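Several of the entries above, like the main paper, build on split-conformalized quantile regression; the pointer in the Relaxed Quantile Regression entry refers to the generic sketch below. It follows the standard CQR recipe of Romano et al. (fit lower and upper quantile regressors on a training split, score a calibration split, widen the band by the corrected empirical quantile of the scores), but the choice of regressor, the variable names, and the synthetic data are illustrative rather than taken from any of the listed papers.

```python
# Generic split-Conformalized Quantile Regression (CQR) sketch, for reference.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic data with heteroscedastic noise.
x = rng.uniform(-2.0, 2.0, size=(3000, 1))
y = x[:, 0] + rng.normal(0.0, 0.5 + 0.3 * np.abs(x[:, 0]))

# Proper training split and calibration split.
x_train, y_train = x[:2000], y[:2000]
x_cal, y_cal = x[2000:], y[2000:]

alpha = 0.1  # target miscoverage (90% intervals)

# Lower and upper quantile regressors fit on the training split only.
lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(x_train, y_train)
hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(x_train, y_train)

# Conformity scores: how far each calibration response falls outside the band.
scores = np.maximum(lo.predict(x_cal) - y_cal, y_cal - hi.predict(x_cal))

# Finite-sample-corrected empirical quantile of the scores.
n = len(scores)
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n), method="higher")

def cqr_interval(x_new):
    """Calibrated (1 - alpha) prediction interval for a new input."""
    x_new = np.asarray(x_new, dtype=float).reshape(1, -1)
    return lo.predict(x_new)[0] - q, hi.predict(x_new)[0] + q

print(cqr_interval([0.5]))
```

The correction `q` is what upgrades the raw quantile band to one with a finite-sample marginal coverage guarantee; the main paper's PCQR variant performs the analogous adjustment in probability space so that the construction can be inverted into a calibrated CDF estimate.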
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.