Holdouts set for safe predictive model updating
- URL: http://arxiv.org/abs/2202.06374v5
- Date: Thu, 19 Dec 2024 10:12:00 GMT
- Title: Holdouts set for safe predictive model updating
- Authors: Sami Haidar-Wehbe, Samuel R Emerson, Louis J M Aslett, James Liley
- Abstract summary: We propose using a 'holdout set': a subset of the population that does not receive interventions guided by the risk score.
We show that, in order to minimise the number of pre-eclampsia cases over time, this is best achieved using a holdout set of around 10,000 individuals.
- Score: 0.4499833362998489
- Abstract: Predictive risk scores for adverse outcomes are increasingly crucial in guiding health interventions. Such scores may need to be periodically updated due to change in the distributions they model. However, directly updating risk scores used to guide intervention can lead to biased risk estimates. To address this, we propose updating using a `holdout set' - a subset of the population that does not receive interventions guided by the risk score. Balancing the holdout set size is essential to ensure good performance of the updated risk score whilst minimising the number of held out samples. We prove that this approach reduces adverse outcome frequency to an asymptotically optimal level and argue that often there is no competitive alternative. We describe conditions under which an optimal holdout size (OHS) can be readily identified, and introduce parametric and semi-parametric algorithms for OHS estimation. We apply our methods to the ASPRE risk score for pre-eclampsia to recommend a plan for updating it in the presence of change in the underlying data distribution. We show that, in order to minimise the number of pre-eclampsia cases over time, this is best achieved using a holdout set of around 10,000 individuals.
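The trade-off described in the abstract (a larger holdout set gives a better updated score but leaves more people without score-guided intervention) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the abstract: the cost form `k1 * n + k2(n) * (N - n)`, the power-law learning curve `k2(n) = a * n**(-b) + c`, the function names, and all parameter values. It is not the authors' OHS estimation algorithm, and the numbers have no connection to ASPRE.

```python
def total_cost(n, N, k1, a, b, c):
    """Expected total cost when n of N individuals are held out.

    Assumed (hypothetical) model:
      - each held-out individual incurs a constant expected cost k1;
      - each remaining individual incurs k2(n) = a * n**(-b) + c, which
        falls as the holdout set (training data for the update) grows.
    """
    k2 = a * n ** (-b) + c
    return k1 * n + k2 * (N - n)


def optimal_holdout_size(N, k1, a, b, c):
    """Brute-force search for the holdout size that minimises total cost."""
    return min(range(1, N), key=lambda n: total_cost(n, N, k1, a, b, c))


# Hypothetical parameter values, chosen only so that the optimum is interior.
N = 100_000
params = dict(k1=0.05, a=0.5, b=0.6, c=0.03)

n_star = optimal_holdout_size(N, **params)
print("holdout size minimising the assumed cost:", n_star)
print("cost at that size:", total_cost(n_star, N, **params))
```

Under this assumed cost, a very small holdout set is expensive because the updated score is poor for everyone else, and a very large one is expensive because too many individuals forgo guided intervention, so the minimum lies strictly between the two extremes.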
Related papers
- Mitigating optimistic bias in entropic risk estimation and optimization with an application to insurance [5.407319151576265]
The entropic risk measure is widely used to account for tail risks associated with an uncertain loss.
To mitigate the bias in the empirical entropic risk estimator, we propose a strongly consistent bootstrapping procedure.
We show that our methods suggest a higher (and more accurate) premium to homeowners.
arXiv Detail & Related papers (2024-09-30T04:02:52Z) - Teaching Models To Survive: Proper Scoring Rule and Stochastic Optimization with Competing Risks [6.9648613217501705]
When data are right-censored, survival analysis can compute the "time to event".
We introduce a strictly proper censoring-adjusted separable scoring rule that can be optimized on a subpart of the data.
Compared to 11 state-of-the-art models, this model, MultiIncidence, performs best in estimating the probability of outcomes in survival and competing risks.
arXiv Detail & Related papers (2024-06-20T08:00:42Z) - Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - Safe Deployment for Counterfactual Learning to Rank with Exposure-Based Risk Minimization [63.93275508300137]
We introduce a novel risk-aware Counterfactual Learning To Rank method with theoretical guarantees for safe deployment.
Our experimental results demonstrate the efficacy of our proposed method, which is effective at avoiding initial periods of bad performance when little data is available.
arXiv Detail & Related papers (2023-04-26T15:54:23Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - Minimax Off-Policy Evaluation for Multi-Armed Bandits [58.7013651350436]
We study the problem of off-policy evaluation in the multi-armed bandit model with bounded rewards.
We develop minimax rate-optimal procedures under three settings.
arXiv Detail & Related papers (2021-01-19T18:55:29Z) - WRSE -- a non-parametric weighted-resolution ensemble for predicting individual survival distributions in the ICU [0.251657752676152]
Dynamic assessment of mortality risk in the intensive care unit (ICU) can be used to stratify patients, inform about treatment effectiveness or serve as part of an early-warning system.
We show competitive results with state-of-the-art probabilistic models, while greatly reducing training time by factors of 2-9x.
arXiv Detail & Related papers (2020-11-02T10:13:59Z) - DeepHazard: neural network for time-varying risks [0.6091702876917281]
We propose a new flexible method for survival prediction: DeepHazard, a neural network for time-varying risks.
Our approach is tailored to a wide range of continuous hazard forms, the only restriction being that they are additive in time.
Numerical examples illustrate that our approach outperforms existing state-of-the-art methodology in terms of predictive capability evaluated through the C-index metric.
arXiv Detail & Related papers (2020-07-26T21:01:49Z) - Survival Cluster Analysis [93.50540270973927]
There is an unmet need in survival analysis for identifying subpopulations with distinct risk profiles.
An approach that addresses this need is likely to improve characterization of individual outcomes.
arXiv Detail & Related papers (2020-02-29T22:41:21Z) - Orthogonal Statistical Learning [49.55515683387805]
We provide non-asymptotic excess risk guarantees for statistical learning in a setting where the population risk depends on an unknown nuisance parameter.
We show that if the population risk satisfies a condition called Neyman orthogonality, the impact of the nuisance estimation error on the excess risk bound achieved by the meta-algorithm is of second order.
arXiv Detail & Related papers (2019-01-25T02:21:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.