Novel Techniques to Assess Predictive Systems and Reduce Their Alarm Burden
- URL: http://arxiv.org/abs/2102.05691v1
- Date: Wed, 10 Feb 2021 19:05:06 GMT
- Title: Novel Techniques to Assess Predictive Systems and Reduce Their Alarm Burden
- Authors: Jonathan A. Handler, Craig F. Feied, Michael T. Gillam
- Abstract summary: We introduce an improved performance assessment technique ("u-metrics") using utility functions to score each prediction.
Compared to traditional performance measures, u-metrics more accurately reflect the real-world benefits and costs of a predictor operating in a workflow context.
We also describe the use of "snoozing," a method whereby predictions are suppressed for a period of time, commonly improving predictor performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The performance of a binary classifier ("predictor") depends heavily upon the
context ("workflow") in which it operates. Classic measures of predictor
performance do not reflect the realized utility of predictors unless certain
implied workflow assumptions are met. Failure to meet these implied assumptions
results in suboptimal classifier implementations and a mismatch between
predicted or assessed performance and the actual performance obtained in
real-world deployments. The mismatch commonly arises when multiple predictions
can be made for the same event, the event is relatively rare, and redundant
true positive predictions for the same event add little value, e.g., a system
that makes a prediction each minute, repeatedly issuing interruptive alarms for
a predicted event that may never occur.
We explain why classic metrics do not correctly represent the performance of
predictors in such contexts, and introduce an improved performance assessment
technique ("u-metrics") using utility functions to score each prediction.
U-metrics explicitly account for variability in prediction utility arising from
temporal relationships. Compared to traditional performance measures, u-metrics
more accurately reflect the real-world benefits and costs of a predictor
operating in a workflow context. The difference can be significant.
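The following minimal Python sketch illustrates the idea of utility-scored predictions. The specific utility values and the rule that redundant true positives earn no credit are illustrative assumptions, not the paper's exact u-metric definitions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Prediction:
    time: float              # when the prediction was issued
    positive: bool           # predictor output at that time
    event_id: Optional[int]  # true event this prediction captures, or None

def u_score(predictions, tp_utility=1.0, fp_cost=-0.25,
            redundant_utility=0.0):
    """Sum a per-prediction utility rather than confusion-matrix counts.

    Illustrative scheme: the first true positive for an event earns full
    utility, later (redundant) true positives for the same event earn
    nothing, and every false positive carries a cost.
    """
    captured = set()
    total = 0.0
    for p in sorted(predictions, key=lambda q: q.time):
        if not p.positive:
            continue
        if p.event_id is None:           # false positive
            total += fp_cost
        elif p.event_id in captured:     # redundant true positive
            total += redundant_utility
        else:                            # first capture of this event
            captured.add(p.event_id)
            total += tp_utility
    return total
```

Under such a scheme, a predictor that alarms every minute about the same event accrues no extra credit for the repeats, which is exactly the workflow sensitivity that classic counts of true and false positives miss.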
We also describe the use of "snoozing," a method whereby predictions are
suppressed for a period of time, commonly improving predictor performance by
reducing false positives while retaining the capture of events. Snoozing is
especially useful when predictors generate interruptive alerts, as so often
happens in clinical practice. Utility-based performance metrics correctly
predict and track the performance benefits of snoozing, whereas traditional
performance metrics do not.
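A snoozing filter can be sketched in a few lines on top of the Prediction type above; the fixed window length and the reset-on-alert rule here are assumptions for illustration, not the paper's exact suppression policy.

```python
def snooze(predictions, window=30.0):
    """Suppress positive predictions issued within `window` time units
    of the last emitted alert (illustrative policy)."""
    emitted, last_alert = [], float("-inf")
    for p in sorted(predictions, key=lambda q: q.time):
        if p.positive:
            if p.time - last_alert < window:
                continue                 # snoozed: drop the repeat alert
            last_alert = p.time
        emitted.append(p)
    return emitted
```

Comparing u_score(snooze(preds)) against u_score(preds) makes the benefit of suppression visible, whereas traditional counts would simply record fewer true positives and thus understate or misstate the improvement.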
Related papers
- Imputation for prediction: beware of diminishing returns [12.424671213282256]
Missing values are prevalent across various fields, posing challenges for training and deploying predictive models.
Recent theoretical and empirical studies indicate that simple constant imputation can be consistent and competitive.
This study aims to clarify whether and when investing in advanced imputation methods yields significantly better predictions.
arXiv Detail & Related papers (2024-07-29T09:01:06Z)
- Contract Scheduling with Distributional and Multiple Advice [37.64065953072774]
Previous work has shown that a prediction of the interruption time can help improve the performance of contract-based systems.
We introduce and study more general and realistic learning-augmented settings in which the prediction is in the form of a probability distribution.
We show that the resulting system is robust to prediction errors in the distributional setting.
arXiv Detail & Related papers (2024-04-18T19:58:11Z)
- Performative Time-Series Forecasting [71.18553214204978]
We formalize performative time-series forecasting (PeTS) from a machine-learning perspective.
We propose a novel approach, Feature Performative-Shifting (FPS), which leverages the concept of delayed response to anticipate distribution shifts.
We conduct comprehensive experiments using multiple time-series models on COVID-19 and traffic forecasting tasks.
arXiv Detail & Related papers (2023-10-09T18:34:29Z)
- Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
- Predictive Inference with Feature Conformal Prediction [80.77443423828315]
We propose feature conformal prediction, which extends the scope of conformal prediction to semantic feature spaces.
From a theoretical perspective, we demonstrate that feature conformal prediction provably outperforms regular conformal prediction under mild assumptions.
Our approach could be combined with not only vanilla conformal prediction, but also other adaptive conformal prediction methods.
arXiv Detail & Related papers (2022-10-01T02:57:37Z)
- Efficient and Differentiable Conformal Prediction with General Function Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximately valid population coverage and near-optimal efficiency within the given function class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z)
- Optimized conformal classification using gradient descent approximation [0.2538209532048866]
Conformal predictors allow predictions to be made with a user-defined confidence level.
We consider an approach to train the conformal predictor directly with maximum predictive efficiency.
We test the method on several real world data sets and find that the method is promising.
arXiv Detail & Related papers (2021-05-24T13:14:41Z)
- All-Clear Flare Prediction Using Interval-based Time Series Classifiers [0.21028463367241026]
An all-clear flare prediction is a type of solar flare forecasting that puts more emphasis on predicting non-flaring instances.
Finding the right balance between avoiding false negatives (misses) and reducing the false positives (false alarms) is often challenging.
arXiv Detail & Related papers (2021-05-03T22:40:05Z)
- Towards More Fine-grained and Reliable NLP Performance Prediction [85.78131503006193]
We make two contributions to improving performance prediction for NLP tasks.
First, we examine performance predictors for holistic measures of accuracy like F1 or BLEU.
Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration.
arXiv Detail & Related papers (2021-02-10T15:23:20Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.