Bayesian logistic regression for online recalibration and revision of
risk prediction models with performance guarantees
- URL: http://arxiv.org/abs/2110.06866v1
- Date: Wed, 13 Oct 2021 17:03:21 GMT
- Title: Bayesian logistic regression for online recalibration and revision of
risk prediction models with performance guarantees
- Authors: Jean Feng, Alexej Gossmann, Berkman Sahiner, Romain Pirracchio
- Abstract summary: We introduce two procedures for continual recalibration or revision of an underlying prediction model.
We perform empirical evaluation via simulations and a real-world study predicting COPD risk.
We derive "Type I and II" regret bounds, which guarantee the procedures are non-inferior to a static model and competitive with an oracle logistic reviser.
- Score: 6.709991492637819
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: After deploying a clinical prediction model, subsequently collected data can
be used to fine-tune its predictions and adapt to temporal shifts. Because
model updating carries risks of over-updating/fitting, we study online methods
with performance guarantees. We introduce two procedures for continual
recalibration or revision of an underlying prediction model: Bayesian logistic
regression (BLR) and a Markov variant that explicitly models distribution
shifts (MarBLR). We perform empirical evaluation via simulations and a
real-world study predicting COPD risk. We derive "Type I and II" regret bounds,
which guarantee the procedures are non-inferior to a static model and
competitive with an oracle logistic reviser in terms of the average loss. Both
procedures consistently outperformed the static model and other online logistic
revision methods. In simulations, the average estimated calibration index
(aECI) of the original model was 0.828 (95%CI 0.818-0.938). Online
recalibration using BLR and MarBLR improved the aECI, attaining 0.265 (95%CI
0.230-0.300) and 0.241 (95%CI 0.216-0.266), respectively. When performing more
extensive logistic model revisions, BLR and MarBLR increased the average AUC
(aAUC) from 0.767 (95%CI 0.765-0.769) to 0.800 (95%CI 0.798-0.802) and 0.799
(95%CI 0.797-0.801), respectively, in stationary settings and protected against
substantial model decay. In the COPD study, BLR and MarBLR dynamically combined
the original model with a continually-refitted gradient boosted tree to achieve
aAUCs of 0.924 (95%CI 0.913-0.935) and 0.925 (95%CI 0.914-0.935), compared to
the static model's aAUC of 0.904 (95%CI 0.892-0.916). Despite its simplicity,
BLR is highly competitive with MarBLR. MarBLR outperforms BLR when its prior
better reflects the data. BLR and MarBLR can improve the transportability of
clinical prediction models and maintain their performance over time.
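The online Bayesian logistic recalibration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the Gaussian prior centered at the identity recalibration, and the single Newton-step (Laplace-style) posterior update per batch are illustrative assumptions, but they capture the basic idea of recalibrating a deployed model's predictions as labeled data arrive.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sketch of online Bayesian logistic recalibration
# (not the paper's code): we keep a Gaussian posterior over an
# intercept and slope applied to the base model's logit, and update
# it with one Newton step (a Laplace-style approximation) per batch.
class OnlineLogisticRecalibrator:
    def __init__(self, prior_var=1.0):
        # Prior centered at the identity recalibration:
        # intercept 0, slope 1 (i.e., trust the base model initially).
        self.mu = np.array([0.0, 1.0])
        self.precision = np.eye(2) / prior_var

    def _design(self, p_base):
        # Features: [1, logit(p_base)] for each prediction.
        logit = np.log(p_base / (1.0 - p_base))
        return np.column_stack([np.ones_like(logit), logit])

    def predict(self, p_base):
        # Recalibrated probability from the current posterior mean.
        return sigmoid(self._design(p_base) @ self.mu)

    def update(self, p_base, y):
        # One Newton step on the log posterior. Since self.mu equals
        # the prior mean here, the prior's gradient term vanishes.
        X = self._design(p_base)
        p = sigmoid(X @ self.mu)
        grad = X.T @ (p - y)
        W = p * (1.0 - p)
        H = (X * W[:, None]).T @ X + self.precision
        self.mu = self.mu - np.linalg.solve(H, grad)
        # The updated posterior becomes the prior for the next batch.
        self.precision = H
```

As a usage example, feeding the recalibrator batches from an underconfident base model (whose logits are half the true logits) should drive the learned slope above 1, toward the ideal value of 2. The full procedures in the paper additionally carry regret guarantees, and MarBLR adds a Markov model of distribution shift, neither of which this sketch attempts.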
Related papers
- Distributionally Robust Learning in Survival Analysis [6.946903076677842]
We introduce an innovative approach that incorporates Distributionally Robust Learning (DRL) into Cox regression. By formulating a DRL framework with a Wasserstein distance-based ambiguity set, we develop a variant Cox model that is less sensitive to assumptions about the underlying data distribution. We demonstrate that our regression model achieves superior prediction accuracy and robustness compared with traditional methods.
arXiv Detail & Related papers (2025-06-02T06:11:22Z) - A SHAP-based explainable multi-level stacking ensemble learning method for predicting the length of stay in acute stroke [3.2906073576204955]
Existing machine learning models have shown suboptimal predictive performance, limited generalisability, and have overlooked system-level factors. We developed an interpretable multi-level stacking ensemble model for ischaemic and haemorrhagic stroke. An explainable ensemble model effectively predicted the prolonged LOS in ischaemic stroke. Further validation is needed for haemorrhagic stroke.
arXiv Detail & Related papers (2025-05-30T01:08:26Z) - Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving the model performance. This paper addresses the question of how to optimally combine the model's predictions and the provided labels. Our main contribution is the derivation of the Bayes optimal aggregator function to combine the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z) - Enhancing Retail Sales Forecasting with Optimized Machine Learning Models [0.0]
In retail sales forecasting, accurately predicting future sales is crucial for inventory management and strategic planning.
Recent advancements in machine learning (ML) provide more robust alternatives.
This research benefits from the power of ML, particularly Random Forest (RF), Gradient Boosting (GB), Support Vector Regression (SVR), and XGBoost.
arXiv Detail & Related papers (2024-10-17T17:11:33Z) - A Mixture of Experts (MoE) model to improve AI-based computational pathology prediction performance under variable levels of histopathology image blur [0.0]
We introduce a mixture of experts (MoE) strategy that combines predictions from multiple expert models trained on data with varying blur levels. Our results show that baseline models' performance consistently decreased with increasing blur. MoE-CNN_CLAM outperformed the baseline CNN_CLAM under moderate and mixed blur conditions.
arXiv Detail & Related papers (2024-05-15T12:40:41Z) - Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line [65.14099135546594]
Recent test-time adaptation (TTA) methods drastically strengthen the accuracy-on-the-line (ACL) and agreement-on-the-line (AGL) trends in models, even under shifts where models previously showed very weak correlations.
Our results show that by combining TTA with AGL-based estimation methods, we can estimate the OOD performance of models with high precision for a broader set of distribution shifts.
arXiv Detail & Related papers (2023-10-07T23:21:25Z) - Guided Diffusion Model for Adversarial Purification from Random Noise [0.0]
We propose a novel guided diffusion purification approach to provide a strong defense against adversarial attacks.
Our model achieves 89.62% robust accuracy under a PGD-L∞ attack (ε = 8/255) on the CIFAR-10 dataset.
arXiv Detail & Related papers (2022-06-22T06:55:03Z) - Posterior Coreset Construction with Kernelized Stein Discrepancy for
Model-Based Reinforcement Learning [78.30395044401321]
We develop a novel model-based approach to reinforcement learning (MBRL)
It relaxes the assumptions on the target transition model, requiring only that it belong to a generic family of mixture models.
It can achieve up to a 50% reduction in wall-clock time in some continuous control environments.
arXiv Detail & Related papers (2022-06-02T17:27:49Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - An Interpretable Web-based Glioblastoma Multiforme Prognosis Prediction
Tool using Random Forest Model [1.1024591739346292]
We propose predictive models that estimate GBM patients' health status of one-year after treatments.
We used the clinical profiles of a total of 467 GBM patients, consisting of 13 features and two follow-up dates.
Our machine learning models suggest that the top three prognostic factors for GBM patient survival were MGMT gene promoter, the extent of resection, and age.
arXiv Detail & Related papers (2021-08-30T07:56:34Z) - Residual Energy-Based Models for End-to-End Speech Recognition [26.852537542649866]
Residual energy-based model (R-EBM) is proposed to complement the auto-regressive ASR model.
Experiments on a 100-hour LibriSpeech dataset show that R-EBMs can reduce the word error rates (WERs) by 8.2%/6.7%.
On a state-of-the-art model using self-supervised learning (wav2vec 2.0), R-EBMs still significantly improve both WER and confidence estimation performance.
arXiv Detail & Related papers (2021-03-25T22:08:00Z) - Increasing the efficiency of randomized trial estimates via linear
adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z) - Learnable Boundary Guided Adversarial Training [66.57846365425598]
We use the logits from one clean model to guide the learning of another, robust model.
We achieve new state-of-the-art robustness on CIFAR-100 without additional real or synthetic data.
arXiv Detail & Related papers (2020-11-23T01:36:05Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z) - Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives
for Brazil [3.0711362702464675]
The new Coronavirus (COVID-19) is an emerging disease that has infected millions of people since it was first reported.
In this paper, autoregressive integrated moving average (ARIMA), cubist (CUBIST), random forest (RF), ridge regression (RIDGE), and stacking-ensemble learning are evaluated.
The developed models can generate accurate forecasts, achieving errors in the ranges of 0.87%-3.51%, 1.02%-5.63%, and 0.95%-6.90% for one-, three-, and six-day-ahead predictions, respectively.
arXiv Detail & Related papers (2020-07-21T17:58:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.