Inadequacy of common stochastic neural networks for reliable clinical decision support
- URL: http://arxiv.org/abs/2401.13657v2
- Date: Thu, 25 Jan 2024 12:31:21 GMT
- Title: Inadequacy of common stochastic neural networks for reliable clinical decision support
- Authors: Adrian Lindenmeyer, Malte Blattmann, Stefan Franke, Thomas Neumuth, Daniel Schneider
- Abstract summary: Widespread adoption of AI for medical decision making is still hindered by ethical and safety-related concerns.
Common deep learning approaches, however, tend towards overconfidence under data shift.
This study investigates their actual reliability in clinical applications.
- Score: 0.4262974002462632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Widespread adoption of AI for medical decision making is still hindered
by ethical and safety-related concerns. For AI-based decision support systems
in healthcare settings, it is paramount that they be reliable and trustworthy.
Common deep learning approaches, however, tend towards overconfidence under
data shift. Such inappropriate extrapolation beyond evidence-based scenarios
may have dire consequences. This highlights the importance of reliable
estimation of local uncertainty and its communication to the end user. While
stochastic neural networks have been heralded as a potential solution to these
issues, this study investigates their actual reliability in clinical
applications. We centered our analysis on the exemplary use case of mortality
prediction for ICU hospitalizations, using EHR time series from the MIMIC-III
dataset. Predictions on the EHR time series were made with encoder-only
Transformer models, and stochasticity of the model functions was achieved by
incorporating common methods such as Bayesian neural network layers and model
ensembles. Our models achieve state-of-the-art discrimination (AUC-ROC:
0.868 ± 0.011, AUC-PR: 0.554 ± 0.034) and calibration on the mortality
prediction benchmark. However, epistemic uncertainty is critically
underestimated by the selected stochastic deep learning methods; we provide a
heuristic proof that a collapse of the posterior distribution is responsible.
Our findings reveal the inadequacy of commonly used stochastic deep learning
approaches for reliably recognizing out-of-distribution (OoD) samples. Neither
method prevents unsubstantiated model confidence, owing to strongly biased
functional posteriors, rendering them inappropriate for reliable clinical
decision support. This highlights the need for approaches with more strictly
enforced or inherent distance awareness with respect to known data points,
e.g., using kernel-based techniques.
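The contrast the abstract draws, between sampling-based epistemic uncertainty (ensembles, Bayesian layers) and distance-aware alternatives, can be made concrete with a small sketch. This is our own illustration, not the paper's code: the functions below assume an ensemble of binary mortality classifiers and some feature extractor, and all names and shapes are hypothetical.

```python
import numpy as np

def ensemble_uncertainty(member_probs):
    """Decompose predictive uncertainty for an ensemble of binary classifiers.

    member_probs: array of shape (n_members, n_samples) with P(y=1 | x).
    Returns (total, aleatoric, epistemic) entropy terms in nats.
    """
    eps = 1e-12
    p = np.stack([member_probs, 1.0 - member_probs], axis=-1)     # (M, N, 2)
    mean_p = p.mean(axis=0)                                       # (N, 2)
    total = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)         # entropy of the mean
    aleatoric = -(p * np.log(p + eps)).sum(axis=-1).mean(axis=0)  # mean member entropy
    epistemic = total - aleatoric                                 # mutual information
    return total, aleatoric, epistemic

def mahalanobis_ood_score(train_feats, test_feats):
    """Distance-aware OoD score: Mahalanobis distance to the training features."""
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False) + 1e-6 * np.eye(train_feats.shape[1])
    prec = np.linalg.inv(cov)
    d = test_feats - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", d, prec, d))
```

By construction the distance score grows without bound as inputs move away from the training data, whereas the mutual-information term can remain deceptively small whenever the ensemble members happen to agree, which is precisely the posterior collapse the paper describes.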
Related papers
- Trust-informed Decision-Making Through An Uncertainty-Aware Stacked Neural Networks Framework: Case Study in COVID-19 Classification [10.265080819932614]
This study presents an uncertainty-aware stacked neural networks model for the reliable classification of COVID-19 from radiological images.
The model addresses the critical gap in uncertainty-aware modeling by focusing on accurately identifying confidently correct predictions.
The architecture integrates uncertainty quantification methods, including Monte Carlo dropout and ensemble techniques, to enhance predictive reliability (a minimal sketch of the dropout component follows this entry).
arXiv Detail & Related papers (2024-09-19T04:20:12Z)
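Of the two components named above, Monte Carlo dropout is the easiest to prototype. A minimal PyTorch sketch, ours rather than the paper's, with a placeholder model and sample count:

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 50):
    """Monte Carlo dropout: keep dropout active at inference and average
    the sampled predictive distributions."""
    model.eval()
    # Re-enable only the dropout layers, so batch norm etc. stay in eval mode.
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)  # predictive mean and spread
```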
- SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing [67.8991481023825]
Sepsis is the leading cause of in-hospital mortality in the USA.
Existing predictive models are usually trained on high-quality data with little missing information.
For potential high-risk patients whose predictions have low confidence due to limited observations, we propose a robust active sensing algorithm.
arXiv Detail & Related papers (2024-07-24T04:47:36Z)
- Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z)
- New Epochs in AI Supervision: Design and Implementation of an Autonomous Radiology AI Monitoring System [5.50085484902146]
We introduce novel methods for monitoring the performance of radiology AI classification models in practice.
We propose two metrics, predictive divergence and temporal stability, to be used for preemptive alerts of AI performance changes (one plausible reading is sketched after this entry).
arXiv Detail & Related papers (2023-11-24T06:29:04Z)
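The paper defines predictive divergence and temporal stability precisely; the sketch below is only our guess at plausible formulas behind those names, assuming access to predictive distributions from a deployed model, a reference model, and rolling time windows (all hypothetical here):

```python
import numpy as np

def predictive_divergence(p_model, p_reference, eps=1e-12):
    """Mean KL divergence between two models' predictive distributions.
    p_model, p_reference: arrays of shape (n_cases, n_classes)."""
    kl = (p_model * (np.log(p_model + eps) - np.log(p_reference + eps))).sum(-1)
    return kl.mean()

def temporal_stability(window_probs, eps=1e-12):
    """Drift of the mean predictive distribution between consecutive time
    windows, measured with symmetrised KL. window_probs: list of (n_i, C)."""
    means = [w.mean(axis=0) for w in window_probs]
    def sym_kl(p, q):
        return ((p * np.log((p + eps) / (q + eps))).sum()
                + (q * np.log((q + eps) / (p + eps))).sum())
    return [sym_kl(means[t], means[t + 1]) for t in range(len(means) - 1)]
```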
- Can input reconstruction be used to directly estimate uncertainty of a regression U-Net model? -- Application to proton therapy dose prediction for head and neck cancer patients [0.8343441027226364]
We present an alternative direct uncertainty estimation method and apply it to a regression U-Net architecture.
As a proof of concept, our method is applied to proton therapy dose prediction in head and neck cancer patients (a generic reconstruction-based sketch follows this entry).
arXiv Detail & Related papers (2023-10-30T16:04:34Z)
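One common way to realize "input reconstruction as an uncertainty signal", and only our generic interpretation rather than the authors' exact method, is to attach a decoder head and treat reconstruction error as a familiarity proxy; the class and dimensions below are illustrative.

```python
import torch
import torch.nn as nn

class ReconstructingRegressor(nn.Module):
    """Toy regression network with an auxiliary input-reconstruction head."""
    def __init__(self, d_in: int, d_hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.head = nn.Linear(d_hidden, 1)        # regression output
        self.decoder = nn.Linear(d_hidden, d_in)  # reconstructs the input

    def forward(self, x):
        z = self.encoder(x)
        return self.head(z), self.decoder(z)

def reconstruction_uncertainty(model, x):
    """Per-sample reconstruction error as a familiarity/uncertainty proxy:
    inputs the network cannot reconstruct are likely far from training data."""
    model.eval()
    with torch.no_grad():
        y_hat, x_hat = model(x)
    return y_hat, ((x_hat - x) ** 2).mean(dim=-1)
```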
- Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks.
By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation.
DEviS incorporates an uncertainty-aware filtering module, which uses an uncertainty-calibrated error metric to filter reliable data (a sketch of the underlying Dirichlet-based uncertainty follows this entry).
arXiv Detail & Related papers (2023-01-01T05:02:46Z)
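Subjective-logic/evidential models of this kind typically place a Dirichlet distribution over class probabilities. A standard formulation, not necessarily DEviS's exact variant:

```python
import numpy as np

def evidential_uncertainty(evidence):
    """Dirichlet-based uncertainty from non-negative per-class evidence.
    evidence: array of shape (n_samples, n_classes), e.g. a ReLU'd logit head."""
    alpha = evidence + 1.0                  # Dirichlet concentration parameters
    strength = alpha.sum(axis=-1, keepdims=True)
    probs = alpha / strength                # expected class probabilities
    k = evidence.shape[-1]
    vacuity = k / strength.squeeze(-1)      # subjective-logic uncertainty mass
    return probs, vacuity
```

The vacuity term goes to 1 when no evidence is collected for any class, giving the model an explicit "I don't know" output that plain softmax lacks.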
- BayesNetCNN: incorporating uncertainty in neural networks for image-based classification tasks [0.29005223064604074]
We propose a method to convert a standard neural network into a Bayesian neural network.
We estimate the variability of predictions by sampling different networks similar to the original one at each forward pass.
We test our model in a large cohort of brain images from Alzheimer's Disease patients.
arXiv Detail & Related papers (2022-09-27T01:07:19Z)
- Improving Trustworthiness of AI Disease Severity Rating in Medical Imaging with Ordinal Conformal Prediction Sets [0.7734726150561088]
A lack of statistically rigorous uncertainty quantification is a significant factor undermining trust in AI results.
Recent developments in distribution-free uncertainty quantification present practical solutions for these issues.
We demonstrate a technique for forming ordinal prediction sets that are guaranteed to contain the correct stenosis severity (a simplified construction is sketched after this entry).
arXiv Detail & Related papers (2022-07-05T18:01:20Z)
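A simplified split-conformal construction for ordinal labels, kept deliberately generic since the paper's exact nonconformity score may differ: calibrate a threshold on held-out scores, then take the contiguous hull of the resulting label set so severity grades stay ordered.

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal calibration with the score 1 - p(true label).
    cal_probs: (n, C) softmax outputs; cal_labels: (n,) integer grades."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(q_level, 1.0), method="higher")

def ordinal_prediction_set(probs, q_hat):
    """Threshold set {y : p_y >= 1 - q_hat}, then its contiguous ordinal hull."""
    keep = np.where(probs >= 1.0 - q_hat)[0]
    if len(keep) == 0:
        keep = np.array([int(np.argmax(probs))])
    return np.arange(keep.min(), keep.max() + 1)
```

Taking the contiguous hull only enlarges the set, so the marginal coverage guarantee of the underlying conformal procedure is preserved.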
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident, however, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution that seeks out regions of feature space where the model is unjustifiably overconfident and conditionally raises the entropy of those predictions towards that of the prior distribution of the labels (a toy loss implementing this idea is sketched after this entry).
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
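The entropy-raising idea can be written as an auxiliary loss term; the mask identifying "unjustifiably overconfident" inputs is the paper's actual contribution and is left abstract here.

```python
import torch
import torch.nn.functional as F

def entropy_raising_loss(logits, prior, is_overconfident_region):
    """KL(prior || model) on inputs flagged as unjustifiably overconfident,
    pulling their predictive distribution towards the label prior.

    logits: (N, C) model outputs; prior: (C,) strictly positive label marginals;
    is_overconfident_region: (N,) boolean mask from some OoD/novelty detector."""
    log_p = F.log_softmax(logits, dim=-1)
    kl = (prior * (prior.log() - log_p)).sum(dim=-1)  # KL(prior || p_model), (N,)
    mask = is_overconfident_region.float()
    return (kl * mask).sum() / mask.sum().clamp_min(1.0)
```

Minimizing this term pushes the flagged predictions towards the label prior, i.e. it raises their entropy exactly where confidence is unsupported by data.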
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic steatohepatitis (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)