A roadmap to fair and trustworthy prediction model validation in
healthcare
- URL: http://arxiv.org/abs/2304.03779v1
- Date: Fri, 7 Apr 2023 04:24:19 GMT
- Title: A roadmap to fair and trustworthy prediction model validation in
healthcare
- Authors: Yilin Ning, Victor Volovici, Marcus Eng Hock Ong, Benjamin Alan
Goldstein, Nan Liu
- Abstract summary: A prediction model is most useful if it generalizes beyond the development data.
We propose a roadmap that facilitates the development and application of reliable, fair, and trustworthy artificial intelligence prediction models.
- Score: 2.476158303361112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A prediction model is most useful if it generalizes beyond the development
data, which is assessed through external validation, but how far it should be expected
to generalize remains unclear. In practice, prediction models are externally validated
using data from very different settings, including populations from other health systems
or countries, with predictably poor results. This may not be a fair reflection of the
performance of a model that was designed for a specific target population or setting,
and may stretch the model beyond its expected generalizability. To address this, we
suggest externally validating a model using new data from the target population, so that
validation performance has clear implications for model reliability, whereas
generalizability to broader settings should be investigated carefully during model
development rather than explored post hoc. Based on this perspective, we propose a
roadmap that facilitates the development and application of reliable, fair, and
trustworthy artificial intelligence prediction models.
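As a hedged illustration of the validation step this perspective argues for, the sketch below assesses an already fitted binary-outcome risk model on new data sampled from the intended target population, reporting discrimination and simple calibration summaries. The `model`, `X_val`, and `y_val` objects are hypothetical placeholders; the paper itself does not prescribe this code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

# Hypothetical inputs: `model` is a previously developed risk model and
# (X_val, y_val) is a NEW sample drawn from the target population the model
# was designed for, not from an unrelated health system or country.
def external_validation_report(model, X_val, y_val):
    """Summarise discrimination and calibration on target-population data."""
    y_val = np.asarray(y_val)
    p = model.predict_proba(X_val)[:, 1]        # predicted risks
    auc = roc_auc_score(y_val, p)               # discrimination
    brier = brier_score_loss(y_val, p)          # overall accuracy of risks
    # Calibration-in-the-large: observed event rate minus mean predicted risk.
    calib_in_large = y_val.mean() - p.mean()
    return {"AUC": auc, "Brier": brier, "obs_minus_pred_rate": calib_in_large}
```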
Related papers
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
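A minimal sketch of the quantity at stake, under the assumption that churn is measured as label disagreement between classifiers and that near-optimal models are selected by an accuracy tolerance; this is an illustration, not the paper's theoretical treatment of the Rashomon set.

```python
import numpy as np

def churn(pred_a, pred_b):
    """Fraction of examples whose predicted label flips between two models."""
    pred_a, pred_b = np.asarray(pred_a), np.asarray(pred_b)
    return float(np.mean(pred_a != pred_b))

def rashomon_set(models, X_val, y_val, eps=0.01):
    """Keep models whose validation accuracy is within `eps` of the best."""
    accs = [np.mean(m.predict(X_val) == y_val) for m in models]
    best = max(accs)
    return [m for m, a in zip(models, accs) if a >= best - eps]

def max_churn(models, X_test):
    """Worst-case churn a user could see when swapping between good models."""
    preds = [m.predict(X_test) for m in models]
    return max((churn(p, q) for i, p in enumerate(preds) for q in preds[i + 1:]),
               default=0.0)
```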
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- A performance characteristic curve for model evaluation: the application in information diffusion prediction [3.8711489380602804]
We propose a metric based on information entropy to quantify the randomness in diffusion data, then identify a scaling pattern between the randomness and the prediction accuracy of the model.
Data points from patterns with different sequence lengths, system sizes, and degrees of randomness all collapse onto a single curve, capturing a model's inherent capability to make correct predictions.
The validity of the curve is tested by three prediction models in the same family, reaching conclusions in line with existing studies.
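A rough, assumption-laden sketch of how such (randomness, accuracy) points might be collected: `next_node_entropy` is a hypothetical stand-in for the paper's entropy-based randomness measure, and the accuracy metric is a simple placeholder.

```python
import numpy as np
from collections import Counter

def next_node_entropy(sequences):
    """Crude randomness proxy: Shannon entropy of the aggregated next-node
    distribution over all observed diffusion steps (an assumed stand-in for
    the paper's entropy-based measure)."""
    counts = Counter(node for seq in sequences for node in seq[1:])
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    return float(-(p * np.log2(p)).sum())

def characteristic_points(settings, model):
    """One (randomness, accuracy) point per setting; collecting these across
    sequence lengths and system sizes would trace the characteristic curve."""
    points = []
    for sequences, X_test, y_test in settings:
        randomness = next_node_entropy(sequences)
        accuracy = float(np.mean(model.predict(X_test) == y_test))
        points.append((randomness, accuracy))
    return points
```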
arXiv Detail & Related papers (2023-09-18T07:32:57Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties, each holding their own machine learning model, want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
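A minimal sketch of the interface such a mechanism implies, assuming each agent exposes only class-probability predictions and the aggregation is a simple weighted average; the paper's actual combination rule is more principled than this placeholder.

```python
import numpy as np

class Agent:
    """Each agent keeps its data and parameters private and exposes only
    predicted class probabilities for a given test point."""
    def __init__(self, model):
        self._model = model

    def predict_proba(self, x):
        return self._model.predict_proba(x.reshape(1, -1))[0]

def collective_predict(agents, x, weights=None):
    """Combine the agents' local predictions into one collective prediction.
    Uniform weighting is an assumption made for illustration only."""
    probs = np.stack([a.predict_proba(x) for a in agents])   # (n_agents, n_classes)
    weights = np.ones(len(agents)) / len(agents) if weights is None else weights
    combined = weights @ probs
    return int(np.argmax(combined)), combined
```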
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Loss Estimators Improve Model Generalization [36.520569284970456]
We propose to train a loss estimator alongside the predictive model, using a contrastive training objective, to directly estimate the prediction uncertainties.
We show the impact of loss estimators on model generalization, in terms of both fidelity on in-distribution data and the ability to detect out-of-distribution samples or new classes unseen during training.
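A hedged sketch of the general idea: fit an auxiliary estimator that predicts the main model's per-sample loss, then use large predicted losses as an uncertainty signal. Plain regression is used here in place of the paper's contrastive training objective, and all object names are illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_loss_estimator(model, X_train, y_train):
    """Fit an auxiliary regressor mapping inputs to the main model's per-sample
    log loss. Squared-error regression is an assumed stand-in for the paper's
    contrastive objective."""
    y_train = np.asarray(y_train)
    p = model.predict_proba(X_train)[:, 1]
    eps = 1e-7
    per_sample_loss = -(y_train * np.log(p + eps)
                        + (1 - y_train) * np.log(1 - p + eps))
    return GradientBoostingRegressor().fit(X_train, per_sample_loss)

def predicted_loss(estimator, X_test):
    """Large predicted losses can flag unreliable or out-of-distribution inputs."""
    return estimator.predict(X_test)
```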
arXiv Detail & Related papers (2021-03-05T16:35:10Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
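A brute-force sketch of the underlying quantity, assuming the set of good models is approximated by enumerating candidates within an accuracy tolerance and disparity is a demographic-parity gap; the paper provides tractable algorithms rather than this enumeration, and handles selective labels with more care than shown here.

```python
import numpy as np

def disparity(y_pred, group):
    """Group-level predictive disparity: difference in positive-prediction
    rates between two groups (a demographic-parity gap)."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return float(y_pred[group == 1].mean() - y_pred[group == 0].mean())

def disparity_range(models, X, y, group, eps=0.01):
    """Range of attainable disparities over models whose accuracy is within
    `eps` of the best candidate."""
    accs = [np.mean(m.predict(X) == y) for m in models]
    best = max(accs)
    good = [m for m, a in zip(models, accs) if a >= best - eps]
    gaps = [disparity(m.predict(X), group) for m in good]
    return min(gaps), max(gaps)
```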
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
- Learning Prediction Intervals for Model Performance [1.433758865948252]
We propose a method to compute prediction intervals for model performance.
We evaluate our approach across a wide range of drift conditions and show substantial improvement over competitive baselines.
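For orientation, a generic bootstrap baseline for an interval on model accuracy computed from a new (possibly drifted) labelled sample; the paper's method learns the intervals rather than bootstrapping, so this sketch is only an assumed point of reference.

```python
import numpy as np

def accuracy_interval(model, X_new, y_new, n_boot=1000, alpha=0.1, seed=0):
    """Bootstrap interval for accuracy on a new (possibly drifted) sample."""
    rng = np.random.default_rng(seed)
    correct = (model.predict(X_new) == np.asarray(y_new)).astype(float)
    n = len(correct)
    boot = [correct[rng.integers(0, n, n)].mean() for _ in range(n_boot)]
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```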
arXiv Detail & Related papers (2020-12-15T21:32:03Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
- A Causal Lens for Peeking into Black Box Predictive Models: Predictive Model Interpretation via Causal Attribution [3.3758186776249928]
We address the problem of interpreting predictive models in settings where the model is a black box.
We reduce this problem to that of estimating the causal effects of each of the model inputs on the model output.
We show how the resulting causal attribution of responsibility for model output to the different model inputs can be used to interpret the predictive model and to explain its predictions.
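A simplified do()-style sketch of the idea, assuming the black box is available as a `predict` function and causal effects are approximated by intervening on one input at a time while holding the observed values of the others fixed; the paper's causal attribution procedure is more careful than this placeholder.

```python
import numpy as np

def interventional_attribution(predict, X, feature, values):
    """Average effect on the black-box output of setting one input to each
    candidate value (a do()-style intervention), relative to the observed data."""
    baseline = predict(X).mean()
    effects = []
    for v in values:
        X_do = X.copy()
        X_do[:, feature] = v          # intervene: do(feature = v)
        effects.append(predict(X_do).mean() - baseline)
    return np.array(effects)

def attribution_scores(predict, X, candidate_values):
    """Attribution per input = spread of the average output across interventions
    on that input; `candidate_values` maps feature index -> values to try."""
    return {j: float(np.ptp(interventional_attribution(predict, X, j, vals)))
            for j, vals in candidate_values.items()}
```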
arXiv Detail & Related papers (2020-08-01T23:20:57Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.