Distillation to Enhance the Portability of Risk Models Across
Institutions with Large Patient Claims Database
- URL: http://arxiv.org/abs/2207.02445v1
- Date: Wed, 6 Jul 2022 05:26:32 GMT
- Title: Distillation to Enhance the Portability of Risk Models Across
Institutions with Large Patient Claims Database
- Authors: Steve Nyemba, Chao Yan, Ziqi Zhang, Amol Rajmane, Pablo Meyer,
Prithwish Chakraborty, Bradley Malin
- Abstract summary: We investigate the practicality of model portability through a cross-site evaluation of readmission prediction models.
We apply a recurrent neural network, augmented with self-attention and blended with expert features, to build readmission prediction models.
Our experiments show that ML models trained at one institution and applied directly at another institution perform worse than models trained and tested at the same institution.
- Score: 12.452703677540505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence, and particularly machine learning (ML), is
increasingly developed and deployed to support healthcare in a variety of
settings. However, clinical decision support (CDS) technologies based on ML
need to be portable if they are to be adopted on a broad scale. In this
respect, models developed at one institution should be reusable at another. Yet
there are numerous examples of portability failure, particularly due to naive
application of ML models. Portability failure can lead to suboptimal care and
medical errors, which ultimately could prevent the adoption of ML-based CDS in
practice. One specific healthcare challenge that could benefit from enhanced
portability is the prediction of 30-day readmission risk. Research to date has
shown that deep learning models can be effective at modeling such risk. In this
work, we investigate the practicality of model portability through a cross-site
evaluation of readmission prediction models. To do so, we apply a recurrent
neural network, augmented with self-attention and blended with expert features,
to build readmission prediction models for two independent large-scale claims
datasets. We further present a novel transfer learning technique that adapts
the well-known method of born-again network (BAN) training. Our experiments
show that ML models trained at one institution and applied directly at another
institution perform worse than models trained and tested at the same
institution. We further show that the transfer learning approach based on the
BAN produces models that are better than those trained on just a single
institution's data. Notably, this improvement is consistent across both sites
and occurs after a single retraining, which illustrates the potential for a
cheap and general model transfer mechanism of readmission risk prediction.
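The abstract does not spell out the training objective, so the following is only a minimal sketch of generic born-again style distillation, not the authors' implementation. It assumes a binary readmission label, a teacher model from the source institution that outputs a probability per patient, and a student retrained once on the target institution's data against a blend of the hard label and the teacher's soft prediction. A plain logistic regression stands in for the recurrent network with self-attention, and the names `distillation_loss`, `train_student`, and the weight `alpha` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(target, prob, eps=1e-12):
    """Binary cross-entropy between a (possibly soft) target and a probability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def distillation_loss(label, teacher_prob, student_prob, alpha=0.5):
    """Blend the hard-label loss with the loss against the teacher's soft output.

    alpha is an assumed mixing weight; the source does not specify one.
    """
    hard = bce(label, student_prob)
    soft = bce(teacher_prob, student_prob)
    return alpha * hard + (1.0 - alpha) * soft


def train_student(features, labels, teacher_probs, alpha=0.5, lr=0.1, epochs=200):
    """Fit a logistic-regression student by per-example gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y, t in zip(features, labels, teacher_probs):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            # The gradient of the blended loss w.r.t. the logit is
            # p minus the blended target alpha*y + (1 - alpha)*t.
            grad = p - (alpha * y + (1.0 - alpha) * t)
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
```

In this sketch a single call to `train_student` corresponds to one retraining round on the target site, with `teacher_probs` obtained by scoring the target site's patients with the source-site model.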
Related papers
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Evaluation of Predictive Reliability to Foster Trust in Artificial
Intelligence. A case study in Multiple Sclerosis [0.34473740271026115]
Spotting Machine Learning failures is of paramount importance when ML predictions are used to drive clinical decisions.
We propose a simple approach that can be used in the deployment phase of any ML model to suggest whether to trust predictions or not.
Our method holds the promise to provide effective support to clinicians by spotting potential ML failures during deployment.
arXiv Detail & Related papers (2024-02-27T14:48:07Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Safe AI for health and beyond -- Monitoring to transform a health
service [51.8524501805308]
We will assess the infrastructure required to monitor the outputs of a machine learning algorithm.
We will present two scenarios with examples of monitoring and updates of models.
arXiv Detail & Related papers (2023-03-02T17:27:45Z)
- Transfer Learning with Uncertainty Quantification: Random Effect
Calibration of Source to Target (RECaST) [1.8047694351309207]
We develop a statistical framework for model predictions based on transfer learning, called RECaST.
We mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models.
We examine our method's performance in a simulation study and in an application to real hospital data.
arXiv Detail & Related papers (2022-11-29T19:39:47Z)
- Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of
Foundation Models [103.71308117592963]
We present an algorithm for training self-destructing models leveraging techniques from meta-learning and adversarial learning.
In a small-scale experiment, we show MLAC can largely prevent a BERT-style model from being re-purposed to perform gender identification.
arXiv Detail & Related papers (2022-11-27T21:43:45Z)
- Continual Learning with Bayesian Model based on a Fixed Pre-trained
Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
- The unreasonable effectiveness of Batch-Norm statistics in addressing
catastrophic forgetting across medical institutions [8.244654685687054]
We investigate the trade-off between model refinement and retention of previously learned knowledge.
We propose a simple yet effective approach, adapting Elastic weight consolidation (EWC) using the global batch normalization statistics of the original dataset.
arXiv Detail & Related papers (2020-11-16T16:57:05Z)
- Democratizing Artificial Intelligence in Healthcare: A Study of Model
Development Across Two Institutions Incorporating Transfer Learning [8.043077408518826]
Transfer learning (TL) allows a fully trained model from one institution to be fine-tuned by another institution using a much smaller local dataset.
This report describes the challenges, methodology, and benefits of TL within the context of developing an AI model for a basic use-case.
arXiv Detail & Related papers (2020-09-25T21:12:50Z)
- Transfer Learning without Knowing: Reprogramming Black-box Machine
Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.