Pre-training transformer-based framework on large-scale pediatric claims data for downstream population-specific tasks
- URL: http://arxiv.org/abs/2106.13095v1
- Date: Thu, 24 Jun 2021 15:25:41 GMT
- Title: Pre-training transformer-based framework on large-scale pediatric claims data for downstream population-specific tasks
- Authors: Xianlong Zeng, Simon Lin, and Chang Liu
- Abstract summary: This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset.
Effective knowledge transfer is then completed through a task-aware fine-tuning stage.
We conducted experiments on a real-world claims dataset with more than one million patient records.
- Score: 3.1580072841682734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The adoption of electronic health records (EHR) has become universal during
the past decade, enabling in-depth data-driven research. By learning from this large
amount of healthcare data, various models have been built to predict future events for
different medical tasks, such as automatic diagnosis and heart-attack prediction.
Although EHR data are abundant, the population that satisfies the criteria of a given
population-specific task is scarce, making it challenging to train data-hungry deep
learning models. This study presents the Claim Pre-Training (Claim-PT) framework, a
generic pre-training model that first trains on the entire pediatric claims dataset,
followed by discriminative fine-tuning on each population-specific task. The semantic
meaning of medical events is captured in the pre-training stage, and effective knowledge
transfer is completed through the task-aware fine-tuning stage. The fine-tuning process
requires only minimal parameter modification and no change to the model architecture,
which mitigates the data-scarcity issue and helps train deep learning models adequately
on small patient cohorts. We conducted experiments on a real-world claims dataset with
more than one million patient records. Experimental results on two downstream tasks
demonstrated the effectiveness of our method: the general, task-agnostic pre-training
framework outperformed tailored task-specific models, achieving more than 10% higher
performance than the baselines. In addition, the framework showed strong
generalizability, transferring learned knowledge from one institution to another and
paving the way for future pre-training of healthcare models across institutions.
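The abstract describes a two-stage recipe: generic pre-training on the full pediatric claims dataset, then task-aware fine-tuning that changes only a small number of parameters and leaves the architecture intact. A minimal PyTorch sketch of that recipe is given below; the masked-code pre-training objective, module names, and dimensions are assumptions for illustration, since the abstract does not specify the paper's exact architecture.

```python
# Minimal sketch of a two-stage "pre-train, then fine-tune" pipeline for
# claims data. The code vocabulary, model sizes, and the masked-code
# pre-training objective are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn

VOCAB_SIZE = 20000   # number of distinct medical codes (assumed)
MASK_ID = 1          # reserved id for the [MASK] token (assumed)
PAD_ID = 0

class ClaimsEncoder(nn.Module):
    """Transformer encoder over a patient's sequence of medical codes."""
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model, padding_idx=PAD_ID)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, codes):                     # codes: (batch, seq_len)
        return self.encoder(self.embed(codes))    # (batch, seq_len, d_model)

class PretrainHead(nn.Module):
    """Predict the original code at masked positions (BERT-style objective)."""
    def __init__(self, d_model=128):
        super().__init__()
        self.out = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, h):
        return self.out(h)                        # (batch, seq_len, vocab)

class TaskHead(nn.Module):
    """Small task-specific classifier added at fine-tuning time."""
    def __init__(self, d_model=128, n_classes=2):
        super().__init__()
        self.out = nn.Linear(d_model, n_classes)

    def forward(self, h):
        return self.out(h.mean(dim=1))            # pool over the sequence

# Stage 1: pre-train the encoder on the full claims dataset.
encoder, mlm_head = ClaimsEncoder(), PretrainHead()
codes = torch.randint(2, VOCAB_SIZE, (8, 32))     # toy batch of code sequences
masked = codes.clone()
mask = torch.rand(codes.shape) < 0.15             # mask ~15% of positions
masked[mask] = MASK_ID
logits = mlm_head(encoder(masked))
loss = nn.functional.cross_entropy(logits[mask], codes[mask])
loss.backward()                                   # one illustrative update step

# Stage 2: fine-tune on a small population-specific cohort with a new head,
# reusing the pre-trained encoder without changing its architecture.
task_head = TaskHead()
labels = torch.randint(0, 2, (8,))
task_loss = nn.functional.cross_entropy(task_head(encoder(codes)), labels)
task_loss.backward()
```

Keeping the encoder fixed in structure and adding only a small task head at fine-tuning time mirrors the "minimal parameter modification without changing the model architecture" point in the abstract.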
Related papers
- An Efficient Contrastive Unimodal Pretraining Method for EHR Time Series Data [35.943089444017666]
We propose an efficient method of contrastive pretraining tailored for long clinical timeseries data.
Our model demonstrates the ability to impute missing measurements, providing clinicians with deeper insights into patient conditions.
arXiv Detail & Related papers (2024-10-11T19:05:25Z)
- Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data (see the sketch after this list).
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z)
- A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care [15.64030213048907]
We propose two clinical prediction tasks, outcome-specific length-of-stay prediction and early mortality prediction, for COVID-19 patients in intensive care units.
The two tasks are adapted from the naive length-of-stay and mortality prediction tasks to accommodate the clinical practice for COVID-19 patients.
We propose fair, detailed, open-source data-preprocessing pipelines and evaluate 17 state-of-the-art predictive models on two tasks.
arXiv Detail & Related papers (2022-09-16T09:09:15Z)
- Unsupervised pre-training of graph transformers on patient population graphs [48.02011627390706]
We propose a graph-transformer-based network to handle heterogeneous clinical data.
We show the benefit of our pre-training method in a self-supervised and a transfer learning setting.
arXiv Detail & Related papers (2022-07-21T16:59:09Z)
- Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z)
- Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation [7.2666838978096875]
Existing approaches typically train models in a patient-specific fashion due to the highly personalized characteristics of epileptic signals.
A patient-specific model can then be obtained with the help of the distilled knowledge and additional personalized data (see the distillation sketch after this list).
Five state-of-the-art seizure prediction methods are trained on the CHB-MIT sEEG database with our proposed scheme.
arXiv Detail & Related papers (2022-02-25T10:30:29Z)
- BERT WEAVER: Using WEight AVERaging to enable lifelong learning for transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model.
We show that applying WEAVER in a sequential manner results in word embedding distributions similar to a combined training on all data at once (a weight-averaging sketch appears after this list).
arXiv Detail & Related papers (2022-02-21T10:34:41Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, the noise added for differential privacy makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
arXiv Detail & Related papers (2022-02-08T10:03:24Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification (a prototype-computation sketch appears after this list).
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction [12.669003066030697]
We propose Med-BERT, which adapts the BERT framework to pre-train contextualized embedding models on structured diagnosis data from an EHR dataset of 28,490,650 patients.
Med-BERT substantially improves prediction accuracy, boosting the area under receiver operating characteristics curve (AUC) by 2.02-7.12%.
In particular, pre-trained Med-BERT substantially improves the performance of tasks with very small fine-tuning training sets (300-500 samples), boosting the AUC by more than 20%, or to the level achieved with a training set 10 times larger.
arXiv Detail & Related papers (2020-05-22T05:07:17Z)
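The textual data augmentation entry above fine-tunes GPT-2 to synthesize labeled clinical notes. A minimal sketch of that idea with the Hugging Face transformers library follows; the label-as-prefix prompt format and the toy examples are assumptions, not the paper's exact recipe.

```python
# Minimal sketch of label-conditioned clinical-note augmentation with GPT-2
# (Hugging Face transformers). The "<label> ||| <note>" prompt format and the
# toy corpus are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Toy labeled notes: a "readmitted" / "not readmitted" prefix conditions generation.
corpus = [
    "readmitted ||| Patient discharged with CHF, returned within 30 days.",
    "not readmitted ||| Routine appendectomy, uneventful recovery.",
]

# One illustrative fine-tuning step (causal LM loss on the labeled text).
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
batch = tokenizer(corpus, return_tensors="pt", padding=True)
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
optimizer.step()

# Generate a synthetic note for a chosen label to augment the training data.
prompt = tokenizer("readmitted ||| ", return_tensors="pt")
generated = model.generate(**prompt, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```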
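The seizure-prediction entry above transfers knowledge from a patient-independent model to a patient-specific one via knowledge distillation. The sketch below shows a generic distillation step (soft teacher targets plus a hard-label loss); the models, temperature, and loss weighting are illustrative assumptions, not the paper's scheme.

```python
# Minimal sketch of distilling a patient-independent "teacher" predictor into a
# patient-specific "student" with a soft-label KL term plus a hard-label loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(64, 2)   # stands in for a model trained on many patients
student = nn.Linear(64, 2)   # to be adapted with one patient's personalized data

x = torch.randn(16, 64)                      # toy EEG feature windows
y = torch.randint(0, 2, (16,))               # toy pre-ictal / inter-ictal labels
T = 2.0                                      # softening temperature (assumed)

with torch.no_grad():
    soft_targets = F.softmax(teacher(x) / T, dim=1)

student_logits = student(x)
distill = F.kl_div(F.log_softmax(student_logits / T, dim=1), soft_targets,
                   reduction="batchmean") * (T * T)
hard = F.cross_entropy(student_logits, y)
loss = 0.5 * distill + 0.5 * hard            # one illustrative training step
loss.backward()
```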
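The BERT WEAVER entry describes infusing old knowledge into a newly trained model by averaging weights. A generic parameter-averaging sketch for two checkpoints of the same architecture is shown below; the 50/50 mixing ratio is an assumption, not the paper's exact procedure.

```python
# Minimal sketch of weight averaging between an old and a newly fine-tuned
# model of identical architecture. The 50/50 mixing ratio is an assumption.
import torch.nn as nn

def average_weights(old_model: nn.Module, new_model: nn.Module, alpha: float = 0.5) -> nn.Module:
    """Set new_model's parameters to alpha*old + (1 - alpha)*new and return it."""
    old_state = old_model.state_dict()
    new_state = new_model.state_dict()
    merged = {k: alpha * old_state[k] + (1.0 - alpha) * new_state[k] for k in new_state}
    new_model.load_state_dict(merged)
    return new_model

# Usage with toy models standing in for two checkpoints of the same network.
old_model = nn.Linear(16, 2)
new_model = nn.Linear(16, 2)
merged_model = average_weights(old_model, new_model, alpha=0.5)
```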
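The Select-ProtoNet entry builds on Prototypical Networks. The core rule, class prototypes as mean embeddings of support examples and nearest-prototype classification of queries, is sketched below with an assumed toy embedding network and toy dimensions.

```python
# Minimal sketch of the Prototypical Network classification rule.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 16))

def prototypes(support_x, support_y, n_classes):
    """Mean embedding of the support examples of each class."""
    z = embed(support_x)                                   # (n_support, 16)
    return torch.stack([z[support_y == c].mean(dim=0) for c in range(n_classes)])

def classify(query_x, protos):
    """Assign each query to the class with the nearest (Euclidean) prototype."""
    z = embed(query_x)                                     # (n_query, 16)
    dists = torch.cdist(z, protos)                         # (n_query, n_classes)
    return dists.argmin(dim=1)

# Toy 2-way episode: 4 support examples per class, 3 queries.
support_x = torch.randn(8, 10)
support_y = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
query_x = torch.randn(3, 10)
pred = classify(query_x, prototypes(support_x, support_y, n_classes=2))
print(pred)
```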
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.