Unsupervised pre-training of graph transformers on patient population graphs
- URL: http://arxiv.org/abs/2207.10603v2
- Date: Mon, 17 Jul 2023 10:00:59 GMT
- Title: Unsupervised pre-training of graph transformers on patient population graphs
- Authors: Chantal Pellegrini, Nassir Navab, Anees Kazi
- Abstract summary: We propose a graph-transformer-based network to handle heterogeneous clinical data.
We show the benefit of our pre-training method in a self-supervised and a transfer learning setting.
- Score: 48.02011627390706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-training has shown success in different areas of machine learning, such
as Computer Vision, Natural Language Processing (NLP), and medical imaging.
However, it has not been fully explored for clinical data analysis. Although an
immense amount of clinical records is collected, data and labels can still be
scarce, for example for data collected in small hospitals or for rare diseases.
In such scenarios, pre-training on a larger set of unlabelled clinical data
could improve performance. In this paper, we propose novel unsupervised
pre-training techniques for heterogeneous, multi-modal clinical data, inspired
by masked language modeling (MLM) and aimed at patient outcome prediction,
leveraging graph deep learning over population graphs. To this end, we further propose a
graph-transformer-based network, designed to handle heterogeneous clinical
data. By combining masking-based pre-training with a transformer-based network,
we translate its success in other domains to
heterogeneous clinical data. We show the benefit of our pre-training method in
a self-supervised and a transfer learning setting, using three medical
datasets: TADPOLE, MIMIC-III, and a sepsis prediction dataset. We find that our
proposed pre-training methods help in modeling the data at a patient and
population level and improve performance in different fine-tuning tasks on all
datasets.
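To make the recipe concrete, here is a minimal PyTorch sketch of MLM-style pre-training on a patient population graph. It is an illustration under simplifying assumptions, not the authors' exact architecture: the population graph is treated as fully connected, so a standard transformer encoder attends across all patients; features are assumed continuous; and the masking ratio, layer sizes, and the `PopulationGraphTransformer` name are hypothetical.

```python
# Hedged sketch: masking-based pre-training over a (fully connected)
# patient population graph, in the spirit of the abstract above.
import torch
import torch.nn as nn

class PopulationGraphTransformer(nn.Module):
    """Toy graph transformer: self-attention mixes information across patients."""
    def __init__(self, num_features, d_model=64, nhead=4):
        super().__init__()
        self.embed = nn.Linear(num_features, d_model)     # per-patient embedding
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
            num_layers=2,
        )
        self.reconstruct = nn.Linear(d_model, num_features)  # MLM-style head

    def forward(self, x):                  # x: (batch, patients, features)
        h = self.encoder(self.embed(x))    # attention across the patient axis
        return self.reconstruct(h)

def pretrain_step(model, x, optimizer, mask_ratio=0.15):
    """One MLM-inspired step: mask random feature entries, reconstruct them."""
    mask = torch.rand_like(x) < mask_ratio
    x_masked = x.masked_fill(mask, 0.0)        # masked entries are zeroed out
    pred = model(x_masked)
    loss = ((pred - x) ** 2)[mask].mean()      # loss only on masked entries
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage: 128 patients with 32 continuous clinical features each.
model = PopulationGraphTransformer(num_features=32)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(1, 128, 32)
print(pretrain_step(model, x, opt))
```

For fine-tuning, the reconstruction head would be swapped for a patient-outcome prediction head, matching the self-supervised setting described in the abstract.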
Related papers
- An Efficient Contrastive Unimodal Pretraining Method for EHR Time Series Data [35.943089444017666]
We propose an efficient method of contrastive pretraining tailored for long clinical time-series data (a generic sketch of this contrastive recipe appears after this list).
Our model demonstrates the ability to impute missing measurements, providing clinicians with deeper insight into patient conditions.
arXiv Detail & Related papers (2024-10-11T19:05:25Z)
- Multimodal Pretraining of Medical Time Series and Notes [45.89025874396911]
Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data.
We propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes.
In downstream tasks, including in-hospital mortality prediction and phenotyping, our model outperforms baselines in settings where only a fraction of the data is labeled.
arXiv Detail & Related papers (2023-12-11T21:53:40Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularizer for a further pre-training stage (a minimal sketch of this recipe appears after this list).
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph-based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z)
- A Real Use Case of Semi-Supervised Learning for Mammogram Classification in a Local Clinic of Costa Rica [0.5541644538483946]
Training a deep learning model requires a considerable number of labeled images.
A number of publicly available datasets have been built with data from different hospitals and clinics.
The use of MixMatch, a semi-supervised deep learning approach, is proposed and evaluated as a way to leverage unlabeled data.
arXiv Detail & Related papers (2021-07-24T22:26:50Z)
- Pre-training transformer-based framework on large-scale pediatric claims data for downstream population-specific tasks [3.1580072841682734]
This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset.
Effective knowledge transfer is then completed through a task-aware fine-tuning stage.
We conducted experiments on a real-world claims dataset with more than one million patient records.
arXiv Detail & Related papers (2021-06-24T15:25:41Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model, which can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification (its basic classification rule is sketched after this list).
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To keep training on the resulting enlarged dataset tractable, we apply a dataset distillation strategy that compresses it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
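The contrastive pretraining entry above follows a generic recipe that can be sketched briefly: two randomly corrupted views of each patient's time series should embed close together and far from other patients in the batch (an InfoNCE objective). This is a hedged, SimCLR-style sketch, not the paper's exact method; the encoder, the masking augmentation, and all sizes are illustrative assumptions.

```python
# Hedged sketch: InfoNCE-style contrastive pretraining for clinical time series.
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two views of the same N patients."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature     # (N, N) cosine similarities
    targets = torch.arange(z1.size(0))     # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

def random_mask_view(x, p=0.2):
    """Augmentation: zero out a random fraction of time steps."""
    keep = (torch.rand(x.shape[:2]) > p).unsqueeze(-1)   # (N, T, 1)
    return x * keep

# Toy usage: stand-in encoder; 16 patients, 48 time steps, 12 features.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(48 * 12, 64))
x = torch.randn(16, 48, 12)
loss = info_nce_loss(encoder(random_mask_view(x)), encoder(random_mask_view(x)))
print(loss.item())
```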
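The self-distillation entry can likewise be sketched: a frozen copy of the pre-trained model acts as teacher during further pre-training, and the student is regularized to stay close to the teacher's outputs. This is a minimal sketch of the general recipe, assuming a stand-in reconstruction objective; `lam`, the loss choices, and the toy model are illustrative, not the paper's configuration.

```python
# Hedged sketch: self-distillation as a regularizer during further pre-training.
import copy
import torch
import torch.nn as nn

def further_pretrain_with_self_distillation(model, batches, lam=0.1, lr=1e-4):
    teacher = copy.deepcopy(model).eval()        # frozen snapshot of the model
    for p in teacher.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for x in batches:
        pred = model(x)                          # stand-in pre-training objective:
        task_loss = ((pred - x) ** 2).mean()     # reconstruct the input
        with torch.no_grad():
            t_pred = teacher(x)
        # Regularizer: stay close to the teacher (zero at step 0, then grows
        # as the student drifts from the frozen snapshot).
        distill_loss = ((pred - t_pred) ** 2).mean()
        loss = task_loss + lam * distill_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

# Toy usage with a stand-in autoencoder-style model.
toy = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
data = [torch.randn(8, 16) for _ in range(4)]
further_pretrain_with_self_distillation(toy, data)
```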
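Finally, the Prototypical Network underlying Select-ProtoNet has a classification rule simple enough to state in a few lines: each class prototype is the mean embedding of its support examples, and queries are scored by negative squared distance to each prototype. The sketch below shows this standard formulation only, not Select-ProtoNet's full sample-selection pipeline.

```python
# Hedged sketch: the standard Prototypical Network classification rule.
import torch

def prototypical_logits(support, support_labels, query, num_classes):
    """support: (S, D) embeddings, query: (Q, D) embeddings."""
    prototypes = torch.stack([
        support[support_labels == c].mean(dim=0) for c in range(num_classes)
    ])                                            # (C, D): one prototype per class
    # Negative squared Euclidean distance serves as the logit.
    return -torch.cdist(query, prototypes) ** 2   # (Q, C)

# Toy usage: a 2-way, 3-shot episode with 8-dim embeddings.
support = torch.randn(6, 8)
labels = torch.tensor([0, 0, 0, 1, 1, 1])
query = torch.randn(4, 8)
print(prototypical_logits(support, labels, query, num_classes=2).argmax(dim=1))
```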
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.