GenHPF: General Healthcare Predictive Framework with Multi-task
Multi-source Learning
- URL: http://arxiv.org/abs/2207.09858v3
- Date: Wed, 15 Nov 2023 11:47:19 GMT
- Authors: Kyunghoon Hur, Jungwoo Oh, Junu Kim, Jiyoun Kim, Min Jae Lee, Eunbyeol
Cho, Seong-Eun Moon, Young-Hak Kim, Louis Atallah, Edward Choi
- Abstract summary: General Healthcare Predictive Framework (GenHPF) is applicable to any EHR with minimal preprocessing for multiple prediction tasks.
Our framework significantly outperforms baseline models that utilize domain knowledge in multi-source learning.
- Score: 9.406539794019581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the remarkable progress in the development of predictive models for
healthcare, applying these algorithms on a large scale has been challenging.
Algorithms trained on a particular task, based on specific data formats
available in a set of medical records, tend to not generalize well to other
tasks or databases in which the data fields may differ. To address this
challenge, we propose General Healthcare Predictive Framework (GenHPF), which
is applicable to any EHR with minimal preprocessing for multiple prediction
tasks. GenHPF resolves heterogeneity in medical codes and schemas by converting
EHRs into a hierarchical textual representation while incorporating as many
features as possible. To evaluate the efficacy of GenHPF, we conduct multi-task
learning experiments with single-source and multi-source settings, on three
publicly available EHR datasets with different schemas for 12 clinically
meaningful prediction tasks. Our framework significantly outperforms baseline
models that utilize domain knowledge in multi-source learning, improving
average AUROC by 1.2%P in pooled learning and 2.6%P in transfer learning while
also showing comparable results when trained on a single EHR dataset.
Furthermore, we demonstrate that self-supervised pretraining using multi-source
datasets is effective when combined with GenHPF, resulting in a 0.6%P AUROC
improvement compared to models without pretraining. By eliminating the need for
preprocessing and feature engineering, we believe that this work offers a solid
framework for multi-task and multi-source learning that can be leveraged to
speed up the scaling and usage of predictive algorithms in healthcare.
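The abstract describes converting EHRs into a hierarchical textual representation but does not spell out the procedure. The following is a minimal sketch of one way such a conversion might look, not the authors' implementation: each event row is flattened into a string made of its table name followed by "column value" pairs, and a patient becomes an ordered list of such strings. The function names, table names, and column names are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical helper: flatten one EHR event row into a text string.
# The layout "table column value column value ..." is an assumption,
# not the format confirmed by the paper.
def event_to_text(table: str, row: Dict[str, Any]) -> str:
    parts = [table]
    for column, value in row.items():
        if value is None:
            # Skip missing fields rather than imputing them.
            continue
        parts.append(f"{column} {value}")
    return " ".join(parts)

# Hypothetical helper: a patient is an ordered list of (table, row) events,
# giving a two-level hierarchy of patient -> events -> text.
def patient_to_text(events: List[Tuple[str, Dict[str, Any]]]) -> List[str]:
    return [event_to_text(table, row) for table, row in events]

# Illustrative events with made-up table and column names.
events = [
    ("labevents", {"itemid": "Glucose", "valuenum": 95, "valueuom": "mg/dL"}),
    ("prescriptions", {"drug": "Aspirin", "dose_val_rx": 81, "route": None}),
]
texts = patient_to_text(events)
for line in texts:
    print(line)
```

Because the column names travel with the values, the same code applies unchanged to a database with a different schema, which is the property the abstract attributes to a text-based representation.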
Related papers
- Multi-Modal One-Shot Federated Ensemble Learning for Medical Data with Vision Large Language Model [27.299068494473016]
We introduce FedMME, an innovative one-shot multi-modal federated ensemble learning framework.
FedMME capitalizes on vision large language models to produce textual reports from medical images.
It surpasses existing one-shot federated learning approaches by more than 17.5% in accuracy on the RSNA dataset.
arXiv Detail & Related papers (2025-01-06T08:36:28Z)
- Automated Multi-Task Learning for Joint Disease Prediction on Electronic Health Records [4.159498069487535]
We propose an automated approach named AutoDP, which can search for the optimal configuration of task grouping and architectures simultaneously.
It achieves significant performance improvements over both hand-crafted and automated state-of-the-art methods while maintaining a feasible search cost.
arXiv Detail & Related papers (2024-03-06T22:32:48Z)
- Predicting Infant Brain Connectivity with Federated Multi-Trajectory GNNs using Scarce Data [54.55126643084341]
Existing deep learning solutions suffer from three major limitations.
We introduce FedGmTE-Net++, a federated graph-based multi-trajectory evolution network.
Using the power of federation, we aggregate local learnings among diverse hospitals with limited datasets.
arXiv Detail & Related papers (2024-01-01T10:20:01Z)
- Bayesian Meta-Learning for Improving Generalizability of Health Prediction Models With Similar Causal Mechanisms [14.4598538769316]
We introduce a novel Bayesian meta-learning approach that aims to address challenges of negative transfer during shared learning and poor generalizability to new patients.
Our main contribution is in modeling similarity between causal mechanisms of the tasks, for (1) mitigating negative transfer during training and (2) fine-tuning that pools information from tasks that are expected to aid generalizability.
arXiv Detail & Related papers (2023-10-19T09:03:41Z)
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
- Time Associated Meta Learning for Clinical Prediction [78.99422473394029]
We propose a novel time associated meta learning (TAML) method to make effective predictions at multiple future time points.
To address the sparsity problem after task splitting, TAML employs a temporal information sharing strategy to augment the number of positive samples.
We demonstrate the effectiveness of TAML on multiple clinical datasets, where it consistently outperforms a range of strong baselines.
arXiv Detail & Related papers (2023-03-05T03:54:54Z)
- Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph-based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z)
- Pre-training transformer-based framework on large-scale pediatric claims data for downstream population-specific tasks [3.1580072841682734]
This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset.
The effective knowledge transfer is completed through the task-aware fine-tuning stage.
We conducted experiments on a real-world claims dataset with more than one million patient records.
arXiv Detail & Related papers (2021-06-24T15:25:41Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.