GenHPF: General Healthcare Predictive Framework with Multi-task
Multi-source Learning
- URL: http://arxiv.org/abs/2207.09858v3
- Date: Wed, 15 Nov 2023 11:47:19 GMT
- Title: GenHPF: General Healthcare Predictive Framework with Multi-task
Multi-source Learning
- Authors: Kyunghoon Hur, Jungwoo Oh, Junu Kim, Jiyoun Kim, Min Jae Lee, Eunbyeol
Cho, Seong-Eun Moon, Young-Hak Kim, Louis Atallah, Edward Choi
- Abstract summary: General Healthcare Predictive Framework (GenHPF) is applicable to any EHR with minimal preprocessing for multiple prediction tasks.
Our framework significantly outperforms baseline models that utilize domain knowledge in multi-source learning.
- Score: 9.406539794019581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the remarkable progress in the development of predictive models for
healthcare, applying these algorithms on a large scale has been challenging.
Algorithms trained on a particular task, based on specific data formats
available in a set of medical records, tend to not generalize well to other
tasks or databases in which the data fields may differ. To address this
challenge, we propose General Healthcare Predictive Framework (GenHPF), which
is applicable to any EHR with minimal preprocessing for multiple prediction
tasks. GenHPF resolves heterogeneity in medical codes and schemas by converting
EHRs into a hierarchical textual representation while incorporating as many
features as possible. To evaluate the efficacy of GenHPF, we conduct multi-task
learning experiments with single-source and multi-source settings, on three
publicly available EHR datasets with different schemas for 12 clinically
meaningful prediction tasks. Our framework significantly outperforms baseline
models that utilize domain knowledge in multi-source learning, improving
average AUROC by 1.2%P in pooled learning and 2.6%P in transfer learning while
also showing comparable results when trained on a single EHR dataset.
Furthermore, we demonstrate that self-supervised pretraining using multi-source
datasets is effective when combined with GenHPF, resulting in a 0.6%P AUROC
improvement compared to models without pretraining. By eliminating the need for
preprocessing and feature engineering, we believe that this work offers a solid
framework for multi-task and multi-source learning that can be leveraged to
speed up the scaling and usage of predictive algorithms in healthcare.
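Below is a minimal illustrative sketch of the core conversion described in the abstract: flattening heterogeneous EHR rows into a hierarchical textual representation (patient -> events -> "column value" tokens) so that no dataset-specific code mapping or feature engineering is needed. The helper names (`event_to_text`, `patient_to_hierarchy`) and the toy rows are hypothetical, not the authors' actual implementation.

```python
# Sketch only: hierarchical textual representation of EHR events, as described
# in the GenHPF abstract. Function names and example rows are illustrative.
from typing import Dict, List, Any

def event_to_text(table_name: str, row: Dict[str, Any]) -> str:
    """Flatten one EHR event (a row from any table/schema) into plain text."""
    parts = [table_name]
    for column, value in row.items():
        if value is None:
            continue  # keep as many features as possible, but skip empty fields
        parts.append(f"{column} {value}")
    return " ".join(parts)

def patient_to_hierarchy(events: List[Dict[str, Any]]) -> List[str]:
    """A patient's stay becomes an ordered list of textual events."""
    return [event_to_text(e["table"], e["row"]) for e in events]

# Toy example with two differently shaped source tables (hypothetical values):
events = [
    {"table": "labevents", "row": {"itemid": 50912, "valuenum": 1.2, "valueuom": "mg/dL"}},
    {"table": "prescriptions", "row": {"drug": "heparin", "dose_val_rx": 5000, "dose_unit_rx": "units"}},
]
print(patient_to_hierarchy(events))
# -> ['labevents itemid 50912 valuenum 1.2 valueuom mg/dL',
#     'prescriptions drug heparin dose_val_rx 5000 dose_unit_rx units']
```

In a framework of this kind, a text encoder would embed each event string and a higher-level encoder would aggregate the event sequence per patient, which is what makes the same pipeline applicable across EHR databases with different schemas.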
Related papers
- MEDS-Tab: Automated tabularization and baseline methods for MEDS datasets [2.8209943093430443]
This work is powered by complementary advances in core data standardization through the MEDS framework.
We dramatically simplify and accelerate the process of scalably featurizing irregularly sampled time-series data.
This system will greatly enhance the reliability, scalability, and ease of development of powerful ML solutions for health problems across diverse datasets and clinical settings.
arXiv Detail & Related papers (2024-10-31T20:36:37Z) - Automated Multi-Task Learning for Joint Disease Prediction on Electronic Health Records [4.159498069487535]
We propose an automated approach named AutoDP, which can search for the optimal configuration of task grouping and architectures simultaneously.
It achieves significant performance improvements over both hand-crafted and automated state-of-the-art methods while maintaining a feasible search cost.
arXiv Detail & Related papers (2024-03-06T22:32:48Z) - Predicting Infant Brain Connectivity with Federated Multi-Trajectory
GNNs using Scarce Data [54.55126643084341]
Existing deep learning solutions suffer from three major limitations.
We introduce FedGmTE-Net++, a federated graph-based multi-trajectory evolution network.
Using the power of federation, we aggregate the locally learned models from diverse hospitals with limited datasets.
arXiv Detail & Related papers (2024-01-01T10:20:01Z) - Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z) - Time Associated Meta Learning for Clinical Prediction [78.99422473394029]
We propose a novel time associated meta learning (TAML) method to make effective predictions at multiple future time points.
To address the sparsity problem after task splitting, TAML employs a temporal information sharing strategy to augment the number of positive samples.
We demonstrate the effectiveness of TAML on multiple clinical datasets, where it consistently outperforms a range of strong baselines.
arXiv Detail & Related papers (2023-03-05T03:54:54Z) - Active learning using adaptable task-based prioritisation [7.0002224852386545]
We develop a controller neural network that measures the priority of images in a sequence of batches, as in batch-mode active learning.
A meta-reinforcement learning algorithm is proposed with multiple MDPs, such that the pre-trained controller can be adapted to a new MDP.
We show that the proposed adaptable prioritisation metric yields converging segmentation accuracy for the novel class of kidney.
arXiv Detail & Related papers (2022-12-03T22:37:38Z) - Unsupervised Pre-Training on Patient Population Graphs for Patient-Level
Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph-based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z) - Pre-training transformer-based framework on large-scale pediatric claims
data for downstream population-specific tasks [3.1580072841682734]
This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset.
The effective knowledge transfer is completed through the task-aware fine-tuning stage.
We conducted experiments on a real-world claims dataset with more than one million patient records.
arXiv Detail & Related papers (2021-06-24T15:25:41Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To reduce the training burden introduced by the enlarged dataset, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.