GenHPF: General Healthcare Predictive Framework with Multi-task
Multi-source Learning
- URL: http://arxiv.org/abs/2207.09858v3
- Date: Wed, 15 Nov 2023 11:47:19 GMT
- Title: GenHPF: General Healthcare Predictive Framework with Multi-task
Multi-source Learning
- Authors: Kyunghoon Hur, Jungwoo Oh, Junu Kim, Jiyoun Kim, Min Jae Lee, Eunbyeol
Cho, Seong-Eun Moon, Young-Hak Kim, Louis Atallah, Edward Choi
- Abstract summary: General Healthcare Predictive Framework (GenHPF) is applicable to any EHR with minimal preprocessing for multiple prediction tasks.
Our framework significantly outperforms baseline models that utilize domain knowledge in multi-source learning.
- Score: 9.406539794019581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the remarkable progress in the development of predictive models for
healthcare, applying these algorithms on a large scale has been challenging.
Algorithms trained on a particular task, based on specific data formats
available in a set of medical records, tend to not generalize well to other
tasks or databases in which the data fields may differ. To address this
challenge, we propose General Healthcare Predictive Framework (GenHPF), which
is applicable to any EHR with minimal preprocessing for multiple prediction
tasks. GenHPF resolves heterogeneity in medical codes and schemas by converting
EHRs into a hierarchical textual representation while incorporating as many
features as possible. To evaluate the efficacy of GenHPF, we conduct multi-task
learning experiments with single-source and multi-source settings, on three
publicly available EHR datasets with different schemas for 12 clinically
meaningful prediction tasks. Our framework significantly outperforms baseline
models that utilize domain knowledge in multi-source learning, improving
average AUROC by 1.2%P in pooled learning and 2.6%P in transfer learning while
also showing comparable results when trained on a single EHR dataset.
Furthermore, we demonstrate that self-supervised pretraining using multi-source
datasets is effective when combined with GenHPF, resulting in a 0.6%P AUROC
improvement compared to models without pretraining. By eliminating the need for
preprocessing and feature engineering, we believe that this work offers a solid
framework for multi-task and multi-source learning that can be leveraged to
speed up the scaling and usage of predictive algorithms in healthcare.
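Below is a minimal illustrative sketch of the core conversion described in the abstract: flattening heterogeneous EHR rows into a hierarchical textual representation (patient -> events -> "column value" tokens) so that no dataset-specific code mapping or feature engineering is needed. The helper names (`event_to_text`, `patient_to_hierarchy`) and the toy rows are hypothetical, not the authors' actual implementation.

```python
# Sketch only: hierarchical textual representation of EHR events, as described
# in the GenHPF abstract. Function names and example rows are illustrative.
from typing import Dict, List, Any

def event_to_text(table_name: str, row: Dict[str, Any]) -> str:
    """Flatten one EHR event (a row from any table/schema) into plain text."""
    parts = [table_name]
    for column, value in row.items():
        if value is None:
            continue  # keep as many features as possible, but skip empty fields
        parts.append(f"{column} {value}")
    return " ".join(parts)

def patient_to_hierarchy(events: List[Dict[str, Any]]) -> List[str]:
    """A patient's stay becomes an ordered list of textual events."""
    return [event_to_text(e["table"], e["row"]) for e in events]

# Toy example with two differently shaped source tables (hypothetical values):
events = [
    {"table": "labevents", "row": {"itemid": 50912, "valuenum": 1.2, "valueuom": "mg/dL"}},
    {"table": "prescriptions", "row": {"drug": "heparin", "dose_val_rx": 5000, "dose_unit_rx": "units"}},
]
print(patient_to_hierarchy(events))
# -> ['labevents itemid 50912 valuenum 1.2 valueuom mg/dL',
#     'prescriptions drug heparin dose_val_rx 5000 dose_unit_rx units']
```

In a framework of this kind, a text encoder would embed each event string and a higher-level encoder would aggregate the event sequence per patient, which is what makes the same pipeline applicable across EHR databases with different schemas.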
Related papers
- MEDS-Tab: Automated tabularization and baseline methods for MEDS datasets [2.8209943093430443]
This work is powered by complementary advances in core data standardization through the MEDS framework.
We dramatically simplify and accelerate the process of scalably featurizing irregularly sampled time-series data.
This system will greatly enhance the reliability, scalability, and ease of development of powerful ML solutions for health problems across diverse datasets and clinical settings.
arXiv Detail & Related papers (2024-10-31T20:36:37Z) - Automated Multi-Task Learning for Joint Disease Prediction on Electronic Health Records [4.159498069487535]
We propose an automated approach named AutoDP, which can search for the optimal configuration of task grouping and architectures simultaneously.
It achieves significant performance improvements over both hand-crafted and automated state-of-the-art methods while maintaining a feasible search cost.
arXiv Detail & Related papers (2024-03-06T22:32:48Z) - Predicting Infant Brain Connectivity with Federated Multi-Trajectory
GNNs using Scarce Data [54.55126643084341]
Existing deep learning solutions suffer from three major limitations.
We introduce FedGmTE-Net++, a federated graph-based multi-trajectory evolution network.
Using the power of federation, we aggregate the locally learned models from diverse hospitals with limited datasets.
arXiv Detail & Related papers (2024-01-01T10:20:01Z) - Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z) - Time Associated Meta Learning for Clinical Prediction [78.99422473394029]
We propose a novel time associated meta learning (TAML) method to make effective predictions at multiple future time points.
To address the sparsity problem after task splitting, TAML employs a temporal information sharing strategy to augment the number of positive samples.
We demonstrate the effectiveness of TAML on multiple clinical datasets, where it consistently outperforms a range of strong baselines.
arXiv Detail & Related papers (2023-03-05T03:54:54Z) - Active learning using adaptable task-based prioritisation [7.0002224852386545]
We develop a controller neural network that measures the priority of images in a sequence of batches, as in batch-mode active learning.
A meta-reinforcement learning algorithm is proposed with multiple MDPs, such that the pre-trained controller can be adapted to a new MDP.
We show that the proposed adaptable prioritisation metric yields converging segmentation accuracy for the novel class of kidney.
arXiv Detail & Related papers (2022-12-03T22:37:38Z) - Unsupervised Pre-Training on Patient Population Graphs for Patient-Level
Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph-based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z) - Pre-training transformer-based framework on large-scale pediatric claims
data for downstream population-specific tasks [3.1580072841682734]
This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset.
The effective knowledge transfer is completed through the task-aware fine-tuning stage.
We conducted experiments on a real-world claims dataset with more than one million patient records.
arXiv Detail & Related papers (2021-06-24T15:25:41Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To reduce the training burden introduced by the enlarged dataset, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.