Related papers: Contrastive Learning Improves Critical Event Prediction in COVID-19 Patients

Contrastive Learning Improves Critical Event Prediction in COVID-19 Patients

URL: http://arxiv.org/abs/2101.04013v1
Date: Mon, 11 Jan 2021 16:41:13 GMT
Title: Contrastive Learning Improves Critical Event Prediction in COVID-19 Patients
Authors: Tingyi Wanyan, Hossein Honarvar, Suraj K. Jaladanki, Chengxi Zang, Nidhi Naik, Sulaiman Somani, Jessica K. De Freitas, Ishan Paranjpe, Akhil Vaid, Riccardo Miotto, Girish N. Nadkarni, Marinka Zitnik, ArifulAzad, Fei Wang, Ying Ding, Benjamin S. Glicksberg
Abstract summary: We show that contrastive loss (CL) improves the performance of cross-entropy loss (CEL) for imbalanced EHR data. This study has been approved by the Institutional Review Board at the Icahn School of Medicine at Mount Sinai.
Score: 19.419685256069666
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine Learning (ML) models typically require large-scale, balanced training data to be robust, generalizable, and effective in the context of healthcare. This has been a major issue for developing ML models for the coronavirus-disease 2019 (COVID-19) pandemic where data is highly imbalanced, particularly within electronic health records (EHR) research. Conventional approaches in ML use cross-entropy loss (CEL) that often suffers from poor margin classification. For the first time, we show that contrastive loss (CL) improves the performance of CEL especially for imbalanced EHR data and the related COVID-19 analyses. This study has been approved by the Institutional Review Board at the Icahn School of Medicine at Mount Sinai. We use EHR data from five hospitals within the Mount Sinai Health System (MSHS) to predict mortality, intubation, and intensive care unit (ICU) transfer in hospitalized COVID-19 patients over 24 and 48 hour time windows. We train two sequential architectures (RNN and RETAIN) using two loss functions (CEL and CL). Models are tested on full sample data set which contain all available data and restricted data set to emulate higher class imbalance.CL models consistently outperform CEL models with the restricted data set on these tasks with differences ranging from 0.04 to 0.15 for AUPRC and 0.05 to 0.1 for AUROC. For the restricted sample, only the CL model maintains proper clustering and is able to identify important features, such as pulse oximetry. CL outperforms CEL in instances of severe class imbalance, on three EHR outcomes with respect to three performance metrics: predictive power, clustering, and feature importance. We believe that the developed CL framework can be expanded and used for EHR ML work in general.

Related papers

Uncertainty-Aware Deep Learning for Automated Skin Cancer Classification: A Comprehensive Evaluation [11.342661330086921]
We present a comprehensive evaluation of deep learning-based skin lesion classification using transfer learning and uncertainty quantification (UQ) on the HAM10000 dataset.<n>Results show that CLIP-based vision transformers, particularly LAION CLIP ViT-H/14 with SVM, deliver the highest classification performance.<n>This study highlights the importance of integrating UQ into DL-based medical diagnosis to enhance both performance and trustworthiness in real-world clinical applications.
arXiv Detail & Related papers (2025-06-12T02:29:16Z)
Prediction of Lung Metastasis from Hepatocellular Carcinoma using the SEER Database [0.9055332067000195]
Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality. predictive models for lung metastasis inHCC remain limited in scope and clinical applicability. We develop and validate an end-to-end machine learning pipeline using data from the Surveillance, Epidemiology, and End Results (SEER) database.
arXiv Detail & Related papers (2025-01-20T20:06:31Z)
Enhancing Glucose Level Prediction of ICU Patients through Irregular Time-Series Analysis and Integrated Representation [4.101915841246237]
We develop a novel learning-based model to forecast the next level, classifying it into hypoglycemia, hyperglycemia, or euglycemia. This study focuses on predicting blood glucose levels in ICU patients, but MITST can easily be extended to other critical event prediction tasks.
arXiv Detail & Related papers (2024-11-03T03:03:11Z)
FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data [52.55123685248105]
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality. This paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD.
arXiv Detail & Related papers (2024-10-28T02:24:01Z)
CEL: A Continual Learning Model for Disease Outbreak Prediction by Leveraging Domain Adaptation via Elastic Weight Consolidation [4.693707128262634]
This study introduces a novel CEL model for continual learning by leveraging domain adaptation via Elastic Weight Consolidation (EWC) CEL's robustness and reliability are underscored by its minimal 65% forgetting rate and 18% higher memory stability compared to existing benchmark studies.
arXiv Detail & Related papers (2024-01-17T03:26:04Z)
Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline Algorithm: Application to the ICU Length of Stay Prediction [65.268245109828]
This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the ICU length of stay. The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction. The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.
arXiv Detail & Related papers (2023-12-31T16:01:48Z)
Mixed-Integer Projections for Automated Data Correction of EMRs Improve Predictions of Sepsis among Hospitalized Patients [7.639610349097473]
We introduce an innovative projections-based method that seamlessly integrates clinical expertise as domain constraints. We measure the distance of corrected data from the constraints defining a healthy range of patient data, resulting in a unique predictive metric we term as "trust-scores" We show an AUROC of 0.865 and a precision of 0.922, that surpasses conventional ML models without such projections.
arXiv Detail & Related papers (2023-08-21T15:14:49Z)
Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD) The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data [89.79617468457393]
Training models with imbalance rate (class density discrepancy) may lead to suboptimal prediction. We propose a framework for training models for this imbalance issue. We demonstrate our model's improved performance in real-world medical datasets.
arXiv Detail & Related papers (2022-07-23T00:39:53Z)
Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model. We introduce two unique positive sampling strategies specifically tailored for EHR data. Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
Predicting Hyperkalemia in the ICU and Evaluation of Generalizability and Interpretability [5.9854349801427285]
Hyperkalemia is a potentially life-threatening condition that can lead to fatal arrhythmias. We developed predictive models to identify intensive care unit (ICU) patients at risk of developing hyperkalemia. Our models were able to predict hyperkalemia with an AUC of (i) 0.79, 0.81, 0.81 and (ii) 0.81, 0.85, 0.85 for LR, RF, and XGBoost respectively.
arXiv Detail & Related papers (2021-01-16T12:35:27Z)
Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
COVID-19 pandemics has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure. This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients. Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.