Enhancing Mortality Prediction in Heart Failure Patients: Exploring
Preprocessing Methods for Imbalanced Clinical Datasets
- URL: http://arxiv.org/abs/2310.00457v1
- Date: Sat, 30 Sep 2023 18:31:15 GMT
- Title: Enhancing Mortality Prediction in Heart Failure Patients: Exploring
Preprocessing Methods for Imbalanced Clinical Datasets
- Authors: Hanif Kia, Mansour Vali, Hadi Sabahi
- Abstract summary: Heart failure (HF) is a critical condition in which the accurate prediction of mortality plays a vital role in guiding patient management decisions.
We present a comprehensive preprocessing framework including scaling, outliers processing and resampling.
By leveraging appropriate preprocessing techniques and Machine Learning (ML) algorithms, we aim to improve mortality prediction performance for HF patients.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Heart failure (HF) is a critical condition in which the accurate prediction
of mortality plays a vital role in guiding patient management decisions.
However, clinical datasets used for mortality prediction in HF often suffer
from an imbalanced distribution of classes, posing significant challenges. In
this paper, we explore preprocessing methods for enhancing one-month mortality
prediction in HF patients. We present a comprehensive preprocessing framework
including scaling, outliers processing and resampling as key techniques. We
also employed an aware encoding approach to effectively handle missing values
in clinical datasets. Our study utilizes a comprehensive dataset from the
Persian Registry Of cardio Vascular disease (PROVE) with a significant class
imbalance. By leveraging appropriate preprocessing techniques and Machine
Learning (ML) algorithms, we aim to improve mortality prediction performance
for HF patients. The results reveal an average enhancement of approximately
3.6% in F1 score and 2.7% in MCC for tree-based models, specifically Random
Forest (RF) and XGBoost (XGB). This demonstrates the efficiency of our
preprocessing approach in effectively handling Imbalanced Clinical Datasets
(ICD). Our findings hold promise in guiding healthcare professionals to make
informed decisions and improve patient outcomes in HF management.
Related papers
- Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates.
Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood.
This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
arXiv Detail & Related papers (2024-09-03T07:57:08Z) - SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing [67.8991481023825]
Sepsis is the leading cause of in-hospital mortality in the USA.
Existing predictive models are usually trained on high-quality data with few missing information.
For the potential high-risk patients with low confidence due to limited observations, we propose a robust active sensing algorithm.
arXiv Detail & Related papers (2024-07-24T04:47:36Z) - TCKIN: A Novel Integrated Network Model for Predicting Mortality Risk in Sepsis Patients [0.0]
Sepsis poses a major global health threat, accounting for millions of deaths annually and significant economic costs.
This study introduces the Time-Constant KAN Integrated Network(TCKIN), an innovative model that enhances the accuracy of sepsis mortality risk predictions.
arXiv Detail & Related papers (2024-07-09T05:37:50Z) - Explainable LightGBM Approach for Predicting Myocardial Infarction Mortality [0.0]
Myocardial Infarction is a main cause of mortality globally, and accurate risk prediction is crucial for improving patient outcomes.
In this article, we investigate the impact of the data preprocessing task and compare three ensembles boosted tree methods to predict the risk of mortality.
Our approach achieved a superior performance when compared to other existing machine learning approaches, with an F1-score of 91,2% and an accuracy of 91,8% for LightGBM without data preprocessing.
arXiv Detail & Related papers (2024-04-23T13:35:22Z) - Voice-Driven Mortality Prediction in Hospitalized Heart Failure Patients: A Machine Learning Approach Enhanced with Diagnostic Biomarkers [0.0]
We demonstrate a powerful and effective Machine Learning model for predicting mortality rates in heart failure patients.
By integrating voice biomarkers into routine patient monitoring, this strategy has the potential to improve patient outcomes.
In this study, a Machine Learning system is trained to predict patients' 5-year mortality rates using their speech as input.
arXiv Detail & Related papers (2024-02-21T13:50:46Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - Learning to diagnose cirrhosis from radiological and histological labels
with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z) - Density-Aware Personalized Training for Risk Prediction in Imbalanced
Medical Data [89.79617468457393]
Training models with imbalance rate (class density discrepancy) may lead to suboptimal prediction.
We propose a framework for training models for this imbalance issue.
We demonstrate our model's improved performance in real-world medical datasets.
arXiv Detail & Related papers (2022-07-23T00:39:53Z) - Survival Prediction of Heart Failure Patients using Stacked Ensemble
Machine Learning Algorithm [0.0]
Heart failure is one of the major health hazard issues of our time and is a leading cause of death worldwide.
Data mining is the process of converting massive volumes of raw data created by the healthcare institutions into meaningful information.
Our study shows that only certain attributes collected from the patients are imperative to successfully predict the surviving possibility post heart failure.
arXiv Detail & Related papers (2021-08-30T16:42:27Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.