Explainable LightGBM Approach for Predicting Myocardial Infarction Mortality
- URL: http://arxiv.org/abs/2404.15029v1
- Date: Tue, 23 Apr 2024 13:35:22 GMT
- Title: Explainable LightGBM Approach for Predicting Myocardial Infarction Mortality
- Authors: Ana Letícia Garcez Vicente, Roseval Donisete Malaquias Junior, Roseli A. F. Romero
- Abstract summary: Myocardial infarction is a leading cause of mortality globally, and accurate risk prediction is crucial for improving patient outcomes.
In this article, we investigate the impact of data preprocessing and compare three ensemble boosted-tree methods for predicting the risk of mortality.
Our approach achieved superior performance compared to other existing machine learning approaches, with an F1-score of 91.2% and an accuracy of 91.8% for LightGBM without data preprocessing.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Myocardial infarction is a leading cause of mortality globally, and accurate risk prediction is crucial for improving patient outcomes. Machine learning techniques have shown promise in identifying high-risk patients and predicting outcomes. However, patient data often contain vast amounts of information and missing values, posing challenges for feature selection and imputation methods. In this article, we investigate the impact of data preprocessing and compare three ensemble boosted-tree methods for predicting the risk of mortality in patients with myocardial infarction. Further, we use the Tree Shapley Additive Explanations method to identify relationships among all the features for the performed predictions, leveraging the entirety of the available data in the analysis. Notably, our approach achieved superior performance compared to other existing machine learning approaches, with an F1-score of 91.2% and an accuracy of 91.8% for LightGBM without data preprocessing.
Related papers
- Deciphering Cardiac Destiny: Unveiling Future Risks Through Cutting-Edge Machine Learning Approaches [0.0]
This project aims to develop and assess predictive models for the timely identification of cardiac arrest incidents.
We employ machine learning algorithms such as XGBoost, Gradient Boosting, and Naive Bayes, alongside a deep learning (DL) approach with Recurrent Neural Networks (RNNs).
Rigorous experimentation and validation revealed the superior performance of the RNN model.
arXiv Detail & Related papers (2024-09-03T19:18:16Z)
- Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates.
Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood.
This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
arXiv Detail & Related papers (2024-09-03T07:57:08Z)
- Data-Driven Machine Learning Approaches for Predicting In-Hospital Sepsis Mortality [0.0]
This research aims to develop an interpretable and accurate ML model to help clinical professionals predict in-hospital mortality.
We analyzed ICU patient records from the MIMIC-III database based on specific criteria and extracted relevant data.
The Random Forest model was the most effective in predicting sepsis-related in-hospital mortality.
arXiv Detail & Related papers (2024-08-03T00:28:25Z)
- SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing [67.8991481023825]
Sepsis is the leading cause of in-hospital mortality in the USA.
Existing predictive models are usually trained on high-quality data with little missing information.
For the potential high-risk patients with low confidence due to limited observations, we propose a robust active sensing algorithm.
arXiv Detail & Related papers (2024-07-24T04:47:36Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Enhancing Mortality Prediction in Heart Failure Patients: Exploring Preprocessing Methods for Imbalanced Clinical Datasets [0.0]
Heart failure (HF) is a critical condition in which the accurate prediction of mortality plays a vital role in guiding patient management decisions.
We present a comprehensive preprocessing framework that includes scaling, outlier processing, and resampling.
By leveraging appropriate preprocessing techniques and Machine Learning (ML) algorithms, we aim to improve mortality prediction performance for HF patients.
arXiv Detail & Related papers (2023-09-30T18:31:15Z)
- Survival Prediction of Heart Failure Patients using Stacked Ensemble Machine Learning Algorithm [0.0]
Heart failure is one of the major health hazard issues of our time and is a leading cause of death worldwide.
Data mining is the process of converting massive volumes of raw data created by the healthcare institutions into meaningful information.
Our study shows that only certain attributes collected from patients are needed to successfully predict the chance of survival after heart failure.
arXiv Detail & Related papers (2021-08-30T16:42:27Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data [56.53715632642495]
Clinical trials face uncertain outcomes due to issues with efficacy, safety, or problems with patient recruitment.
In this paper, we propose Hierarchical INteraction Network (HINT) for more general, clinical trial outcome predictions.
arXiv Detail & Related papers (2021-02-08T15:09:07Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)