Feature-Enhanced Machine Learning for All-Cause Mortality Prediction in Healthcare Data
- URL: http://arxiv.org/abs/2503.21241v1
- Date: Thu, 27 Mar 2025 08:04:42 GMT
- Title: Feature-Enhanced Machine Learning for All-Cause Mortality Prediction in Healthcare Data
- Authors: HyeYoung Lee, Pavel Tsoi,
- Abstract summary: This study evaluates machine learning models for all-cause in-hospital mortality prediction using the MIMIC-III database.<n>We extracted key features such as vital signs (e.g., heart rate, blood pressure), laboratory results and demographic information.<n>The Random Forest model achieved the highest performance with an AUC of 0.94, significantly outperforming other machine learning and deep learning approaches.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate patient mortality prediction enables effective risk stratification, leading to personalized treatment plans and improved patient outcomes. However, predicting mortality in healthcare remains a significant challenge, with existing studies often focusing on specific diseases or limited predictor sets. This study evaluates machine learning models for all-cause in-hospital mortality prediction using the MIMIC-III database, employing a comprehensive feature engineering approach. Guided by clinical expertise and literature, we extracted key features such as vital signs (e.g., heart rate, blood pressure), laboratory results (e.g., creatinine, glucose), and demographic information. The Random Forest model achieved the highest performance with an AUC of 0.94, significantly outperforming other machine learning and deep learning approaches. This demonstrates Random Forest's robustness in handling high-dimensional, noisy clinical data and its potential for developing effective clinical decision support tools. Our findings highlight the importance of careful feature engineering for accurate mortality prediction. We conclude by discussing implications for clinical adoption and propose future directions, including enhancing model robustness and tailoring prediction models for specific diseases.
Related papers
- Prediction of Lung Metastasis from Hepatocellular Carcinoma using the SEER Database [0.9055332067000195]
Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality.
predictive models for lung metastasis inHCC remain limited in scope and clinical applicability.
We develop and validate an end-to-end machine learning pipeline using data from the Surveillance, Epidemiology, and End Results (SEER) database.
arXiv Detail & Related papers (2025-01-20T20:06:31Z) - Predicting Mortality and Functional Status Scores of Traumatic Brain Injury Patients using Supervised Machine Learning [0.0]
Traumatic brain injury (TBI) presents a significant public health challenge, often resulting in mortality or lasting disability.
Predicting outcomes such as mortality and Functional Status Scale (FSS) scores can enhance treatment strategies and inform clinical decision-making.
This study applies supervised machine learning (ML) methods to predict mortality and FSS scores using a real-world dataset of 300 pediatric TBI patients.
arXiv Detail & Related papers (2024-10-27T00:44:45Z) - Advancements In Heart Disease Prediction: A Machine Learning Approach For Early Detection And Risk Assessment [0.0]
This paper comprehends, assess, and analyze the role, relevance, and efficiency of machine learning models in predicting heart disease risks using clinical data.
The Support Vector Machine (SVM) demonstrates the highest accuracy at 91.51%, confirming its superiority among the evaluated models in terms of predictive capability.
arXiv Detail & Related papers (2024-10-16T22:32:19Z) - Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z) - Data-Driven Machine Learning Approaches for Predicting In-Hospital Sepsis Mortality [0.0]
Sepsis is a severe condition responsible for many deaths in the United States and worldwide.<n>Previous studies employing machine learning faced limitations in feature selection and model interpretability.<n>This research aimed to develop an interpretable and accurate machine learning model to predict in-hospital sepsis mortality.
arXiv Detail & Related papers (2024-08-03T00:28:25Z) - Deep State-Space Generative Model For Correlated Time-to-Event Predictions [54.3637600983898]
We propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events.
Our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
arXiv Detail & Related papers (2024-07-28T02:42:36Z) - SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing [67.8991481023825]
Sepsis is the leading cause of in-hospital mortality in the USA.
Existing predictive models are usually trained on high-quality data with few missing information.
For the potential high-risk patients with low confidence due to limited observations, we propose a robust active sensing algorithm.
arXiv Detail & Related papers (2024-07-24T04:47:36Z) - Automatic prediction of mortality in patients with mental illness using
electronic health records [0.5957022371135096]
This paper addresses the persistent challenge of predicting mortality in patients with mental diagnoses.
Data from patients with mental disease diagnoses were extracted from the well-known clinical MIMIC-III data set.
Four machine learning algorithms were used, with results indicating that Random Forest and Support Vector Machine models outperformed others.
arXiv Detail & Related papers (2023-10-18T17:21:01Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD)
UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to $19%$ over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.