Comparative Analysis of Stroke Prediction Models Using Machine Learning
- URL: http://arxiv.org/abs/2505.09812v1
- Date: Wed, 14 May 2025 21:27:19 GMT
- Title: Comparative Analysis of Stroke Prediction Models Using Machine Learning
- Authors: Anastasija Tashkova, Stefan Eftimov, Bojan Ristov, Slobodan Kalajdziski
- Abstract summary: Stroke remains one of the most critical global health challenges, ranking as the second leading cause of death and the third leading cause of disability worldwide. This study explores the effectiveness of machine learning algorithms in predicting stroke risk using demographic, clinical, and lifestyle data from the Stroke Prediction Dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stroke remains one of the most critical global health challenges, ranking as the second leading cause of death and the third leading cause of disability worldwide. This study explores the effectiveness of machine learning algorithms in predicting stroke risk using demographic, clinical, and lifestyle data from the Stroke Prediction Dataset. By addressing key methodological challenges such as class imbalance and missing data, we evaluated the performance of multiple models, including Logistic Regression, Random Forest, and XGBoost. Our results demonstrate that while these models achieve high accuracy, sensitivity remains a limiting factor for real-world clinical applications. In addition, we identify the most influential predictive features and propose strategies to improve machine learning-based stroke prediction. These findings contribute to the development of more reliable and interpretable models for the early assessment of stroke risk.
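The abstract's point that high accuracy can coexist with poor sensitivity follows directly from class imbalance. The sketch below is a minimal illustration, not the authors' evaluation code: the cohort size, stroke prevalence, and classifier predictions are hypothetical numbers chosen only to show the effect, and the helper names `confusion_counts`, `accuracy`, and `sensitivity` are assumptions introduced here.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = stroke."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all patients classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(y_true, y_pred):
    """Fraction of true stroke cases that the model flags (recall)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical cohort: 1000 patients, 50 strokes (illustrative, not from the paper).
y_true = [1] * 50 + [0] * 950

# Hypothetical classifier: finds 10 of the 50 strokes and raises 5 false alarms.
y_model = [1] * 10 + [0] * 40 + [1] * 5 + [0] * 945

# Trivial baseline that never predicts a stroke.
y_never = [0] * 1000

print("model    accuracy:", accuracy(y_true, y_model),
      "sensitivity:", sensitivity(y_true, y_model))
print("baseline accuracy:", accuracy(y_true, y_never),
      "sensitivity:", sensitivity(y_true, y_never))
```

In this made-up cohort the classifier is right for more than 95% of patients while missing 40 of the 50 strokes, and the never-predict baseline scores almost as well on accuracy. This is why sensitivity, rather than accuracy alone, is the more informative metric for a screening task with rare positives.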
Related papers
- Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models [70.64969663547703]
AdaCVD is an adaptable CVD risk prediction framework built on large language models extensively fine-tuned on over half a million participants from the UK Biobank. It addresses key clinical challenges across three dimensions: it flexibly incorporates comprehensive yet variable patient information; it seamlessly integrates both structured data and unstructured text; and it rapidly adapts to new patient populations using minimal additional data.
arXiv Detail & Related papers (2025-05-30T14:42:02Z)
- Feature-Enhanced Machine Learning for All-Cause Mortality Prediction in Healthcare Data [0.0]
This study evaluates machine learning models for all-cause in-hospital mortality prediction using the MIMIC-III database. We extracted key features such as vital signs (e.g., heart rate, blood pressure), laboratory results, and demographic information. The Random Forest model achieved the highest performance with an AUC of 0.94, significantly outperforming other machine learning and deep learning approaches.
arXiv Detail & Related papers (2025-03-27T08:04:42Z)
- Machine Learning Applications in Medical Prognostics: A Comprehensive Review [0.0]
Machine learning (ML) has revolutionized medical prognostics by integrating advanced algorithms with clinical data.
RF models demonstrate robust performance in handling high-dimensional data.
CNNs have shown exceptional accuracy in cancer detection.
LSTM networks excel in analyzing temporal data, providing accurate predictions of clinical deterioration.
arXiv Detail & Related papers (2024-08-05T09:41:34Z)
- Towards a Transportable Causal Network Model Based on Observational Healthcare Data [1.333879175460266]
We propose a novel approach that combines selection diagrams, missingness graphs, causal discovery and prior knowledge into a single graphical model.
We learn this model from data comprising two different cohorts of patients.
The resulting causal network model is validated by expert clinicians in terms of risk assessment, accuracy and explainability.
arXiv Detail & Related papers (2023-11-13T13:23:31Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- A predictive analytics approach for stroke prediction using machine learning and neural networks [4.984181486695979]
This paper systematically analyzes the various factors in electronic health records for effective stroke prediction.
Age, heart disease, average glucose level, and hypertension are the most important factors for detecting stroke in patients.
A perceptron neural network using these four attributes provides the highest accuracy rate and lowest miss rate.
arXiv Detail & Related papers (2022-03-01T14:45:15Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Patient-independent Epileptic Seizure Prediction using Deep Learning Models [39.19336481493405]
The purpose of a seizure prediction system is to successfully identify the pre-ictal brain stage, which occurs before a seizure event.
Patient-independent seizure prediction models are designed to offer accurate performance across multiple subjects within a dataset.
We propose two patient-independent deep learning architectures with different learning strategies that can learn a global function utilizing data from multiple subjects.
arXiv Detail & Related papers (2020-11-18T23:13:48Z)
- An Optimal Control Approach to Learning in SIDARTHE Epidemic model [67.22168759751541]
We propose a general approach for learning time-variant parameters of dynamic compartmental models from epidemic data.
We forecast the epidemic evolution in Italy and France.
arXiv Detail & Related papers (2020-10-28T10:58:59Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Neuro-symbolic Neurodegenerative Disease Modeling as Probabilistic Programmed Deep Kernels [93.58854458951431]
We present a probabilistic programmed deep kernel learning approach to personalized, predictive modeling of neurodegenerative diseases.
Our analysis considers a spectrum of neural and symbolic machine learning approaches.
We run evaluations on the problem of Alzheimer's disease prediction, yielding results that surpass deep learning.
arXiv Detail & Related papers (2020-09-16T15:16:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.