Related papers: Machine learning algorithms to predict stroke in China based on causal inference of time series analysis

Machine learning algorithms to predict stroke in China based on causal inference of time series analysis

URL: http://arxiv.org/abs/2503.14512v1
Date: Mon, 10 Mar 2025 14:45:43 GMT
Title: Machine learning algorithms to predict stroke in China based on causal inference of time series analysis
Authors: Qizhi Zheng, Ayang Zhao, Xinzhu Wang, Yanhong Bai, Zikun Wang, Xiuying Wang, Xianzhang Zeng, Guanghui Dong,
Abstract summary: This study proposes a stroke risk prediction method that combines dynamic causal inference with machine learning models.<n>The research results indicate that dynamic causal inference features have important value in predicting stroke risk.
Score: 1.7646715816998508
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Participants: This study employed a combination of Vector Autoregression (VAR) model and Graph Neural Networks (GNN) to systematically construct dynamic causal inference. Multiple classic classification algorithms were compared, including Random Forest, Logistic Regression, XGBoost, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Gradient Boosting, and Multi Layer Perceptron (MLP). The SMOTE algorithm was used to undersample a small number of samples and employed Stratified K-fold Cross Validation. Results: This study included a total of 11,789 participants, including 6,334 females (53.73%) and 5,455 males (46.27%), with an average age of 65 years. Introduction of dynamic causal inference features has significantly improved the performance of almost all models. The area under the ROC curve of each model ranged from 0.78 to 0.83, indicating significant difference (P < 0.01). Among all the models, the Gradient Boosting model demonstrated the highest performance and stability. Model explanation and feature importance analysis generated model interpretation that illustrated significant contributors associated with risks of stroke. Conclusions and Relevance: This study proposes a stroke risk prediction method that combines dynamic causal inference with machine learning models, significantly improving prediction accuracy and revealing key health factors that affect stroke. The research results indicate that dynamic causal inference features have important value in predicting stroke risk, especially in capturing the impact of changes in health status over time on stroke risk. By further optimizing the model and introducing more variables, this study provides theoretical basis and practical guidance for future stroke prevention and intervention strategies.

Related papers

Impact, Causation and Prediction of Socio-Academic and Economic Factors in Exam-centric Student Evaluation Measures using Machine Learning and Causal Analysis [1.124958340749622]
socio-academic and economic factors influencing students' performance is crucial for effective educational interventions.<n>This study employs several machine learning techniques and causal analysis to predict and elucidate the impacts of these factors on academic performance.
arXiv Detail & Related papers (2025-05-22T17:41:05Z)
Advancing Tabular Stroke Modelling Through a Novel Hybrid Architecture and Feature-Selection Synergy [0.9999629695552196]
The present work develops and validates a data-driven and interpretable machine-learning framework designed to predict strokes.<n>Ten routinely gathered demographic, lifestyle, and clinical variables were sourced from a public cohort of 4,981 records.<n>The proposed model achieved an accuracy rate of 97.2% and an F1-score of 97.15%, indicating a significant enhancement compared to the leading individual model.
arXiv Detail & Related papers (2025-05-18T21:46:45Z)
Machine Learning-Based Model for Postoperative Stroke Prediction in Coronary Artery Disease [0.0]
This study aims to develop and evaluate a sophisticated machine learning prediction model to assess postoperative stroke risk.<n>The dataset has 70% training and 30% test. Numerical values were normalized, whereas categorical variables were one-hot encoded.<n> Logistic Regression, XGBoost, SVM, and CatBoost were employed for predictive modeling, and SHAP analysis assessed stroke risk for each variable.
arXiv Detail & Related papers (2025-03-15T02:50:32Z)
Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM) which can be viewed as a gradient boosting algorithm combining score matching.<n>We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.<n>Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
A comparative study on feature selection for a risk prediction model for colorectal cancer [0.0]
This work is focused on colorectal cancer, assessing several feature ranking algorithms in terms of performance for a set of risk prediction models. A visual approach proposed in this work allows to see that the Neural Network-based wrapper ranking is the most unstable while the Random Forest is the most stable.
arXiv Detail & Related papers (2024-02-07T22:14:14Z)
The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation. We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare. Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
Learning Clinical Concepts for Predicting Risk of Progression to Severe COVID-19 [17.781861866125023]
Using data from a major healthcare provider, we develop survival models predicting severe COVID-19 progression. We develop two sets of high-performance risk scores: (i) an unconstrained model built from all available features; and (ii) a pipeline that learns a small set of clinical concepts before training a risk predictor.
arXiv Detail & Related papers (2022-08-28T02:59:35Z)
Improving Prediction of Cognitive Performance using Deep Neural Networks in Sparse Data [2.867517731896504]
We used data from an observational, cohort study, Midlife in the United States (MIDUS) to model executive function and episodic memory measures. Deep neural network (DNN) models consistently ranked highest in all of the cognitive performance prediction tasks.
arXiv Detail & Related papers (2021-12-28T22:23:08Z)
A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage. This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model. We introduce two unique positive sampling strategies specifically tailored for EHR data. Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
The Consequences of the Framing of Machine Learning Risk Prediction Models: Evaluation of Sepsis in General Wards [0.0]
We evaluate how framing affects model performance and model learning in four different approaches. We analysed structured secondary healthcare data from 221,283 citizens from four Danish municipalities.
arXiv Detail & Related papers (2021-01-26T14:00:05Z)
From Sound Representation to Model Robustness [82.21746840893658]
We investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. Averaged over various experiments on three environmental sound datasets, we found the ResNet-18 model outperforms other deep learning architectures.
arXiv Detail & Related papers (2020-07-27T17:30:49Z)
Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
empirical optimization is central to modern machine learning, but its role in its success is still unclear. We show that it commonly arises in parameters of discrete multiplicative noise due to variance. A detailed analysis is conducted in which we describe on key factors, including recent step size, and data, all exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.