Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques
- URL: http://arxiv.org/abs/2504.00485v1
- Date: Tue, 01 Apr 2025 07:16:49 GMT
- Title: Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques
- Authors: Mahade Hasan, Farhana Yasmin, Md. Mehedi Hassan, Xue Yu, Soniya Yeasmin, Herat Joshi, Sheikh Mohammed Shariful Islam
- Abstract summary: Heart disease remains a leading cause of morbidity and mortality worldwide. We have developed a novel voting system with feature selection techniques to advance heart disease classification. XGBoost demonstrated exceptional performance, achieving 99% accuracy, precision, and F1-score, 98% recall, and 100% ROC AUC.
- Score: 1.2302586529345994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Heart disease remains a leading cause of mortality and morbidity worldwide, necessitating the development of accurate and reliable predictive models to facilitate early detection and intervention. Although state-of-the-art work has explored various machine learning approaches for predicting heart disease, these approaches have not achieved remarkable accuracy. In response to this need, we applied nine machine learning algorithms (XGBoost, logistic regression, decision tree, random forest, k-nearest neighbors (KNN), support vector machine (SVM), Gaussian naïve Bayes (NB Gaussian), adaptive boosting, and linear regression) to predict heart disease based on a range of physiological indicators. Our approach involved feature selection techniques to identify the most relevant predictors, aimed at refining the models to enhance both performance and interpretability. The models were trained with grid search hyperparameter tuning and cross-validation to minimize overfitting. Additionally, we have developed a novel voting system with feature selection techniques to advance heart disease classification. Furthermore, we have evaluated the models using key performance metrics including accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (ROC AUC). Among the models, XGBoost demonstrated exceptional performance, achieving 99% accuracy, precision, and F1-score, 98% recall, and 100% ROC AUC. This study offers a promising approach to early heart disease diagnosis and preventive healthcare.
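As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score follow from the counts of a binary confusion matrix. It is not the authors' code: the function name `classification_metrics` and the toy labels are hypothetical, and ROC AUC is left out because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels, not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two.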
Related papers
- Congenital Heart Disease Classification Using Phonocardiograms: A Scalable Screening Tool for Diverse Environments [34.10187730651477]
Congenital heart disease (CHD) is a critical condition that demands early detection. This study presents a deep learning model designed to detect CHD using phonocardiogram (PCG) signals. We evaluated our model on several datasets, including the primary dataset from Bangladesh.
arXiv Detail & Related papers (2025-03-28T05:47:44Z) - Feature selection strategies for optimized heart disease diagnosis using ML and DL models [4.863856267150165]
This study evaluates the impact of feature selection techniques on the predictive performance of various machine learning (ML) and deep learning (DL) models. Eleven ML/DL models were assessed using metrics such as precision, recall, AUC score, F1-score, and accuracy. Results indicate that MI outperformed other methods, particularly for advanced models like neural networks.
arXiv Detail & Related papers (2025-03-20T09:59:01Z) - Efficient Precision Control in Object Detection Models for Enhanced and Reliable Ovarian Follicle Counting [37.9434503914985]
A major challenge for machine learning is to control the precision of predictions while enabling a high recall. We use a multiple testing procedure that gives an overperforming way to solve the standard Precision-Recall trade-off. As it is model-agnostic, this contextual selection procedure paves the way to the development of a strategy that can improve the performance of any model without the need of retraining it.
arXiv Detail & Related papers (2025-01-23T19:04:47Z) - Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates.
Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood.
This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
arXiv Detail & Related papers (2024-09-03T07:57:08Z) - Improving Machine Learning Based Sepsis Diagnosis Using Heart Rate Variability [0.0]
This study aims to use heart rate variability (HRV) features to develop an effective predictive model for sepsis detection.
A neural network model is trained on the HRV features, achieving an F1 score of 0.805, a precision of 0.851, and a recall of 0.763.
arXiv Detail & Related papers (2024-08-01T01:47:29Z) - A data balancing approach towards design of an expert system for Heart Disease Prediction [0.9895793818721335]
Heart disease is a serious global health issue that claims millions of lives every year.
We employed five machine learning methods in this paper: Decision Tree (DT), Random Forest (RF), Linear Discriminant Analysis, Extra TreeBoost, and AdaBoost.
The accuracy of the Random Forest and Decision Tree model was 99.83%.
arXiv Detail & Related papers (2024-07-26T08:56:13Z) - Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z) - Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI [0.0]
We evaluate and compare the classification accuracy, precision, recall, and F-1 scores of five different machine learning methods.
XGBoost achieved the best model accuracy, which is 97%.
arXiv Detail & Related papers (2024-04-06T17:23:21Z) - Improving Diffusion Models for ECG Imputation with an Augmented Template Prior [43.6099225257178]
Noisy and poor-quality recordings are a major issue for signals collected using mobile health systems.
Recent studies have explored the imputation of missing values in ECG with probabilistic time-series models.
We present a template-guided denoising diffusion probabilistic model (DDPM), PulseDiff, which is conditioned on an informative prior for a range of health conditions.
arXiv Detail & Related papers (2023-10-24T11:34:15Z) - A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to 19% over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z) - Neuro-symbolic Neurodegenerative Disease Modeling as Probabilistic Programmed Deep Kernels [93.58854458951431]
We present a probabilistic programmed deep kernel learning approach to personalized, predictive modeling of neurodegenerative diseases.
Our analysis considers a spectrum of neural and symbolic machine learning approaches.
We run evaluations on the problem of Alzheimer's disease prediction, yielding results that surpass deep learning.
arXiv Detail & Related papers (2020-09-16T15:16:03Z)