CardioTabNet: A Novel Hybrid Transformer Model for Heart Disease Prediction using Tabular Medical Data
- URL: http://arxiv.org/abs/2503.17664v1
- Date: Sat, 22 Mar 2025 06:17:08 GMT
- Title: CardioTabNet: A Novel Hybrid Transformer Model for Heart Disease Prediction using Tabular Medical Data
- Authors: Md. Shaheenur Islam Sumon, Md. Sakib Bin Islam, Md. Sohanur Rahman, Md. Sakib Abrar Hossain, Amith Khandakar, Anwarul Hasan, M Murugappan, Muhammad E. H. Chowdhury
- Abstract summary: Our study utilizes the open-source dataset for heart disease prediction with 1190 instances and 11 features. Ten machine-learning models were used to predict heart disease using selected features. The top downstream model (a hyper-tuned ExtraTree) achieved an average accuracy rate of 94.1% and an average Area Under Curve (AUC) of 95.0%.
- Score: 0.46581008529871043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Early detection and prediction of cardiovascular diseases are crucial for reducing the severe morbidity and mortality associated with these conditions worldwide. Transformers use multi-headed self-attention, a mechanism widely used in natural language processing (NLP), to model feature interactions in feature spaces. However, the relationships between features within biological systems remain ambiguous in these spaces. We address this issue with CardioTabNet, which exploits the strength of the tab transformer to extract a feature space that carries a strong understanding of clinical cardiovascular data and its feature ranking. As a result, downstream classical models achieved strong performance. Our study uses an open-source heart disease prediction dataset with 1190 instances and 11 features. The 11 features are divided into numerical (age, resting blood pressure, cholesterol, maximum heart rate, old peak, weight, and fasting blood sugar) and categorical (resting ECG, exercise angina, and ST slope). The tab transformer was used to extract important features, which were then ranked with the random forest (RF) feature ranking algorithm. Ten machine-learning models were used to predict heart disease from the selected features. After extracting high-quality features, the top downstream model (a hyper-tuned ExtraTree classifier) achieved an average accuracy of 94.1% and an average Area Under Curve (AUC) of 95.0%. Furthermore, a nomogram analysis was conducted to evaluate the model's effectiveness in cardiovascular risk assessment. A benchmarking study against state-of-the-art models was conducted to evaluate our transformer-driven framework.
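The abstract reports an average accuracy and an average AUC but does not say how the averages were formed. The sketch below assumes they are plain means over evaluation folds and shows how the two metrics could be computed from per-fold labels and predicted probabilities. The fold data, the 0.5 decision threshold, and the helper names `accuracy` and `roc_auc` are illustrative assumptions, not taken from the paper.

```python
from statistics import mean

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-fold outputs: (true labels, predicted probabilities).
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
]

# Threshold the probabilities at 0.5 (an assumption) to get class labels.
fold_acc = [accuracy(y, [int(s >= 0.5) for s in p]) for y, p in folds]
fold_auc = [roc_auc(y, p) for y, p in folds]

avg_acc = mean(fold_acc)
avg_auc = mean(fold_auc)
print(f"average accuracy = {avg_acc:.3f}, average AUC = {avg_auc:.3f}")
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank positives against negatives, which is why the two numbers can differ for the same model.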
Related papers
- Acoustic Index: A Novel AI-Driven Parameter for Cardiac Disease Risk Stratification Using Echocardiography [0.0]
We introduce the Acoustic Index, a novel AI-derived echocardiographic parameter designed to quantify cardiac dysfunction from standard ultrasound views. The model combines Extended Dynamic Mode Decomposition (EDMD) based on Koopman operator theory with a hybrid neural network that incorporates clinical metadata. In a prospective cohort of 736 patients, encompassing various cardiac pathologies and normal controls, the Acoustic Index achieved an area under the curve (AUC) of 0.89 in an independent test set. Cross-validation across five folds confirmed the robustness of the model, showing that both sensitivity and specificity exceeded 0.8 when evaluated on independent data.
arXiv Detail & Related papers (2025-07-17T21:27:28Z) - An Explainable AI-Enhanced Machine Learning Approach for Cardiovascular Disease Detection and Risk Assessment [0.0]
Heart disease remains a major global health concern. Traditional diagnostic methods fail to accurately identify and manage heart disease risks. Machine learning has the potential to significantly enhance the accuracy, efficiency, and speed of heart disease diagnosis.
arXiv Detail & Related papers (2025-07-15T10:38:38Z) - Comparative Analysis of CNN and Transformer Architectures with Heart Cycle Normalization for Automated Phonocardiogram Classification [0.44203325605537613]
Two specialized convolutional neural networks (CNNs) and two zero-shot universal audio transformers (BEATs) were evaluated. A custom heart cycle normalization method tailored to individual cardiac rhythms is introduced. The CNN model with fixed-length windowing achieves 79.5%, the CNN model with heart cycle normalization scores 75.4%, the BEATs transformer with fixed-length windowing achieves 65.7%, and the BEATs transformer with heart cycle normalization results in 70.1%.
arXiv Detail & Related papers (2025-07-08T13:17:26Z) - From Motion to Meaning: Biomechanics-Informed Neural Network for Explainable Cardiovascular Disease Identification [1.1142444517901016]
We utilize the energy strain formulation of Neo-Hookean material to model cardiac tissue deformations. We estimate the local strains within the moving heart and extract a detailed set of features used for cardiovascular disease classification.
arXiv Detail & Related papers (2025-07-08T08:43:05Z) - Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques [1.2302586529345994]
Heart disease remains a leading cause of morbidity and mortality worldwide.
We have developed a novel voting system with feature selection techniques to advance heart disease classification.
XGBoost demonstrated exceptional performance, achieving 99% accuracy, precision, F1-Score, 98% recall, and 100% ROC AUC.
arXiv Detail & Related papers (2025-04-01T07:16:49Z) - Machine Learning-Based Model for Postoperative Stroke Prediction in Coronary Artery Disease [0.0]
This study aims to develop and evaluate a sophisticated machine learning prediction model to assess postoperative stroke risk. The dataset was split into 70% training and 30% test. Numerical values were normalized, whereas categorical variables were one-hot encoded. Logistic Regression, XGBoost, SVM, and CatBoost were employed for predictive modeling, and SHAP analysis assessed stroke risk for each variable.
arXiv Detail & Related papers (2025-03-15T02:50:32Z) - Finetuning and Quantization of EEG-Based Foundational BioSignal Models on ECG and PPG Data for Blood Pressure Estimation [53.2981100111204]
Photoplethysmography and electrocardiography can potentially enable continuous blood pressure (BP) monitoring. Yet building accurate and robust machine learning (ML) models remains challenging due to variability in data quality and patient-specific factors. In this work, we investigate whether a model pre-trained on one modality can effectively be exploited to improve the accuracy of a different signal type. Our approach achieves near state-of-the-art accuracy for diastolic BP and surpasses by 1.5x the accuracy of prior works for systolic BP.
arXiv Detail & Related papers (2025-02-10T13:33:12Z) - Leveraging Cardiovascular Simulations for In-Vivo Prediction of Cardiac Biomarkers [43.17768785084301]
We train an amortized neural posterior estimator on a newly built large dataset of cardiac simulations. We incorporate elements modeling effects to better align simulated data with real-world measurements. The proposed framework can further integrate in-vivo data sources to refine its predictive capabilities on real-world data.
arXiv Detail & Related papers (2024-12-23T13:05:17Z) - Synthetic Time Series Data Generation for Healthcare Applications: A PCG Case Study [43.28613210217385]
We employ and compare three state-of-the-art generative models to generate PCG data. Our results demonstrate that the generated PCG data closely resembles the original datasets. In our future work, we plan to incorporate this method into a data augmentation pipeline to synthesize abnormal PCG signals with heart murmurs.
arXiv Detail & Related papers (2024-12-17T18:07:40Z) - Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates.
Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood.
This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
arXiv Detail & Related papers (2024-09-03T07:57:08Z) - Research on Early Warning Model of Cardiovascular Disease Based on Computer Deep Learning [5.761426161930679]
This project intends to study a cardiovascular disease risk early warning model based on one-dimensional convolutional neural networks.
The missing values of 13 physiological and symptom indicators such as patient age, blood glucose, cholesterol, and chest pain were filled and Z-score was standardized.
arXiv Detail & Related papers (2024-06-13T07:04:22Z) - Improving Diffusion Models for ECG Imputation with an Augmented Template Prior [43.6099225257178]
Noisy and poor-quality recordings are a major issue for signals collected using mobile health systems.
Recent studies have explored the imputation of missing values in ECG with probabilistic time-series models.
We present a template-guided denoising diffusion probabilistic model (DDPM), PulseDiff, which is conditioned on an informative prior for a range of health conditions.
arXiv Detail & Related papers (2023-10-24T11:34:15Z) - Identification of Ischemic Heart Disease by using machine learning technique based on parameters measuring Heart Rate Variability [50.591267188664666]
In this study, 18 non-invasive features (age, gender, left ventricular ejection fraction and 15 obtained from HRV) of 243 subjects were used to train and validate a series of artificial neural networks (ANNs).
The best result was obtained using 7 input parameters and 7 hidden nodes, with accuracies of 98.9% and 82% on the training and validation datasets, respectively.
arXiv Detail & Related papers (2020-10-29T19:14:41Z) - Multilabel 12-Lead Electrocardiogram Classification Using Gradient Boosting Tree Ensemble [64.29529357862955]
We build an algorithm using gradient boosted tree ensembles fitted on morphology and signal processing features to classify ECG diagnosis.
For each lead, we derive features from heart rate variability, PQRST template shape, and the full signal waveform.
We join the features of all 12 leads to fit an ensemble of gradient boosting decision trees to predict probabilities of ECG instances belonging to each class.
arXiv Detail & Related papers (2020-10-21T18:11:36Z) - Cardiac Cohort Classification based on Morphologic and Hemodynamic Parameters extracted from 4D PC-MRI Data [6.805476759441964]
We investigate the potential of morphological and hemodynamic characteristics, extracted from measured blood flow data in the aorta, for the classification of heart-healthy volunteers and patients with bicuspid aortic valve (BAV).
In our experiments, we use several feature selection methods and classification algorithms to train separate models for the healthy subgroups and BAV patients.
arXiv Detail & Related papers (2020-10-12T11:36:04Z) - Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes [64.21642241351857]
We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients.
We developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports.
We also developed a model for multi-organ, multi-disease classification of chest CT volumes.
arXiv Detail & Related papers (2020-02-12T00:59:23Z)