Curvature-based Feature Selection with Application in Classifying
Electronic Health Records
- URL: http://arxiv.org/abs/2101.03581v1
- Date: Sun, 10 Jan 2021 16:55:40 GMT
- Title: Curvature-based Feature Selection with Application in Classifying
Electronic Health Records
- Authors: Zheming Zuo, Jie Li, Noura Al Moubayed
- Abstract summary: We propose an efficient curvature-based feature selection method for supporting more precise diagnosis.
Our method achieves state-of-the-art performance on four benchmark healthcare data sets.
- Score: 13.427883408828642
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Electronic Health Records (EHRs) are widely used in healthcare facilities
nowadays. Because EHRs are inherently heterogeneous, unbalanced, incomplete, and
high-dimensional, it is challenging to employ machine
learning algorithms to analyse them for prediction and diagnostics within
the scope of precision medicine. Dimensionality reduction is an efficient data
preprocessing technique for the analysis of high dimensional data that reduces
the number of features while improving the performance of the data analysis,
e.g. classification. In this paper, we propose an efficient curvature-based
feature selection method for supporting more precise diagnosis. The proposed
method is a filter-based feature selection method, which directly utilises the
Menger Curvature for ranking all the attributes in the given data set. We
evaluate the performance of our method against conventional PCA and recent methods
including BPCM, GSAM, WCNN, BLS II, VIBES, 2L-MJFA, RFGA, and VAF. Our method
achieves state-of-the-art performance on four benchmark healthcare data sets
including CCRFDS, BCCDS, BTDS, and DRDDS with impressive 24.73% and 13.93%
improvements respectively on BTDS and CCRFDS, 7.97% improvement on BCCDS, and
3.63% improvement on DRDDS. Our CFS source code is publicly available at
https://github.com/zhemingzuo/CFS.
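The Menger curvature named in the abstract is a standard quantity: for three points it is four times the triangle's area divided by the product of its three side lengths, which equals the reciprocal of the radius of the circle through the points. The sketch below illustrates that formula and one possible way to turn it into a ranking score. It is not the authors' CFS implementation (see their repository for that). The helper names `mean_curvature` and `rank_features`, the way each feature is represented as a sequence of 2-D points, and the descending sort order are all assumptions made for illustration.

```python
import math


def menger_curvature(p, q, r):
    """Menger curvature of three 2-D points: 4 * area / (|pq| * |qr| * |rp|)."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    # Twice the signed triangle area, from the 2-D cross product.
    cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    area = abs(cross) / 2.0
    denom = math.dist(p, q) * math.dist(q, r) * math.dist(r, p)
    # Coincident points give a zero denominator; treat them as flat.
    return 0.0 if denom == 0.0 else 4.0 * area / denom


def mean_curvature(points):
    """Average curvature over consecutive point triples (hypothetical score)."""
    triples = zip(points, points[1:], points[2:])
    values = [menger_curvature(p, q, r) for p, q, r in triples]
    return sum(values) / len(values) if values else 0.0


def rank_features(curves):
    """Rank feature names by descending mean curvature.

    `curves` maps a feature name to a list of 2-D points. How those points
    are derived from a data set is defined by the paper and not reproduced
    here; the descending order is also an assumption.
    """
    scores = {name: mean_curvature(pts) for name, pts in curves.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    toy = {
        "flat": [(0, 0), (1, 0), (2, 0), (3, 0)],
        "bent": [(0, 0), (1, 1), (2, 0), (3, 1)],
    }
    print(rank_features(toy))
```

As a sanity check, three points on a circle of radius 2 yield a curvature of 0.5, and three collinear points yield 0.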
Related papers
- Enhancing Angular Resolution via Directionality Encoding and Geometric Constraints in Brain Diffusion Tensor Imaging [70.66500060987312]
Diffusion-weighted imaging (DWI) is a type of Magnetic Resonance Imaging (MRI) technique sensitised to the diffusivity of water molecules.
This work proposes DirGeo-DTI, a deep learning-based method to estimate reliable DTI metrics even from a set of DWIs acquired with the minimum theoretical number (6) of gradient directions.
arXiv Detail & Related papers (2024-09-11T11:12:26Z)
- HealthGAT: Node Classifications in Electronic Health Records using Graph Attention Networks [2.2026317523029193]
HealthGAT is a graph attention network framework that generates embeddings from EHR.
Our model iteratively refines the embeddings for medical codes, resulting in improved EHR data analysis.
Our model shows outstanding performance in node classification and downstream tasks such as predicting readmissions and diagnosis classifications.
arXiv Detail & Related papers (2024-03-26T22:17:01Z)
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
- ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
- Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks.
By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation.
DEviS incorporates an uncertainty-aware filtering module, which utilizes the metric of uncertainty-calibrated error to filter reliable data.
arXiv Detail & Related papers (2023-01-01T05:02:46Z)
- Random Data Augmentation based Enhancement: A Generalized Enhancement Approach for Medical Datasets [8.844562557753399]
This paper develops a generalized, data-independent and computationally efficient enhancement approach to improve medical data quality for DL.
The quality is enhanced by improving the brightness and contrast of images.
Experiments have been performed on the COVID-19 chest X-ray and KiTS19 datasets and, for RGB imagery, on the LC25000 dataset.
arXiv Detail & Related papers (2022-10-03T11:16:22Z)
- Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
- Identifying Stroke Indicators Using Rough Sets [0.7340017786387767]
We propose a novel rough-set based technique for ranking the importance of the various EHR records in detecting stroke.
Age, average glucose level, heart disease, and hypertension were the most essential attributes for detecting stroke in patients.
arXiv Detail & Related papers (2021-10-19T06:04:48Z)
- A Profile-Based Binary Feature Extraction Method Using Frequent Itemsets for Improving Coronary Artery Disease Diagnosis [0.0]
This paper introduces a CAD diagnosis method with a novel feature extraction technique called the Profile-Based Binary Feature Extraction (PBBFE).
In this method, after partitioning numerical features, frequent itemsets are extracted by the Apriori algorithm and then used as features to increase the CAD diagnosis accuracy.
The proposed method was tested on the Z-Alizadeh Sani dataset, which is one of the richest databases in the field of CAD.
arXiv Detail & Related papers (2021-09-22T18:33:45Z)
- A novel method for Causal Structure Discovery from EHR data, a demonstration on type-2 diabetes mellitus [3.8171820752218997]
We propose a new data transformation method and a novel causal structure discovery algorithm.
We demonstrated the proposed methods on an application to type-2 diabetes mellitus.
arXiv Detail & Related papers (2020-11-11T00:50:04Z)
- ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)