Explainable Multi-class Classification of Medical Data
- URL: http://arxiv.org/abs/2012.13796v1
- Date: Sat, 26 Dec 2020 18:56:07 GMT
- Title: Explainable Multi-class Classification of Medical Data
- Authors: YuanZheng Hu, Marina Sokolova
- Abstract summary: We present explainable multi-class classification of a large medical data set.
Six algorithms are used in this study: Support Vector Machine (SVM), Naïve Bayes, Gradient Boosting, Decision Trees, Random Forest, and Logistic Regression.
Our results show that using 23 medication features in learning experiments improves Recall of five out of the six applied learning algorithms.
- Score: 0.9137554315375922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine Learning applications have brought new insights into a secondary
analysis of medical data. Machine Learning helps to develop new drugs, define
populations susceptible to certain illnesses, and identify predictors of many
common diseases. At the same time, Machine Learning results depend on
a combination of many factors, including feature selection, class (im)balance,
algorithm preference, and performance metrics. In this paper, we present
explainable multi-class classification of a large medical data set. We
discuss in detail knowledge-based feature engineering, data set balancing, best
model selection, and parameter tuning. Six algorithms are used in this study:
Support Vector Machine (SVM), Naïve Bayes, Gradient Boosting, Decision Trees,
Random Forest, and Logistic Regression. Our empirical evaluation is done on the
UCI Diabetes 130-US hospitals for years 1999-2008 dataset, with the task to
classify patient hospital re-admission stay into three classes: 0 days, <30
days, or >30 days. Our results show that using 23 medication features in
learning experiments improves Recall of five out of the six applied learning
algorithms. This is a new result that expands the previous studies conducted on
the same data. Gradient Boosting and Random Forest outperformed other
algorithms in terms of the three-class classification Accuracy.
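The abstract reports both Recall and Accuracy for a three-class task and names class (im)balance as a factor. The sketch below illustrates, on made-up toy labels, why the two metrics can diverge on imbalanced data. The function names, the toy data, and the choice of macro-averaging (an unweighted mean of per-class recall) are assumptions made for illustration; the abstract does not state which averaging the authors used.

```python
from collections import Counter

# Labels for the three re-admission classes named in the abstract.
CLASSES = ("0 days", "<30 days", ">30 days")


def per_class_recall(y_true, y_pred, classes=CLASSES):
    """Recall per class: correctly predicted items of a class / all items of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # A class with no true examples gets recall 0.0 here (an arbitrary convention).
    return {c: (hits[c] / support[c] if support[c] else 0.0) for c in classes}


def macro_recall(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class recall (assumed averaging, not from the paper)."""
    recalls = per_class_recall(y_true, y_pred, classes)
    return sum(recalls.values()) / len(classes)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up, imbalanced toy data: 6 / 2 / 2 examples per class.
y_true = ["0 days"] * 6 + ["<30 days"] * 2 + [">30 days"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["0 days"] * 10

print("accuracy:", accuracy(y_true, y_pred))
print("per-class recall:", per_class_recall(y_true, y_pred))
print("macro recall:", macro_recall(y_true, y_pred))
```

On this toy data the majority-class predictor is right on 6 of 10 examples, yet it never recovers either minority class, so its macro-averaged recall is only one third. This is one reason to report Recall alongside Accuracy and to consider data set balancing.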
Related papers
- Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI [0.0]
We evaluate and compare the classification accuracy, precision, recall, and F-1 scores of five different machine learning methods.
XGBoost achieved the best model accuracy, which is 97%.
arXiv Detail & Related papers (2024-04-06T17:23:21Z)
- Comparison of Machine Learning Classification Algorithms and Application to the Framingham Heart Study [0.0]
The use of machine learning algorithms in healthcare can amplify social injustices and health inequities.
This research pertains to some generalizability impediments that occur during the development and the post-deployment of machine learning classification algorithms.
arXiv Detail & Related papers (2024-02-22T22:49:35Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Does Deep Learning REALLY Outperform Non-deep Machine Learning for Clinical Prediction on Physiological Time Series? [11.901347806586234]
We systematically examine the performance of machine learning models for the clinical prediction task based on the EHR.
Ten baseline machine learning models are compared, including 3 deep learning methods and 7 non-deep learning methods.
The results show that deep learning indeed outperforms non-deep learning, but with certain conditions.
arXiv Detail & Related papers (2022-11-11T07:09:49Z)
- Analyzing Wearables Dataset to Predict ADLs and Falls: A Pilot Study [0.0]
This paper exhaustively reviews thirty-nine wearable-based datasets that can be used to evaluate systems for recognizing Activities of Daily Living and Falls.
A comparative analysis on the SisFall dataset using five machine learning methods is performed in Python.
The results obtained from this study show that KNN outperforms other machine learning methods in terms of accuracy, precision and recall.
arXiv Detail & Related papers (2022-09-11T04:41:40Z)
- Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
- LifeLonger: A Benchmark for Continual Disease Classification [59.13735398630546]
We introduce LifeLonger, a benchmark for continual disease classification on the MedMNIST collection.
Task and class incremental learning of diseases address the issue of classifying new samples without re-training the models from scratch.
Cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge.
arXiv Detail & Related papers (2022-04-12T12:25:05Z)
- The Severity Prediction of The Binary And Multi-Class Cardiovascular Disease -- A Machine Learning-Based Fusion Approach [0.0]
Recently, cardiovascular diseases (CVDs) have become a leading cause of death around the world.
In this research, some fusion models have been constructed to diagnose CVDs along with their severity.
The highest accuracy was 75% for multiclass classification and 95% for binary classification.
arXiv Detail & Related papers (2022-03-09T18:06:24Z)
- Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It forces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)