Related papers: Explainable Multi-class Classification of Medical Data

Explainable Multi-class Classification of Medical Data

URL: http://arxiv.org/abs/2012.13796v1
Date: Sat, 26 Dec 2020 18:56:07 GMT
Title: Explainable Multi-class Classification of Medical Data
Authors: YuanZheng Hu, Marina Sokolova
Abstract summary: We present explainable multi-class classification of a large medical data set. Six algorithms are used in this study: Support Vector Machine (SVM), Naïve Bayes, Gradient Boosting, Decision Trees, Random Forest, and Logistic Regression. Our results show that using 23 medication features in learning experiments improves Recall of five out of the six applied learning algorithms.
Score: 0.9137554315375922
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine Learning applications have brought new insights into a secondary analysis of medical data. Machine Learning helps to develop new drugs, define populations susceptible to certain illnesses, and identify predictors of many common diseases. At the same time, Machine Learning results depend on convolution of many factors, including feature selection, class (im)balance, algorithm preference, and performance metrics. In this paper, we present explainable multi-class classification of a large medical data set. We discuss in detail knowledge-based feature engineering, data set balancing, best model selection, and parameter tuning. Six algorithms are used in this study: Support Vector Machine (SVM), Naïve Bayes, Gradient Boosting, Decision Trees, Random Forest, and Logistic Regression. Our empirical evaluation is done on the UCI Diabetes 130-US hospitals for years 1999-2008 dataset, with the task to classify patient hospital re-admission stay into three classes: 0 days, <30 days, or >30 days. Our results show that using 23 medication features in learning experiments improves Recall of five out of the six applied learning algorithms. This is a new result that expands the previous studies conducted on the same data. Gradient Boosting and Random Forest outperformed other algorithms in terms of the three-class classification Accuracy.
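The abstract reports results in terms of Recall and three-class Accuracy. As a minimal sketch of how those two metrics can be computed for a three-class task, the snippet below uses plain Python. The label strings, the toy predictions, and the choice of macro-averaged recall are assumptions made for illustration; the abstract does not state which averaging the authors used, and none of the numbers come from the paper.

```python
from collections import Counter

# Hypothetical label strings for the three re-admission classes named in the abstract.
CLASSES = ["0 days", "<30 days", ">30 days"]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred, classes=CLASSES):
    """Recall per class: correctly predicted samples / samples truly in that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: (hits[c] / support[c] if support[c] else 0.0) for c in classes}


def macro_recall(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class recall (assumed averaging, not from the paper)."""
    recalls = per_class_recall(y_true, y_pred, classes)
    return sum(recalls.values()) / len(classes)


# Toy, made-up labels for illustration only.
y_true = ["0 days", "0 days", "0 days", "0 days",
          "<30 days", "<30 days", ">30 days", ">30 days"]
y_pred = ["0 days", "0 days", "0 days", ">30 days",
          "0 days", "<30 days", ">30 days", "0 days"]

print("accuracy:", accuracy(y_true, y_pred))
print("per-class recall:", per_class_recall(y_true, y_pred))
print("macro recall:", macro_recall(y_true, y_pred))
```

On imbalanced data, accuracy can be dominated by the majority class, while macro recall weights each class equally; this is one reason the two metrics can rank classifiers differently.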

Related papers

A Comprehensive Machine Learning Framework for Heart Disease Prediction: Performance Evaluation and Future Perspectives [0.0]
This study presents a machine learning-based framework for heart disease prediction using the heart-disease dataset. The proposed model demonstrates strong potential for aiding clinical decision-making by effectively predicting heart disease.
arXiv Detail & Related papers (2025-05-15T05:13:38Z)
Comparative Performance of Machine Learning Algorithms for Early Genetic Disorder and Subclass Classification [0.0]
Early diagnosis of genetic disorders enables timely interventions and improves outcomes. This study implements machine learning models using basic clinical indicators measurable at birth or infancy. Applying ML with basic clinical indicators can enable timely interventions once validated on larger datasets.
arXiv Detail & Related papers (2024-12-03T06:02:47Z)
Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI [0.0]
We evaluate and compare the classification accuracy, precision, recall, and F-1 scores of five different machine learning methods. XGBoost achieved the best model accuracy, which is 97%.
arXiv Detail & Related papers (2024-04-06T17:23:21Z)
Comparison of Machine Learning Classification Algorithms and Application to the Framingham Heart Study [0.0]
The use of machine learning algorithms in healthcare can amplify social injustices and health inequities. This research pertains to some generalizability impediments that occur during the development and the post-deployment of machine learning classification algorithms.
arXiv Detail & Related papers (2024-02-22T22:49:35Z)
The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation. We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare. Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
Does Deep Learning REALLY Outperform Non-deep Machine Learning for Clinical Prediction on Physiological Time Series? [11.901347806586234]
We systematically examine the performance of machine learning models for the clinical prediction task based on the EHR. Ten baseline machine learning models are compared, including 3 deep learning methods and 7 non-deep learning methods. The results show that deep learning indeed outperforms non-deep learning, but only under certain conditions.
arXiv Detail & Related papers (2022-11-11T07:09:49Z)
Analyzing Wearables Dataset to Predict ADLs and Falls: A Pilot Study [0.0]
This paper exhaustively reviews thirty-nine wearable-based datasets which can be used for evaluating systems that recognize Activities of Daily Living and Falls. A comparative analysis on the SisFall dataset using five machine learning methods is performed in Python. The results obtained from this study show that KNN outperforms other machine learning methods in terms of accuracy, precision and recall.
arXiv Detail & Related papers (2022-09-11T04:41:40Z)
Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes. Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z)
LifeLonger: A Benchmark for Continual Disease Classification [59.13735398630546]
We introduce LifeLonger, a benchmark for continual disease classification on the MedMNIST collection. Task and class incremental learning of diseases address the issue of classifying new samples without re-training the models from scratch. Cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge.
arXiv Detail & Related papers (2022-04-12T12:25:05Z)
The Severity Prediction of The Binary And Multi-Class Cardiovascular Disease -- A Machine Learning-Based Fusion Approach [0.0]
Recently, cardiovascular diseases (CVDs) have become a leading cause of death around the world. In this research, some fusion models have been constructed to diagnose CVDs along with their severity. The highest accuracy for multiclass classification was found to be 75%, and it was 95% for binary.
arXiv Detail & Related papers (2022-03-09T18:06:24Z)
Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge. It enforces the model to focus on learning the subset-specific knowledge. The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios. Our results show that using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.