Related papers: Classifying Dental Care Providers Through Machine Learning with Features Ranking

Classifying Dental Care Providers Through Machine Learning with Features Ranking

URL: http://arxiv.org/abs/2506.04474v1
Date: Wed, 04 Jun 2025 21:45:40 GMT
Title: Classifying Dental Care Providers Through Machine Learning with Features Ranking
Authors: Mohammad Subhi Al-Batah, Mowafaq Salem Alzboon, Muhyeeddin Alqaraleh, Mohammed Hasan Abu-Arqoub, Rashiq Rafiq Marie,
Abstract summary: This study investigates the application of machine learning (ML) models for classifying dental providers.<n>The dataset includes service counts (preventive, treatment, exams), delivery systems (FFS, managed care), and beneficiary demographics.<n>The study underscores the importance of feature selection in enhancing model efficiency and accuracy.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This study investigates the application of machine learning (ML) models for classifying dental providers into two categories - standard rendering providers and safety net clinic (SNC) providers - using a 2018 dataset of 24,300 instances with 20 features. The dataset, characterized by high missing values (38.1%), includes service counts (preventive, treatment, exams), delivery systems (FFS, managed care), and beneficiary demographics. Feature ranking methods such as information gain, Gini index, and ANOVA were employed to identify critical predictors, revealing treatment-related metrics (TXMT_USER_CNT, TXMT_SVC_CNT) as top-ranked features. Twelve ML models, including k-Nearest Neighbors (kNN), Decision Trees, Support Vector Machines (SVM), Stochastic Gradient Descent (SGD), Random Forest, Neural Networks, and Gradient Boosting, were evaluated using 10-fold cross-validation. Classification accuracy was tested across incremental feature subsets derived from rankings. The Neural Network achieved the highest accuracy (94.1%) using all 20 features, followed by Gradient Boosting (93.2%) and Random Forest (93.0%). Models showed improved performance as more features were incorporated, with SGD and ensemble methods demonstrating robustness to missing data. Feature ranking highlighted the dominance of treatment service counts and annotation codes in distinguishing provider types, while demographic variables (AGE_GROUP, CALENDAR_YEAR) had minimal impact. The study underscores the importance of feature selection in enhancing model efficiency and accuracy, particularly in imbalanced healthcare datasets. These findings advocate for integrating feature-ranking techniques with advanced ML algorithms to optimize dental provider classification, enabling targeted resource allocation for underserved populations.

Related papers

Evaluating Ensemble and Deep Learning Models for Static Malware Detection with Dimensionality Reduction Using the EMBER Dataset [0.0]
This study investigates the effectiveness of several machine learning algorithms for static malware detection using the EMBER dataset.<n>We evaluate eight classification models: LightGBM, XGBoost, CatBoost, Random Forest, Extra Trees, HistGradientBoosting, k-Nearest Neighbors (KNN), and TabNet.<n>The models are assessed on accuracy, precision, recall, F1 score, and AUC to examine both predictive performance and robustness.
arXiv Detail & Related papers (2025-07-22T18:45:10Z)
Comparative performance of ensemble models in predicting dental provider types: insights from fee-for-service data [0.44998333629984877]
This study aimed to evaluate the performance of machine learning models in classifying dental providers using a 2018 dataset.<n>Nerve Networks achieved the highest AUC (0.975) and CA (94.1%), followed by Random Forest (AUC: 0.948, CA: 93.0%)<n>Advanced machine learning techniques, particularly ensemble and deep learning models, significantly enhance dental workforce classification.
arXiv Detail & Related papers (2025-06-04T21:55:27Z)
A Comprehensive Machine Learning Framework for Heart Disease Prediction: Performance Evaluation and Future Perspectives [0.0]
This study presents a machine learning-based framework for heart disease prediction using the heart-disease dataset.<n>The proposed model demonstrates strong potential for aiding clinical decision-making by effectively predicting heart disease.
arXiv Detail & Related papers (2025-05-15T05:13:38Z)
Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature [1.7779568951268254]
We introduce a novel methodology for voice pathology detection using the publicly available Saarbr"ucken Voice Database.<n>We evaluate six machine learning (ML) algorithms -- support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost.<n>Our approach 85.61%, 84.69% and 85.22% unweighted average recall (UAR) for females, males and combined results respectively.
arXiv Detail & Related papers (2024-10-14T14:17:52Z)
Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods. We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline Algorithm: Application to the ICU Length of Stay Prediction [65.268245109828]
This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the ICU length of stay. The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction. The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.
arXiv Detail & Related papers (2023-12-31T16:01:48Z)
An Evaluation of Machine Learning Approaches for Early Diagnosis of Autism Spectrum Disorder [0.0]
Autistic Spectrum Disorder (ASD) is a neurological disease characterized by difficulties with social interaction, communication, and repetitive activities. This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process.
arXiv Detail & Related papers (2023-09-20T21:23:37Z)
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
arXiv Detail & Related papers (2021-06-29T08:08:33Z)
No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data. We propose a novel and simple algorithm called Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated ssian mixture model. Experimental results demonstrate that CCVR state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
Cervical Cytology Classification Using PCA & GWO Enhanced Deep Features Selection [1.990876596716716]
Cervical cancer is one of the most deadly and common diseases among women worldwide. We propose a fully automated framework that utilizes Deep Learning and feature selection. The framework is evaluated on three publicly available benchmark datasets.
arXiv Detail & Related papers (2021-06-09T08:57:22Z)
Explaining COVID-19 and Thoracic Pathology Model Predictions by Identifying Informative Input Features [47.45835732009979]
Neural networks have demonstrated remarkable performance in classification and regression tasks on chest X-rays. Features attribution methods identify the importance of input features for the output prediction. We evaluate our methods using both human-centric (ground-truth-based) interpretability metrics, and human-independent feature importance metrics on NIH Chest X-ray8 and BrixIA datasets.
arXiv Detail & Related papers (2021-04-01T11:42:39Z)
Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.