Comparative performance of ensemble models in predicting dental provider types: insights from fee-for-service data
- URL: http://arxiv.org/abs/2506.04479v1
- Date: Wed, 04 Jun 2025 21:55:27 GMT
- Title: Comparative performance of ensemble models in predicting dental provider types: insights from fee-for-service data
- Authors: Mohammad Subhi Al-Batah, Muhyeeddin Alqaraleh, Mowafaq Salem Alzboon, Abdullah Alourani,
- Abstract summary: This study aimed to evaluate the performance of machine learning models in classifying dental providers using a 2018 dataset.<n>Nerve Networks achieved the highest AUC (0.975) and CA (94.1%), followed by Random Forest (AUC: 0.948, CA: 93.0%)<n>Advanced machine learning techniques, particularly ensemble and deep learning models, significantly enhance dental workforce classification.
- Score: 0.44998333629984877
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dental provider classification plays a crucial role in optimizing healthcare resource allocation and policy planning. Effective categorization of providers, such as standard rendering providers and safety net clinic (SNC) providers, enhances service delivery to underserved populations. This study aimed to evaluate the performance of machine learning models in classifying dental providers using a 2018 dataset. A dataset of 24,300 instances with 20 features was analyzed, including beneficiary and service counts across fee-for-service (FFS), Geographic Managed Care, and Pre-Paid Health Plans. Providers were categorized by delivery system and patient age groups (0-20 and 21+). Despite 38.1% missing data, multiple machine learning algorithms were tested, including k-Nearest Neighbors (kNN), Decision Trees, Support Vector Machines (SVM), Stochastic Gradient Descent (SGD), Random Forest, Neural Networks, and Gradient Boosting. A 10-fold cross-validation approach was applied, and models were evaluated using AUC, classification accuracy (CA), F1-score, precision, and recall. Neural Networks achieved the highest AUC (0.975) and CA (94.1%), followed by Random Forest (AUC: 0.948, CA: 93.0%). These models effectively handled imbalanced data and complex feature interactions, outperforming traditional classifiers like Logistic Regression and SVM. Advanced machine learning techniques, particularly ensemble and deep learning models, significantly enhance dental workforce classification. Their integration into healthcare analytics can improve provider identification and resource distribution, benefiting underserved populations.
Related papers
- Classifying Dental Care Providers Through Machine Learning with Features Ranking [0.0]
This study investigates the application of machine learning (ML) models for classifying dental providers.<n>The dataset includes service counts (preventive, treatment, exams), delivery systems (FFS, managed care), and beneficiary demographics.<n>The study underscores the importance of feature selection in enhancing model efficiency and accuracy.
arXiv Detail & Related papers (2025-06-04T21:45:40Z) - A Comprehensive Machine Learning Framework for Heart Disease Prediction: Performance Evaluation and Future Perspectives [0.0]
This study presents a machine learning-based framework for heart disease prediction using the heart-disease dataset.<n>The proposed model demonstrates strong potential for aiding clinical decision-making by effectively predicting heart disease.
arXiv Detail & Related papers (2025-05-15T05:13:38Z) - Refining Tuberculosis Detection in CXR Imaging: Addressing Bias in Deep Neural Networks via Interpretability [1.9936075659851882]
We argue that the reliability of deep learning models is limited, even if they can be shown to obtain perfect classification accuracy on the test data.
We show that pre-training a deep neural network on a large-scale proxy task, as well as using mixed objective optimization network (MOON), can improve the alignment of decision foundations between models and experts.
arXiv Detail & Related papers (2024-07-19T06:41:31Z) - Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline
Algorithm: Application to the ICU Length of Stay Prediction [65.268245109828]
This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the ICU length of stay.
The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction.
The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.
arXiv Detail & Related papers (2023-12-31T16:01:48Z) - Efficient Representation Learning for Healthcare with
Cross-Architectural Self-Supervision [5.439020425819001]
We present Cross Architectural - Self Supervision (CASS) in response to this challenge.
We show that CASS-trained CNNs and Transformers outperform existing self-supervised learning methods across four diverse healthcare datasets.
We also demonstrate that CASS is considerably more robust to variations in batch size and pretraining epochs, making it a suitable candidate for machine learning in healthcare applications.
arXiv Detail & Related papers (2023-08-19T15:57:19Z) - Clinical Deterioration Prediction in Brazilian Hospitals Based on
Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD)
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z) - SANSformers: Self-Supervised Forecasting in Electronic Health Records
with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities.
We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data.
Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning
with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated ssian mixture model.
Experimental results demonstrate that CCVR state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - Collaborative Federated Learning For Healthcare: Multi-Modal COVID-19
Diagnosis at the Edge [5.258947981618588]
We leverage the capabilities of edge computing in medicine by analyzing and evaluating the potential of intelligent processing of clinical visual data at the edge.
To this aim, we utilize the emerging concept of clustered federated learning (CFL) for an automatic diagnosis of COVID-19.
We evaluate the performance of the proposed framework under different experimental setups on two benchmark datasets.
arXiv Detail & Related papers (2021-01-19T08:40:59Z) - Predictive Modeling of ICU Healthcare-Associated Infections from
Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling
Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z) - Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.