Community-Based Early-Stage Chronic Kidney Disease Screening using Explainable Machine Learning for Low-Resource Settings
- URL: http://arxiv.org/abs/2601.01119v1
- Date: Sat, 03 Jan 2026 08:43:35 GMT
- Title: Community-Based Early-Stage Chronic Kidney Disease Screening using Explainable Machine Learning for Low-Resource Settings
- Authors: Muhammad Ashad Kabir, Sirajam Munira, Dewan Tasnia Azad, Saleh Mohammed Ikram, Mohammad Habibur Rahman Sarker, Syed Manzoor Ahmed Hanifi,
- Abstract summary: Existing screening tools often underperform in Bangladesh and South Asia, where risk profiles differ. Our objective was to develop and evaluate an explainable machine learning framework for community-based early-stage CKD screening.
- Score: 0.3899675786340555
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Early detection of chronic kidney disease (CKD) is essential for preventing progression to end-stage renal disease. However, existing screening tools - primarily developed using populations from high-income countries - often underperform in Bangladesh and South Asia, where risk profiles differ. Most of these tools rely on simple additive scoring functions and are based on data from patients with advanced-stage CKD. Consequently, they fail to capture complex interactions among risk factors and are limited in predicting early-stage CKD. Our objective was to develop and evaluate an explainable machine learning (ML) framework for community-based early-stage CKD screening for low-resource settings, tailored to the Bangladeshi and South Asian population context. We used a community-based dataset from Bangladesh, the first such CKD dataset in South and South Asia, and evaluated twelve ML classifiers across multiple feature domains. Ten complementary feature selection techniques were applied to identify robust, generalizable predictors. The final models were assessed using 10-fold cross-validation. External validation was conducted on three independent datasets from India, the UAE, and Bangladesh. SHAP (SHapley Additive exPlanations) was used to provide model explainability. An ML model trained on an RFECV-selected feature subset achieved a balanced accuracy of 90.40%, whereas minimal non-pathology-test features demonstrated excellent predictive capability with a balanced accuracy of 89.23%, often outperforming larger or full feature sets. Compared with existing screening tools, the proposed models achieved substantially higher accuracy and sensitivity while requiring fewer and more accessible inputs. External validation confirmed strong generalizability with 78% to 98% sensitivity. SHAP interpretation identified clinically meaningful predictors consistent with established CKD risk factors.
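The abstract reports results as balanced accuracy and sensitivity rather than plain accuracy. A minimal sketch of why that matters for screening is given here, assuming the usual definition of balanced accuracy as the mean of sensitivity and specificity. The labels and predictions are made-up toy values, not data from the paper, and the function names are hypothetical helpers, not the authors' code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels where 1 = CKD and 0 = no CKD."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(y_true, y_pred):
    """Fraction of true CKD cases that the screener flags."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Fraction of non-CKD cases that the screener correctly clears."""
    _, _, tn, fp = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def accuracy(y_true, y_pred):
    """Fraction of all cases classified correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; insensitive to class imbalance."""
    return (sensitivity(y_true, y_pred) + specificity(y_true, y_pred)) / 2


# Toy, hypothetical cohort: 10 CKD cases and 90 non-CKD cases.
y_true = [1] * 10 + [0] * 90

# Hypothetical screener: finds 8 of 10 cases, clears 81 of 90 non-cases.
y_pred = [1] * 8 + [0] * 2 + [0] * 81 + [1] * 9

# Degenerate screener that never flags anyone.
always_negative = [0] * 100

for name, pred in [("screener", y_pred), ("always negative", always_negative)]:
    print(
        f"{name}: accuracy={accuracy(y_true, pred):.2f}, "
        f"sensitivity={sensitivity(y_true, pred):.2f}, "
        f"balanced accuracy={balanced_accuracy(y_true, pred):.2f}"
    )
```

In this toy cohort the screener that never flags anyone scores slightly higher on plain accuracy than the useful one, yet its balanced accuracy drops to chance level and its sensitivity is zero, which is why imbalance-aware metrics suit community screening.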
Related papers
- Explainable Admission-Level Predictive Modeling for Prolonged Hospital Stay in Elderly Populations: Challenges in Low- and Middle-Income Countries [65.4286079244589]
Prolonged length of stay (pLoS) is a significant factor associated with the risk of adverse in-hospital events. We develop and explain a predictive model for pLoS using admission-level patient and hospital administrative data.
arXiv Detail & Related papers (2026-01-07T23:35:24Z)
- Chronic Kidney Disease Prognosis Prediction Using Transformer [2.054117570146147]
Chronic Kidney Disease (CKD) affects nearly 10% of the global population and often progresses to end-stage renal failure. We present a transformer-based framework for predicting CKD progression using multi-modal electronic health records.
arXiv Detail & Related papers (2025-11-04T07:52:17Z)
- Performance Analysis of Machine Learning Algorithms in Chronic Kidney Disease Prediction [2.5180274967765643]
About 10% of the global population is thought to be affected by Chronic Kidney Disease (CKD), which causes kidney function to decline. In this study, we designed and suggested disease predictive computer-aided designs for the diagnosis of CKD.
arXiv Detail & Related papers (2025-10-10T15:54:37Z)
- A Disease-Centric Vision-Language Foundation Model for Precision Oncology in Kidney Cancer [54.58205672910646]
RenalCLIP is a visual-language foundation model for characterization, diagnosis and prognosis of renal mass. It achieved better performance and superior generalizability across 10 core tasks spanning the full clinical workflow of kidney cancer.
arXiv Detail & Related papers (2025-08-22T17:48:19Z)
- Early Mortality Prediction in ICU Patients with Hypertensive Kidney Disease Using Interpretable Machine Learning [3.4335475695580127]
Hypertensive kidney disease (HKD) patients in intensive care units (ICUs) face high short-term mortality. We developed a machine learning framework to predict 30-day in-hospital mortality among ICU patients with HKD.
arXiv Detail & Related papers (2025-07-25T00:48:23Z)
- A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine Learning [0.0]
We propose a computationally efficient supervised filter that ranks features using the Gumbel copula upper tail dependence coefficient ($\lambda_U$). We benchmarked against Mutual Information, mRMR, ReliefF, and $L_1$ Elastic Net across four classifiers on two diabetes datasets. We conclude that copula based feature selection via upper tail dependence is a powerful, efficient, and interpretable approach for building risk models in public health and clinical medicine.
arXiv Detail & Related papers (2025-05-28T16:34:58Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Explainable Machine Learning System for Predicting Chronic Kidney Disease in High-Risk Cardiovascular Patients [0.0]
This research developed an explainable machine learning system for predicting Chronic Kidney Disease (CKD) in patients with cardiovascular risks.
The Random Forest model achieved the highest sensitivity of 88.2%.
arXiv Detail & Related papers (2024-04-17T07:59:33Z)
- Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD).
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
- Detecting Chronic Kidney Disease (CKD) at the Initial Stage: A Novel Hybrid Feature-selection Method and Robust Data Preparation Pipeline for Different ML Techniques [0.0]
Chronic Kidney Disease (CKD) affects almost 800 million people around the world. Around 1.7 million people die each year because of it.
Many researchers have applied distinct Machine Learning (ML) methods to detect CKD at an early stage, but detailed studies are still missing.
We present a structured and thorough method for dealing with the complexities of medical data with optimal performance.
arXiv Detail & Related papers (2022-03-02T20:38:49Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to 19% over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.