Predicting Anemia Among Under-Five Children in Nepal Using Machine Learning and Deep Learning
- URL: http://arxiv.org/abs/2602.01005v1
- Date: Sun, 01 Feb 2026 04:00:12 GMT
- Title: Predicting Anemia Among Under-Five Children in Nepal Using Machine Learning and Deep Learning
- Authors: Deepak Bastola, Pitambar Acharya, Dipak Dulal, Rabina Dhakal, Yang Li,
- Abstract summary: We analyzed Nepal Demographic and Health Survey (NDHS 2022) microdata comprising 1,855 children. Child age, recent fever, household size, maternal anemia, and parasite deworming were consistently selected. Machine learning and deep learning models can provide competitive anemia prediction.
- Score: 1.8183349008726755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Childhood anemia remains a major public health challenge in Nepal and is associated with impaired growth, cognition, and increased morbidity. Using World Health Organization hemoglobin thresholds, we defined anemia status for children aged 6-59 months and formulated a binary classification task by grouping all anemia severities as anemic versus not anemic. We analyzed Nepal Demographic and Health Survey (NDHS 2022) microdata comprising 1,855 children and initially considered 48 candidate features spanning demographic, socioeconomic, maternal, and child health characteristics. To obtain a stable and substantiated feature set, we applied four feature selection techniques (Chi-square, mutual information, point-biserial correlation, and Boruta) and prioritized features supported by multi-method consensus. Five features (child age, recent fever, household size, maternal anemia, and parasite deworming) were consistently selected by all methods, while amenorrhea, ethnicity indicators, and provinces were frequently retained. We then compared eight traditional machine learning classifiers (LR, KNN, DT, RF, XGBoost, SVM, NB, LDA) with two deep learning models (DNN and TabNet) using standard evaluation metrics, emphasizing F1-score and recall due to class imbalance. Among all models, logistic regression attained the best recall (0.701) and the highest F1-score (0.649), while DNN achieved the highest accuracy (0.709), and SVM yielded the strongest discrimination with the highest AUC (0.736). Overall, the results indicate that both machine learning and deep learning models can provide competitive anemia prediction and that interpretable features such as child age, infection proxy, maternal anemia, and deworming history are central for risk stratification and public health screening in Nepal.
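The abstract emphasizes recall and F1-score because of class imbalance but does not report precision. As a rough illustration only, the sketch below computes the standard metrics from confusion-matrix counts and backs out the precision implied by a reported F1 and recall, using the identity F1 = 2PR / (P + R). The function names and the confusion counts are hypothetical and not taken from the paper; only the logistic regression recall (0.701) and F1 (0.649) come from the abstract, and because those figures are rounded, the implied precision is approximate.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denom


# Hypothetical confusion counts for illustration; NOT results from the paper.
example = classification_metrics(tp=70, fp=45, fn=30, tn=55)

# Logistic regression recall and F1 as reported in the abstract (rounded).
lr_precision = implied_precision(f1=0.649, recall=0.701)
print(f"example metrics: {example}")
print(f"implied LR precision: {lr_precision:.3f}")
```

Under these assumptions the implied precision for logistic regression is about 0.60, which suggests that the model's higher recall comes with a noticeable share of false positives, a trade-off that is often acceptable in screening settings.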
Related papers
- Deep learning outperforms traditional machine learning methods in predicting childhood malnutrition: evidence from survey data [2.3951444869691594]
This study provides the first comprehensive assessment of machine learning and deep learning methodologies for identifying malnutrition.
Maternal education, household wealth index, and child age emerged as the primary predictors of malnutrition, followed by geographic characteristics.
The proposed approach supports Nepal's progress toward the Sustainable Development Goals and offers a transferable methodological template for similar low-resource settings globally.
arXiv Detail & Related papers (2026-02-11T00:04:22Z)
- Child Mortality Prediction in Bangladesh: A Decade-Long Validation Study [0.0]
The Demographic and Health Surveys (DHS) data from Bangladesh for 2011-2022, with n = 33,962, are used in this paper.
We trained the model on 2011-2014 data, validated it on 2017 data, and tested it on 2022 data.
Eight years after the initial test of the model, a genetic algorithm-based Neural Architecture Search found a single-layer neural architecture to be superior to XGBoost.
arXiv Detail & Related papers (2026-02-03T19:18:50Z)
- MeCaMIL: Causality-Aware Multiple Instance Learning for Fair and Interpretable Whole Slide Image Diagnosis [40.3028468133626]
Multiple instance learning (MIL) has emerged as the dominant paradigm for whole slide image (WSI) analysis in computational pathology.
MeCaMIL, a causality-aware MIL framework, explicitly models demographic confounders through structured causal graphs.
MeCaMIL achieves superior fairness: demographic disparity variance drops by over 65% (relative reduction) on average across attributes.
arXiv Detail & Related papers (2025-11-14T06:47:21Z)
- Performance Analysis of Post-Training Quantization for CNN-based Conjunctival Pallor Anemia Detection [0.0]
Anemia is a widespread global health issue, particularly among young children in low-resource settings.
Traditional methods for anemia detection often require expensive equipment and expert knowledge.
To address these challenges, we explore the use of deep learning models for detecting anemia through conjunctival pallor.
arXiv Detail & Related papers (2025-07-20T23:02:58Z)
- Predicting Length of Stay in Neurological ICU Patients Using Classical Machine Learning and Neural Network Models: A Benchmark Study on MIMIC-IV [49.1574468325115]
This study explores multiple ML approaches for predicting LOS in the ICU, specifically for patients with neurological diseases, based on the MIMIC-IV dataset.
The evaluated models include classic ML algorithms (K-Nearest Neighbors, Random Forest, XGBoost and CatBoost) and Neural Networks (LSTM, BERT and Temporal Fusion Transformer).
arXiv Detail & Related papers (2025-05-23T14:06:42Z)
- Risk factor identification and classification of malnutrition among under-five children in Bangladesh: Machine learning and statistical approach [0.0]
This study aims to understand the factors behind malnutrition in under-five children, using the nationwide Multiple Indicator Cluster Survey (MICS 2019).
It classifies different malnutrition stages using four well-established machine learning algorithms: Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and Multi-layer Perceptron (MLP) neural network.
A Pearson correlation coefficient analysis is also performed to identify the significant factors related to a child's malnutrition.
arXiv Detail & Related papers (2024-12-08T04:50:23Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Classification Methods Based on Machine Learning for the Analysis of Fetal Health Data [1.3597551064547502]
We have analyzed the classification performance of various machine learning models for fetal health analysis.
A TabNet model on a fetal health dataset provides a classification accuracy of 94.36%.
arXiv Detail & Related papers (2023-11-18T04:01:46Z)
- A Study of Age and Sex Bias in Multiple Instance Learning based Classification of Acute Myeloid Leukemia Subtypes [44.077241051884926]
We train multiple MIL models using different levels of sex imbalance in the training set and excluding certain age groups.
We find a significant effect of sex and age bias on the performance of the model for AML subtype classification.
arXiv Detail & Related papers (2023-08-24T09:32:46Z)
- Age-Stratified Differences in Morphological Connectivity Patterns in ASD: An sMRI and Machine Learning Approach [1.3436368800886478]
The objective of this study was to compare the effect of different age groups in classifying ASD using morphological features (MF) and morphological connectivity features (MCF).
The MCF with RF in the 6 to 11 age group performed better in the classification than the other groups and produced an accuracy, F1 score, recall, and precision of 75.8%, 83.1%, 86%, and 80.4%, respectively.
arXiv Detail & Related papers (2023-08-14T12:11:25Z)
- Deep-Learning Tool for Early Identifying Non-Traumatic Intracranial Hemorrhage Etiology based on CT Scan [40.51754649947294]
The deep learning model was developed with 1868 eligible NCCT scans with non-traumatic ICH collected between January 2011 and April 2018.
The model's diagnostic performance was compared with clinicians' performance.
The clinicians achieved significant improvements in the sensitivity, specificity, and accuracy of diagnoses of certain hemorrhage etiologies when augmented by the proposed system.
arXiv Detail & Related papers (2023-02-02T08:45:17Z)
- MedML: Fusing Medical Knowledge and Machine Learning Models for Early Pediatric COVID-19 Hospitalization and Severity Prediction [27.352097332678213]
We respond to the national Pediatric COVID-19 data challenge with a novel machine learning model, MedML.
MedML extracts the most predictive features based on medical knowledge and propensity scores from over 6 million medical concepts.
We evaluate MedML across 143,605 patients for the hospitalization prediction task and 11,465 patients for the severity prediction task.
arXiv Detail & Related papers (2022-07-25T15:56:14Z)
- 1-D Convlutional Neural Networks for the Analysis of Pupil Size Variations in Scotopic Conditions [79.71065005161566]
1-D convolutional neural network models are trained for classification of short-range sequences.
The model provides predictions with high average accuracy on a hold-out test set.
arXiv Detail & Related papers (2020-02-06T17:25:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.