Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Evaluation of Statistical and Machine Learning Approaches Using the 2021 National Survey of Children's Health
- URL: http://arxiv.org/abs/2602.20303v1
- Date: Mon, 23 Feb 2026 19:31:44 GMT
- Title: Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Evaluation of Statistical and Machine Learning Approaches Using the 2021 National Survey of Children's Health
- Authors: Joyanta Jyoti Mondal
- Abstract summary: We analyze 18,792 children aged 10-17 years from the National Survey of Children's Health. Overweight/obesity is defined using BMI categories. Predictors include diet, physical activity, sleep, parental stress, socioeconomic conditions, adverse experiences, and neighborhood characteristics. Performance is evaluated using AUC, accuracy, precision, recall, F1 score, and Brier score.
- Score: 0.2538209532048867
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Background: Childhood and adolescent overweight and obesity remain major public health concerns in the United States and are shaped by behavioral, household, and community factors. Their joint predictive structure at the population level remains incompletely characterized. Objectives: The study aims to identify multilevel predictors of overweight and obesity among U.S. adolescents and compare the predictive performance, calibration, and subgroup equity of statistical, machine-learning, and deep-learning models. Data and Methods: We analyze 18,792 children aged 10-17 years from the 2021 National Survey of Children's Health. Overweight/obesity is defined using BMI categories. Predictors include diet, physical activity, sleep, parental stress, socioeconomic conditions, adverse experiences, and neighborhood characteristics. Models include logistic regression, random forest, gradient boosting, XGBoost, LightGBM, multilayer perceptron (MLP), and TabNet. Performance is evaluated using AUC, accuracy, precision, recall, F1 score, and Brier score. Results: Discrimination (AUC) ranges from 0.66 to 0.79. Logistic regression, gradient boosting, and MLP show the most stable balance of discrimination and calibration. Boosting and deep learning modestly improve recall and F1 score. No model is uniformly superior. Performance disparities across race and poverty groups persist across algorithms. Conclusion: Increased model complexity yields limited gains over logistic regression. Predictors consistently span behavioral, household, and neighborhood domains. Persistent subgroup disparities indicate the need for improved data quality and equity-focused surveillance rather than greater algorithmic complexity.
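The evaluation metrics named in the abstract (AUC, accuracy, precision, recall, F1 score, Brier score) and the subgroup comparison can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the group labels "A" and "B", and the toy labels and probabilities are all assumptions made for illustration, and survey weights are ignored.

```python
from collections import defaultdict


def auc(y_true, y_prob):
    """AUC as the chance a random positive outranks a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_prob):
    """Brier score: mean squared gap between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute the six metrics for binary labels and predicted probabilities."""
    y_pred = [int(p >= threshold) for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, yh in pairs if y == 1 and yh == 1)
    fp = sum(1 for y, yh in pairs if y == 0 and yh == 1)
    fn = sum(1 for y, yh in pairs if y == 1 and yh == 0)
    tn = sum(1 for y, yh in pairs if y == 0 and yh == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "auc": auc(y_true, y_prob),
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "brier": brier(y_true, y_prob),
    }


def metrics_by_group(y_true, y_prob, groups, threshold=0.5):
    """Repeat the evaluation within each subgroup to expose performance gaps."""
    index = defaultdict(list)
    for i, g in enumerate(groups):
        index[g].append(i)
    return {
        g: classification_metrics(
            [y_true[i] for i in ix], [y_prob[i] for i in ix], threshold
        )
        for g, ix in index.items()
    }


# Toy data (hypothetical): 1 = overweight/obese, 0 = not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = classification_metrics(y_true, y_prob)
by_group = metrics_by_group(y_true, y_prob, groups)
print(overall)
print(by_group)
```

In this toy example the two hypothetical groups have the same AUC but different precision and recall at the shared threshold, which is one way subgroup disparities can persist even when overall discrimination looks similar.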
Related papers
- Deep learning outperforms traditional machine learning methods in predicting childhood malnutrition: evidence from survey data [2.3951444869691594]
This study provides the first comprehensive assessment of machine learning and deep learning methodologies for identifying malnutrition. Maternal education, household wealth index, and child age are the primary predictors of malnutrition, followed by geographic characteristics. The proposed approach supports Nepal's progress toward the Sustainable Development Goals and offers a transferable methodological template for similar low-resource settings globally.
arXiv Detail & Related papers (2026-02-11T00:04:22Z) - Child Mortality Prediction in Bangladesh: A Decade-Long Validation Study [0.0]
The Demographic and Health Surveys (DHS) data from Bangladesh for 2011-2022, with n = 33,962, are used in this paper. We trained the model on 2011-2014 data, validated it on 2017 data, and tested it on 2022 data. Eight years after the initial test of the model, a genetic algorithm-based Neural Architecture Search found a single-layer neural architecture to be superior to XGBoost.
arXiv Detail & Related papers (2026-02-03T19:18:50Z) - Explainable Admission-Level Predictive Modeling for Prolonged Hospital Stay in Elderly Populations: Challenges in Low- and Middle-Income Countries [65.4286079244589]
Prolonged length of stay (pLoS) is a significant factor associated with the risk of adverse in-hospital events. We develop and explain a predictive model for pLoS using admission-level patient and hospital administrative data.
arXiv Detail & Related papers (2026-01-07T23:35:24Z) - MeCaMIL: Causality-Aware Multiple Instance Learning for Fair and Interpretable Whole Slide Image Diagnosis [40.3028468133626]
Multiple instance learning (MIL) has emerged as the dominant paradigm for whole slide image (WSI) analysis in computational pathology. MeCaMIL, a causality-aware MIL framework, explicitly models demographic confounders through structured causal graphs. MeCaMIL achieves superior fairness: demographic disparity variance drops by over 65% on average across attributes.
arXiv Detail & Related papers (2025-11-14T06:47:21Z) - M-TabNet: A Multi-Encoder Transformer Model for Predicting Neonatal Birth Weight from Multimodal Data [3.452389713639621]
Birth weight (BW) is a key indicator of neonatal health, with low birth weight (LBW) linked to increased mortality and morbidity. Existing models often neglect nutritional and genetic influences, focusing mainly on physiological and lifestyle factors. This study presents an attention-based transformer model with a multi-encoder architecture for early (less than 12 weeks of gestation) BW prediction.
arXiv Detail & Related papers (2025-04-20T00:03:47Z) - Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z) - Fairness Evolution in Continual Learning for Medical Imaging [47.52603262576663]
This study examines how bias evolves across tasks using domain-specific fairness metrics and how different CL strategies impact this evolution. Our results show that Learning without Forgetting and Pseudo-Label achieve optimal classification performance, but Pseudo-Label is less biased.
arXiv Detail & Related papers (2024-04-10T09:48:52Z) - Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection [64.4350027428928]
We propose a novel uncertainty-guided class imbalance learning framework for imbalanced social event detection tasks.
Our model significantly improves social event representation and classification tasks in almost all classes, especially those uncertain ones.
arXiv Detail & Related papers (2023-10-30T03:32:04Z) - A Study of Age and Sex Bias in Multiple Instance Learning based Classification of Acute Myeloid Leukemia Subtypes [44.077241051884926]
We train multiple MIL models using different levels of sex imbalance in the training set and excluding certain age groups.
We find a significant effect of sex and age bias on the performance of the model for AML subtype classification.
arXiv Detail & Related papers (2023-08-24T09:32:46Z) - Adapting Machine Learning Diagnostic Models to New Populations Using a Small Amount of Data: Results from Clinical Neuroscience [21.420302408947194]
We develop a weighted empirical risk minimization approach that optimally combines data from a source group to make predictions on a target group.
We apply this method to multi-source data of 15,363 individuals from 20 neuroimaging studies to build ML models for diagnosis of Alzheimer's disease and estimation of brain age.
arXiv Detail & Related papers (2023-08-06T18:05:39Z) - Who will Leave a Pediatric Weight Management Program and When? -- A machine learning approach for predicting attrition patterns [1.0705399532413615]
Multidisciplinary pediatric weight management programs are considered standard treatment for children with obesity and severe obesity.
High drop-out rates (referred to as attrition) are a major hurdle in delivering successful interventions.
We present a machine learning model to predict (a) the likelihood of attrition, and (b) the change in body-mass index (BMI) percentile of children, at different time points after joining a weight management program.
arXiv Detail & Related papers (2022-02-03T18:41:36Z) - Fair and accurate age prediction using distribution aware data curation and augmentation [42.98202989683421]
Age prediction is an especially difficult application with the issue of fairness remaining an open research problem.
One of the main causes of unfair behavior in age prediction methods lies in the distribution and diversity of the training data.
We present two novel approaches for dataset curation and data augmentation in order to increase fairness.
arXiv Detail & Related papers (2020-09-11T08:32:36Z)