NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry
- URL: http://arxiv.org/abs/2405.05530v1
- Date: Thu, 9 May 2024 03:49:54 GMT
- Title: NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry
- Authors: Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand Tapaswi
- Abstract summary: Malnutrition among newborns is a top public health concern in developing countries.
Our goal is to equip health workers and public health systems with a solution for contactless newborn anthropometry in the community.
- Score: 32.07154968009373
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Malnutrition among newborns is a top public health concern in developing countries. Identification and subsequent growth monitoring are key to successful interventions. However, this is challenging in rural communities where health systems tend to be inaccessible and under-equipped, with poor adherence to protocol. Our goal is to equip health workers and public health systems with a solution for contactless newborn anthropometry in the community. We propose NurtureNet, a multi-task model that fuses visual information (a video taken with a low-cost smartphone) with tabular inputs to regress multiple anthropometry estimates including weight, length, head circumference, and chest circumference. We show that visual proxy tasks of segmentation and keypoint prediction further improve performance. We establish the efficacy of the model through several experiments and achieve a relative error of 3.9% and mean absolute error of 114.3 g for weight estimation. Model compression to 15 MB also allows offline deployment to low-cost smartphones.
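The two weight-estimation figures quoted in the abstract (mean absolute error in grams and relative error in percent) can be illustrated with a minimal sketch. The abstract does not spell out how relative error is computed, so the definition below (mean of |predicted − true| / true) is an assumption, and the function names and sample weights are hypothetical rather than taken from the paper.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference, in the same unit as the inputs (e.g. grams)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_relative_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |pred - true| / true, as a fraction (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t <= 0 for t in y_true):
        raise ValueError("true values must be positive")
    return sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical newborn weights in grams; not data from the paper.
true_weights = [2500.0, 3000.0, 3500.0]
predicted_weights = [2600.0, 2900.0, 3550.0]

mae_grams = mean_absolute_error(true_weights, predicted_weights)
relative_error = mean_relative_error(true_weights, predicted_weights)
print(f"MAE: {mae_grams:.1f} g, relative error: {relative_error * 100:.1f}%")
```

Reporting both metrics is useful because an absolute error in grams weighs more heavily for a low-birth-weight newborn than for a heavier one, which the relative figure makes visible.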
Related papers
- NutriScreener: Retrieval-Augmented Multi-Pose Graph Attention Network for Malnourishment Screening [33.31396710382974]
NutriScreener is a retrieval-augmented, multi-pose graph attention network.
It combines CLIP-based visual embeddings, class-boosted knowledge retrieval, and context awareness.
It achieves 0.79 recall, 0.82 AUC, and significantly lower anthropometric RMSEs.
arXiv Detail & Related papers (2025-11-20T17:20:42Z) - Vision-Based Embedded System for Noncontact Monitoring of Preterm Infant Behavior in Low-Resource Care Settings [0.0]
Preterm birth is a leading cause of neonatal mortality, disproportionately affecting low-resource settings with limited access to advanced neonatal intensive care units (NICUs).
This paper presents a novel, noninvasive, and automated vision-based framework to address this gap.
We introduce an embedded monitoring system that utilizes a quantized MobileNet model deployed on a Raspberry Pi for real-time behavioral state detection.
arXiv Detail & Related papers (2025-09-02T07:05:47Z) - MedGemma Technical Report [75.88152277443179]
We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B.
MedGemma demonstrates advanced medical understanding and reasoning on images and text.
We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP.
arXiv Detail & Related papers (2025-07-07T17:01:44Z) - An Interoperable Machine Learning Pipeline for Pediatric Obesity Risk Estimation [39.82363561134585]
No commonly used clinical decision support tool based on existing ML models currently exists.
This study presents a novel end-to-end pipeline specifically designed for pediatric obesity prediction.
Our pipeline supports the entire process of data extraction, inference, and communication via an API or a user interface.
arXiv Detail & Related papers (2024-12-12T07:25:37Z) - IRL for Restless Multi-Armed Bandits with Applications in Maternal and Child Health [52.79219652923714]
This paper is the first to present the use of inverse reinforcement learning (IRL) to learn desired rewards for RMABs.
We demonstrate improved outcomes in a maternal and child health telehealth program.
arXiv Detail & Related papers (2024-12-11T15:28:04Z) - AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation [55.179287851188036]
We introduce a novel all-in-one-stage framework, AiOS, for expressive human pose and shape recovery without an additional human detection step.
We first employ a human token to probe a human location in the image and encode global features for each instance.
Then, we introduce a joint-related token to probe the human joint in the image and encode a fine-grained local feature.
arXiv Detail & Related papers (2024-03-26T17:59:23Z) - Binarized 3D Whole-body Human Mesh Recovery [104.13364878565737]
We propose a Binarized Dual Residual Network (BiDRN) to estimate the 3D human body, face, and hands parameters efficiently.
BiDRN achieves comparable performance with full-precision method Hand4Whole while using just 22.1% parameters and 14.8% operations.
arXiv Detail & Related papers (2023-11-24T07:51:50Z) - Human Health Indicator Prediction from Gait Video [34.24448186464565]
We propose to employ gait videos to predict health indicators, which are more prevalent in surveillance and home monitoring scenarios.
To better suit the health indicator prediction task, we bring forward the Global-Local Aware aNd Centrosymmetric Encoder (GLANCE) module.
Experiments demonstrate that the proposed paradigm achieves state-of-the-art results for predicting health indicators on MoVi.
arXiv Detail & Related papers (2022-12-25T19:10:37Z) - A Two-stream Convolutional Network for Musculoskeletal and Neurological
Disorders Prediction [14.003588854239544]
Musculoskeletal and neurological disorders are the most common causes of walking problems among older people.
Recent deep learning-based methods have shown promising results for automated analysis.
arXiv Detail & Related papers (2022-08-18T14:32:16Z) - Machine Learning-based Biological Ageing Estimation Technologies: A
Survey [2.9554549423413303]
We mainly review three age prediction methods that use machine learning (ML).
They are based on blood biomarkers, facial images, and structural features.
Their prediction accuracy remains limited, which restricts their contribution to the medical field.
arXiv Detail & Related papers (2022-06-25T13:38:39Z) - A Neural Anthropometer Learning from Body Dimensions Computed on Human
3D Meshes [0.0]
We present a method to calculate right and left arm length, shoulder width, and inseam (crotch height) from 3D meshes with focus on potential medical, virtual try-on and distance tailoring applications.
On the other hand, we use four additional body dimensions calculated using recently published methods to assemble a set of eight body dimensions which we use as a supervision signal to our Neural Anthropometer: a convolutional neural network capable of estimating these dimensions.
arXiv Detail & Related papers (2021-10-06T12:56:05Z) - Unsupervised Human Pose Estimation through Transforming Shape Templates [2.729524133721473]
We present a novel method for learning pose estimators for human adults and infants in an unsupervised fashion.
We demonstrate the effectiveness of our approach on two different datasets including adults and infants.
arXiv Detail & Related papers (2021-05-10T07:15:56Z) - 3D Human Body Reshaping with Anthropometric Modeling [59.51820187982793]
Reshaping accurate and realistic 3D human bodies from anthropometric parameters poses a fundamental challenge for person identification, online shopping and virtual reality.
Existing approaches for creating such 3D shapes often suffer from complex measurement by range cameras or high-end scanners.
This paper proposes a novel feature-selection-based local mapping technique, which enables automatic anthropometric parameter modeling for each body facet.
arXiv Detail & Related papers (2021-04-05T04:09:39Z) - Hybrid Attention for Automatic Segmentation of Whole Fetal Head in
Prenatal Ultrasound Volumes [52.53375964591765]
We propose the first fully-automated solution to segment the whole fetal head in US volumes.
The segmentation task is firstly formulated as an end-to-end volumetric mapping under an encoder-decoder deep architecture.
We then combine the segmentor with a proposed hybrid attention scheme (HAS) to select discriminative features and suppress the non-informative volumetric features.
arXiv Detail & Related papers (2020-04-28T14:43:05Z) - HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose
and Shape Estimation [60.35776484235304]
This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets).
The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part.
A Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression.
arXiv Detail & Related papers (2020-03-10T04:03:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.