Related papers: Comparison of marker-less 2D image-based methods for infant pose estimation

URL: http://arxiv.org/abs/2410.04980v2
Date: Tue, 26 Nov 2024 11:59:22 GMT
Title: Comparison of marker-less 2D image-based methods for infant pose estimation
Authors: Lennart Jahn, Sarah Flügge, Dajie Zhang, Luise Poustka, Sven Bölte, Florentin Wörgötter, Peter B Marschik, Tomas Kulvicius
Abstract summary: The best performing generic model trained on adults, ViTPose, also performs best on infants. The pose estimation accuracy obtained from the top-down view is significantly better than that obtained from the diagonal view.
Score: 2.7726930707973048
Abstract: In this study we compare the performance of available generic- and infant-pose estimators for a video-based automated general movement assessment (GMA), and the choice of viewing angle for optimal recordings, i.e., the conventional diagonal view used in GMA vs. a top-down view. We used 4500 annotated video-frames from 75 recordings of infant spontaneous motor functions from 4 to 26 weeks. To determine which pose estimation method and camera angle yield the best pose estimation accuracy on infants in a GMA-related setting, the distance to human annotations and the percentage of correct key-points (PCK) were computed and compared. The results show that the best performing generic model trained on adults, ViTPose, also performs best on infants. We see no improvement from using infant-pose estimators over the generic pose estimators on our infant dataset. However, when retraining a generic model on our data, there is a significant improvement in pose estimation accuracy. The pose estimation accuracy obtained from the top-down view is significantly better than that obtained from the diagonal view, especially for the detection of the hip key-points. The results also indicate limited generalization capabilities of infant-pose estimators to other infant datasets, which suggests that one should be careful when choosing infant pose estimators and using them on infant datasets on which they were not trained. While the standard GMA method uses a diagonal view for assessment, pose estimation accuracy significantly improves using a top-down view. This suggests that a top-down view should be included in recording setups for automated GMA research.
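
The two metrics named in the abstract, distance to human annotations and PCK, can be illustrated with a minimal sketch. The abstract does not state which reference length or threshold the authors used, so `ref_length` (here a torso diagonal) and `alpha = 0.2` are assumptions, and the coordinates are made-up values for illustration only.

```python
import math

def keypoint_errors(pred, truth):
    """Euclidean distance (in pixels) between each predicted and annotated key-point."""
    return [math.dist(p, t) for p, t in zip(pred, truth)]

def pck(pred, truth, ref_length, alpha=0.2):
    """Fraction of key-points whose error is at most alpha * ref_length.

    ref_length and alpha are assumptions; the abstract does not specify them.
    """
    threshold = alpha * ref_length
    errors = keypoint_errors(pred, truth)
    return sum(1 for e in errors if e <= threshold) / len(errors)

# Hypothetical annotations: left shoulder, right shoulder, left hip, right hip.
annotated = [(100.0, 100.0), (150.0, 100.0), (100.0, 200.0), (150.0, 200.0)]
predicted = [(103.0, 104.0), (150.0, 101.0), (130.0, 240.0), (149.0, 198.0)]

# Assumed reference length: diagonal from one shoulder to the opposite hip.
torso = math.dist(annotated[0], annotated[3])

errors = keypoint_errors(predicted, annotated)
score = pck(predicted, annotated, ref_length=torso, alpha=0.2)
print(errors, score)
```

In this made-up example the left hip prediction is far off, so three of the four key-points fall within the threshold. Reporting the raw distances alongside PCK, as the abstract describes, shows how large such misses are rather than only counting them.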

Related papers

Advancing Newborn Care: Precise Birth Time Detection Using AI-Driven Thermal Imaging with Adaptive Normalization [1.101731711817642]
We investigate the fusion of Artificial Intelligence (AI) and thermal imaging to develop the first AI-driven Time of Birth detector. Our methodology involves a three-step process: first, we propose an adaptive normalization method based on Gaussian mixture models (GMM) to mitigate issues related to temperature variations. A precision of 88.1% and a recall of 89.3% are reported in the detection of the newborn within thermal frames during performance evaluation.
arXiv Detail & Related papers (2024-10-14T13:20:51Z)
SRPose: Two-view Relative Pose Estimation with Sparse Keypoints [51.49105161103385]
SRPose is a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios. It achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed. It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
arXiv Detail & Related papers (2024-07-11T05:46:35Z)
Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics [54.08757792080732]
We propose integrating deep features from pre-trained visual models with a statistical analysis model to achieve opinion-unaware BIQA (OU-BIQA). Our proposed model exhibits superior consistency with human visual perception compared to state-of-the-art BIQA models.
arXiv Detail & Related papers (2024-05-29T06:09:34Z)
Localizing Scan Targets from Human Pose for Autonomous Lung Ultrasound Imaging [61.60067283680348]
With the advent of the COVID-19 global pandemic, there is a need to fully automate ultrasound imaging. We propose a vision-based, data-driven method that incorporates learning-based computer vision techniques. Our method attains an accuracy level of 15.52 (9.47) mm for probe positioning and 4.32 (3.69) deg for probe orientation, with a success rate above 80% under an error threshold of 25 mm for all scan targets.
arXiv Detail & Related papers (2022-12-15T14:34:12Z)
Bottom-Up 2D Pose Estimation via Dual Anatomical Centers for Small-Scale Persons [75.86463396561744]
In multi-person 2D pose estimation, the bottom-up methods simultaneously predict poses for all persons. Our method achieves 38.4% improvement on bounding box precision and 39.1% improvement on bounding box recall over the state of the art (SOTA). For the human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing.
arXiv Detail & Related papers (2022-08-25T10:09:10Z)
AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation [6.9000851935487075]
We propose an infant pose dataset and a Deep Aggregation Vision Transformer for human pose estimation. AggPose is a fast-trained full transformer framework that does not use convolution operations to extract features in the early stages. We show that AggPose can effectively learn multi-scale features across different resolutions and significantly improve the performance of infant pose estimation.
arXiv Detail & Related papers (2022-05-11T05:34:14Z)
Enabling faster and more reliable sonographic assessment of gestational age through machine learning [1.3238745915345225]
Fetal ultrasounds are an essential part of prenatal care and can be used to estimate gestational age (GA). We developed three AI models: an image model using standard plane images, a video model using fly-to videos, and an ensemble model (combining both image and video). All three were statistically superior to standard fetal biometry-based GA estimates derived by expert sonographers.
arXiv Detail & Related papers (2022-03-22T17:15:56Z)
Invariant Representation Learning for Infant Pose Estimation with Small Data [14.91506452479778]
We release a hybrid synthetic and real infant pose dataset with small yet diverse real images as well as generated synthetic infant poses. In our ablation study, with identical network structure, models trained on SyRIP dataset show noticeable improvement over the ones trained on the only other public infant pose datasets. One of our best infant pose estimation performers on the state-of-the-art DarkPose model shows mean average precision (mAP) of 93.6.
arXiv Detail & Related papers (2020-10-13T01:10:14Z)
Towards End-to-end Video-based Eye-Tracking [50.0630362419371]
Estimating eye-gaze from images alone is a challenging task due to un-observable person-specific factors. We propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships. We demonstrate that the fusion of information from visual stimuli as well as eye images can lead towards achieving performance similar to literature-reported figures.
arXiv Detail & Related papers (2020-07-26T12:39:15Z)
Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates [76.51095823248104]
We present several schemes that have rarely or not thoroughly been studied before for improving keypoint detection and grouping (keypoint regression) performance. First, we exploit the keypoint heatmaps for pixel-wise keypoint regression instead of separating them for improving keypoint regression. Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling the scale and orientation variance. Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
arXiv Detail & Related papers (2020-06-28T01:14:59Z)
Preterm infants' pose estimation with spatio-temporal features [7.054093620465401]
This paper introduces the use of spatio-temporal features for limb detection and tracking. It is the first study to use depth videos acquired in actual clinical practice for limb-pose estimation.
arXiv Detail & Related papers (2020-05-08T09:51:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.