Invariant Representation Learning for Infant Pose Estimation with Small Data
- URL: http://arxiv.org/abs/2010.06100v5
- Date: Mon, 1 Nov 2021 17:22:05 GMT
- Title: Invariant Representation Learning for Infant Pose Estimation with Small Data
- Authors: Xiaofei Huang, Nihang Fu, Shuangjun Liu, Sarah Ostadabbas
- Abstract summary: We release a hybrid synthetic and real infant pose dataset with a small yet diverse set of real images as well as generated synthetic infant poses.
In our ablation study, with identical network structure, models trained on the SyRIP dataset show noticeable improvement over those trained on the only other public infant pose datasets.
One of our best infant pose estimation performers, built on the state-of-the-art DarkPose model, achieves a mean average precision (mAP) of 93.6.
- Score: 14.91506452479778
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Infant motion analysis is a topic of critical importance in early childhood
development studies. However, while the applications of human pose estimation
have become increasingly broad, models trained on large-scale adult pose
datasets are barely successful in estimating infant poses due to the
significant differences in infants' body proportions and the versatility of
their poses. Moreover, privacy and security considerations hinder the
availability of adequate infant pose data required for training a robust model
from scratch.
To address this problem, this paper presents (1) a publicly released hybrid
synthetic and real infant pose (SyRIP) dataset with a small yet diverse set of
real infant images as well as generated synthetic infant poses and
(2) a multi-stage invariant representation learning strategy that can
transfer knowledge from the adjacent domains of adult poses and synthetic
infant images into our fine-tuned domain-adapted infant pose (FiDIP) estimation
model. In our ablation study, with identical network structure, models trained
on the SyRIP dataset show noticeable improvement over those trained on the only
other public infant pose datasets. Integrated with pose estimation backbone
networks of varying complexity, FiDIP performs consistently better than the
fine-tuned versions of those models. One of our best infant pose estimation
performers, built on the state-of-the-art DarkPose model, achieves a mean
average precision (mAP) of 93.6.
Related papers
- Efficient Domain Adaptation via Generative Prior for 3D Infant Pose Estimation [29.037799937729687]
3D human pose estimation has advanced impressively in recent years, but only a few works focus on infants, who have different bone lengths and for whom data are limited.
Here, we show that our model attains state-of-the-art MPJPE performance of 43.6 mm on the SyRIP dataset and 21.2 mm on the MINI-RGBD dataset.
We also prove that our method, ZeDO-i, can attain efficient domain adaptation even when only a small amount of data is given.
(2023-11-17T20:49:37Z)
- Understanding Pose and Appearance Disentanglement in 3D Human Pose Estimation [72.50214227616728]
Several methods have been proposed to learn image representations in a self-supervised fashion so as to disentangle appearance information from pose information.
We study disentanglement from the perspective of the self-supervised network, via diverse image synthesis experiments.
We design an adversarial strategy focusing on generating natural appearance changes of the subject, and against which we could expect a disentangled network to be robust.
(2023-09-20T22:22:21Z)
- TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic texture of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
(2022-12-25T13:36:32Z)
- Prior-Aware Synthetic Data to the Rescue: Animal Pose Estimation with Very Limited Real Data [18.06492246414256]
We present a data-efficient strategy for pose estimation in quadrupeds that requires only a small number of real images from the target animal.
It is confirmed that fine-tuning a backbone network with pretrained weights on generic image datasets such as ImageNet can mitigate the high demand for target animal pose data.
We introduce a prior-aware synthetic animal data generation pipeline called PASyn to augment the animal pose data essential for robust pose estimation.
(2022-08-30T01:17:50Z)
- AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation [6.9000851935487075]
We propose an infant pose dataset and a Deep Aggregation Vision Transformer for human pose estimation.
AggPose is a fast-trained, full-transformer framework that does not use convolution operations to extract features in the early stages.
We show that AggPose could effectively learn the multi-scale features among different resolutions and significantly improve the performance of infant pose estimation.
(2022-05-11T05:34:14Z)
- Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose Recognition with Synthetic Data [28.729049747477085]
We present a CNN-based model which takes any infant image as input and predicts the coarse and fine-level pose labels.
Our experimental results show that the proposed method can significantly align the distribution of synthetic and real-world datasets.
(2022-05-04T04:59:26Z)
- Unsupervised Human Pose Estimation through Transforming Shape Templates [2.729524133721473]
We present a novel method for learning pose estimators for human adults and infants in an unsupervised fashion.
We demonstrate the effectiveness of our approach on two different datasets including adults and infants.
(2021-05-10T07:15:56Z)
- FixMyPose: Pose Correctional Captioning and Retrieval [67.20888060019028]
We introduce a new captioning dataset named FixMyPose to address automated pose correction systems.
We collect descriptions of correcting a "current" pose to look like a "target" pose.
To avoid ML biases, we maintain a balance across characters with diverse demographics.
(2021-04-04T21:45:44Z)
- Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement [63.853412753242615]
Learning a good 3D human pose representation is important for human pose-related tasks.
We propose a novel Siamese denoising autoencoder to learn a 3D pose representation.
Our approach achieves state-of-the-art performance on two inherently different tasks.
(2020-07-14T14:25:22Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
(2020-04-09T07:55:01Z)
- Deformation-aware Unpaired Image Translation for Pose Estimation on Laboratory Animals [56.65062746564091]
We aim to capture the pose of neuroscience model organisms, without using any manual supervision, to study how neural circuits orchestrate behaviour.
Our key contribution is the explicit and independent modeling of appearance, shape and poses in an unpaired image translation framework.
We demonstrate improved pose estimation accuracy on Drosophila melanogaster (fruit fly), Caenorhabditis elegans (worm) and Danio rerio (zebrafish).
(2020-01-23T15:34:11Z)
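
The first related entry above quotes results as MPJPE (mean per-joint position error) in millimetres. As a rough illustration of what that number measures, here is a minimal sketch for a single pose. It is not the evaluation code of any paper listed here; the function name `mpjpe` and the three-joint example values are made up for illustration. Published evaluations typically also align the poses (for example at a root joint) and average over all test frames, and those steps are omitted.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error for one pose.

    Each pose is a sequence of (x, y, z) joint coordinates, given in the
    same unit (e.g. millimetres) and in the same joint order.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # Euclidean distance for each joint, then the mean over all joints.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical three-joint pose, coordinates in millimetres (made-up values).
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 50.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 0.0), (100.0, 50.0, 12.0)]
error_mm = mpjpe(pred, gt)  # per-joint errors are 5, 0 and 12 mm
```

Lower values are better: an MPJPE of 21.2 mm means the predicted joints are, on average, about two centimetres away from their annotated positions.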