Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose
Recognition with Synthetic Data
- URL: http://arxiv.org/abs/2205.01892v1
- Date: Wed, 4 May 2022 04:59:26 GMT
- Title: Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose
Recognition with Synthetic Data
- Authors: Cheng-Yen Yang, Zhongyu Jiang, Shih-Yu Gu, Jenq-Neng Hwang, Jang-Hee
Yoo
- Abstract summary: We present a CNN-based model which takes any infant image as input and predicts the coarse and fine-level pose labels.
Our experimental results show that the proposed method can significantly align the distribution of synthetic and real-world datasets.
- Score: 28.729049747477085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Alberta Infant Motor Scale (AIMS) is a well-known assessment scheme that
evaluates the gross motor development of infants by recording the number of
specific poses achieved. With the aid of an image-based pose recognition
model, the AIMS evaluation procedure can be shortened and automated, providing
an early diagnosis or indicator of potential developmental disorders. Due to
the scarcity of public infant-related datasets, many works use SMIL-based methods to
generate synthetic infant images for training. However, the domain mismatch
between real and synthetic training samples often leads to performance
degradation during inference. In this paper, we present a CNN-based model which
takes any infant image as input and predicts the coarse and fine-level pose
labels. The model consists of an image branch and a pose branch, which
respectively generate the coarse-level logits, aided by unsupervised
domain adaptation, and the 3D keypoints, using HRNet with SMPLify
optimization. The outputs of both branches are then sent to a
hierarchical pose recognition module to estimate the fine-level pose labels. We
also collect and label a new AIMS dataset, which contains 750 real and 4000
synthetic infant images with AIMS pose labels. Our experimental results show
that the proposed method can significantly align the distribution of synthetic
and real-world datasets, thus achieving accurate performance on fine-grained
infant pose recognition.
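
The hierarchical step described in the abstract (coarse-level logits from the image branch narrowing the choice of fine-level label) can be illustrated with a minimal sketch. This is not the authors' implementation: the label names in `HIERARCHY`, the `fine_scores` input (standing in for whatever the pose branch derives from 3D keypoints), and the function `predict_hierarchical` are all hypothetical. Only the four coarse groups, which mirror the AIMS subscales, are taken from the assessment itself.

```python
import math

# Hypothetical two-level label hierarchy. The four coarse groups mirror the
# AIMS subscales; the fine-level names are placeholders, not the paper's labels.
HIERARCHY = {
    "prone": ["prone_a", "prone_b"],
    "supine": ["supine_a", "supine_b"],
    "sitting": ["sitting_a", "sitting_b"],
    "standing": ["standing_a", "standing_b"],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_hierarchical(coarse_logits, fine_scores):
    """Pick the coarse label first, then the best fine label inside it.

    coarse_logits: dict mapping coarse label -> logit (image branch output).
    fine_scores:   dict mapping fine label -> score (assumed to come from
                   the pose branch; how it is computed is not modeled here).
    """
    names = list(coarse_logits)
    probs = softmax([coarse_logits[n] for n in names])
    coarse = names[max(range(len(names)), key=probs.__getitem__)]
    # Only fine labels that belong to the chosen coarse group are considered.
    candidates = HIERARCHY[coarse]
    fine = max(candidates, key=lambda f: fine_scores.get(f, float("-inf")))
    return coarse, fine


coarse_logits = {"prone": 0.1, "supine": 2.0, "sitting": -1.0, "standing": 0.3}
fine_scores = {"prone_a": 9.0, "supine_a": 0.2, "supine_b": 0.7}
result = predict_hierarchical(coarse_logits, fine_scores)
print(result)
```

In this example `prone_a` has the highest raw fine-level score, but it is excluded because the coarse prediction is `supine`, so the restricted choice falls on `supine_b`.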
Related papers
- Improving Interpretability and Robustness for the Detection of AI-Generated Images [6.116075037154215]
We analyze existing state-of-the-art AIGI detection methods based on frozen CLIP embeddings.
We show how to interpret them, shedding light on how images produced by various AI generators differ from real ones.
(2024-06-21T10:33:09Z)
- Scaling Laws of Synthetic Images for Model Training ... for Now [54.43596959598466]
We study the scaling laws of synthetic images generated by state of the art text-to-image models.
We observe that synthetic images demonstrate a scaling trend similar to, but slightly less effective than, real images in CLIP training.
(2023-12-07T18:59:59Z)
- On quantifying and improving realism of images generated with diffusion [50.37578424163951]
We propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image.
IRS is easily usable as a measure to classify a given image as real or fake.
We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), Dalle2, Midjourney and BigGAN.
Our efforts have also led to the Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models.
(2023-09-26T08:32:55Z)
- Semi-supervised Body Parsing and Pose Estimation for Enhancing Infant
General Movement Assessment [11.33138866472943]
General movement assessment (GMA) of infant movement videos (IMVs) is an effective method for early detection of cerebral palsy (CP) in infants.
We demonstrate in this paper that end-to-end trainable neural networks for image sequence recognition can be applied to achieve good results in GMA.
We propose a semi-supervised model, termed SiamParseNet (SPN), which consists of two branches, one for intra-frame body parts segmentation and another for inter-frame label propagation.
(2022-10-14T18:46:30Z)
- High-resolution semantically-consistent image-to-image translation [0.0]
This paper proposes an unsupervised domain adaptation model that preserves semantic consistency and per-pixel quality for the images during the style-transferring phase.
The proposed model shows a substantial performance gain over the SemI2I model and reaches results similar to those of the state-of-the-art CyCADA model.
(2022-09-13T19:08:30Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs)
(2022-06-30T18:31:51Z)
- Generative Self-training for Cross-domain Unsupervised Tagged-to-Cine
MRI Synthesis [10.636015177721635]
We propose a novel generative self-training framework with continuous value prediction and regression objective for cross-domain image synthesis.
Specifically, we propose to filter the pseudo-label with an uncertainty mask, and quantify the predictive confidence of generated images with practical variational Bayes learning.
(2021-06-23T16:19:00Z)
- You Only Need Adversarial Supervision for Semantic Image Synthesis [84.83711654797342]
We propose a novel, simplified GAN model, which needs only adversarial supervision to achieve high quality results.
We show that images synthesized by our model are more diverse and follow the color and texture of real images more closely.
(2020-12-08T23:00:48Z)
- Deep Low-Shot Learning for Biological Image Classification and
Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
Labeling training data with precise stages is very time-consuming even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
(2020-10-20T06:06:06Z)
- Invariant Representation Learning for Infant Pose Estimation with Small
Data [14.91506452479778]
We release a hybrid synthetic and real infant pose dataset with small yet diverse real images as well as generated synthetic infant poses.
In our ablation study, with identical network structure, models trained on the SyRIP dataset show noticeable improvement over those trained on the only other public infant pose datasets.
One of our best infant pose estimation performers on the state-of-the-art DarkPose model shows a mean average precision (mAP) of 93.6.
(2020-10-13T01:10:14Z)
- Improved Slice-wise Tumour Detection in Brain MRIs by Computing
Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
(2020-07-24T14:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.