ZebraPose: Zebra Detection and Pose Estimation using only Synthetic Data
- URL: http://arxiv.org/abs/2408.10831v1
- Date: Tue, 20 Aug 2024 13:28:37 GMT
- Title: ZebraPose: Zebra Detection and Pose Estimation using only Synthetic Data
- Authors: Elia Bonetto, Aamir Ahmad
- Abstract summary: We use synthetic data generated with a 3D simulator to obtain the first synthetic dataset that can be used for both detection and 2D pose estimation of zebras.
We extensively train and benchmark our detection and 2D pose estimation models on multiple real-world and synthetic datasets.
These experiments show how the models trained from scratch and only with synthetic data can consistently generalize to real-world images of zebras.
- Score: 0.2302001830524133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthetic data is increasingly being used to address the lack of labeled images in uncommon domains for deep learning tasks. A prominent example is 2D pose estimation of animals, particularly wild species like zebras, for which collecting real-world data is complex and impractical. However, many approaches still require real images, consistency and style constraints, sophisticated animal models, and/or powerful pre-trained networks to bridge the syn-to-real gap. Moreover, they often assume that the animal can be reliably detected in images or videos, a hypothesis that often does not hold, e.g. in wildlife scenarios or aerial images. To solve this, we use synthetic data generated with a 3D photorealistic simulator to obtain the first synthetic dataset that can be used for both detection and 2D pose estimation of zebras without applying any of the aforementioned bridging strategies. Unlike previous works, we extensively train and benchmark our detection and 2D pose estimation models on multiple real-world and synthetic datasets using both pre-trained and non-pre-trained backbones. These experiments show how the models trained from scratch and only with synthetic data can consistently generalize to real-world images of zebras in both tasks. Moreover, we show it is possible to easily generalize those same models to 2D pose estimation of horses with a minimal amount of real-world images to account for the domain transfer. Code, results, trained models; and the synthetic, training, and validation data, including 104K manually labeled frames, are provided as open-source at https://zebrapose.is.tue.mpg.de/
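The workflow described in the abstract (training detection and 2D pose estimation models from scratch on synthetic frames, then evaluating them on real zebra images) can be sketched roughly as follows. This is a minimal illustration only: the choice of torchvision's Keypoint R-CNN, the 17-keypoint skeleton, and the data-handling details are assumptions made for the example, not the models or code released by the authors.

```python
# Minimal sketch (not the authors' released code): train a combined detection +
# 2D keypoint model from scratch on synthetic frames, then run it on real images.
import torch
from torchvision.models.detection import keypointrcnn_resnet50_fpn

NUM_KEYPOINTS = 17  # assumed skeleton size; the actual keypoint set is dataset-specific

# weights=None and weights_backbone=None disable all pre-training, i.e. "from scratch".
model = keypointrcnn_resnet50_fpn(
    weights=None,
    weights_backbone=None,
    num_classes=2,              # background + zebra
    num_keypoints=NUM_KEYPOINTS,
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.02, momentum=0.9, weight_decay=1e-4)

def train_one_epoch(model, synthetic_loader, optimizer, device="cuda"):
    """One pass over synthetic frames with simulator-generated boxes and keypoints."""
    model.train().to(device)
    for images, targets in synthetic_loader:
        # targets[i] = {"boxes": [N,4], "labels": [N], "keypoints": [N,K,3]} (x, y, visibility)
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)   # detection + keypoint losses
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

@torch.no_grad()
def predict(model, real_image, device="cuda"):
    """Inference on a real-world image: predicted boxes, scores, and 2D keypoints."""
    model.eval().to(device)
    out = model([real_image.to(device)])[0]
    return out["boxes"], out["scores"], out["keypoints"]
```

In this sketch the synthetic loader yields simulator frames with ground-truth boxes and keypoints, while `predict` is applied to unmodified real photographs, mirroring the synthetic-only, from-scratch training setup the abstract describes.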
Related papers
- Learning the 3D Fauna of the Web [70.01196719128912]
We develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly.
One crucial bottleneck of modeling animals is the limited availability of training data.
We show that prior category-specific attempts fail to generalize to rare species with limited training images.
arXiv Detail & Related papers (2024-01-04T18:32:48Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Of Mice and Pose: 2D Mouse Pose Estimation from Unlabelled Data and Synthetic Prior [0.7499722271664145]
We propose an approach for estimating 2D mouse body pose from unlabelled images using a synthetically generated empirical pose prior.
We adapt this method to the limb structure of the mouse and generate the empirical prior of 2D poses from a synthetic 3D mouse model.
In experiments on a new mouse video dataset, we evaluate the performance of the approach by comparing pose predictions to a manually obtained ground truth.
arXiv Detail & Related papers (2023-07-25T09:31:55Z)
- Synthetic Data-based Detection of Zebras in Drone Imagery [0.8249180979158817]
We present an approach for training an animal detector using only synthetic data.
The dataset includes RGB, depth, skeletal joint locations, pose, shape and instance segmentations for each subject.
We show that we can detect zebras by using only synthetic data during training.
arXiv Detail & Related papers (2023-04-30T09:24:31Z)
- TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic texture of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z)
- Prior-Aware Synthetic Data to the Rescue: Animal Pose Estimation with Very Limited Real Data [18.06492246414256]
We present a data efficient strategy for pose estimation in quadrupeds that requires only a small amount of real images from the target animal.
Fine-tuning a backbone network with weights pretrained on generic image datasets such as ImageNet is confirmed to mitigate the high demand for target-animal pose data (a generic sketch of this idea follows the entry).
We introduce a prior-aware synthetic animal data generation pipeline called PASyn to augment the animal pose data essential for robust pose estimation.
arXiv Detail & Related papers (2022-08-30T01:17:50Z)
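The fine-tuning recipe summarized in the entry above, starting from a backbone pretrained on a generic image dataset and adapting it with limited real pose data, can be illustrated with the generic sketch below. The ResNet-50 backbone, the simple coordinate-regression head, and the choice of frozen layers are assumptions made for illustration; this is not the PASyn pipeline or the ZebraPose code.

```python
# Generic illustration: fine-tune an ImageNet-pretrained backbone on a small
# set of real, labeled pose images of the target animal.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

NUM_KEYPOINTS = 17  # assumed skeleton size

backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)  # generic pre-training
# Replace the classification head with a simple keypoint-coordinate regression head.
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_KEYPOINTS * 2)

# Freeze early layers; with very few real images, only the last block and head are trained.
for name, param in backbone.named_parameters():
    param.requires_grad = name.startswith(("layer4", "fc"))

optimizer = torch.optim.AdamW(
    (p for p in backbone.parameters() if p.requires_grad), lr=1e-4
)
criterion = nn.MSELoss()

def finetune_step(images, keypoints_xy):
    """images: [B,3,H,W]; keypoints_xy: [B, K*2] normalized target coordinates."""
    backbone.train()
    pred = backbone(images)
    loss = criterion(pred, keypoints_xy)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing everything except the last residual block and the new head keeps the number of trainable parameters small, which is what makes training on only a handful of real images feasible.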
- Learning Dense Correspondence from Synthetic Environments [27.841736037738286]
Existing methods map manually labelled human pixels in real 2D images onto the 3D surface, which is prone to human error.
We propose to solve the problem of data scarcity by training 2D-3D human mapping algorithms using automatically generated synthetic data.
arXiv Detail & Related papers (2022-03-24T08:13:26Z)
- Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation [121.02948087956955]
For some applications, such as those in space or deep under water, acquiring real images, even unannotated, is virtually impossible.
We propose a method that can be trained solely on synthetic images, or optionally using a few additional real images.
It performs on par with methods that require annotated real images for training when not using any, and outperforms them considerably when using as few as twenty real images.
arXiv Detail & Related papers (2022-03-18T10:20:21Z)
- DynaDog+T: A Parametric Animal Model for Synthetic Canine Image Generation [23.725295519857976]
We introduce a parametric canine model, DynaDog+T, for generating synthetic canine images and data.
We use this data for a common computer vision task, binary segmentation, which would otherwise be difficult due to the lack of available data.
arXiv Detail & Related papers (2021-07-15T13:53:10Z)
- Cascaded deep monocular 3D human pose estimation with evolutionary training data [76.3478675752847]
Deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation.
This paper proposes a novel data augmentation method that is scalable for massive amount of training data.
Our method synthesizes unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge.
arXiv Detail & Related papers (2020-06-14T03:09:52Z)
- Deformation-aware Unpaired Image Translation for Pose Estimation on Laboratory Animals [56.65062746564091]
We aim to capture the pose of neuroscience model organisms, without using any manual supervision, to study how neural circuits orchestrate behaviour.
Our key contribution is the explicit and independent modeling of appearance, shape and poses in an unpaired image translation framework.
We demonstrate improved pose estimation accuracy on Drosophila melanogaster (fruit fly), Caenorhabditis elegans (worm), and Danio rerio (zebrafish).
arXiv Detail & Related papers (2020-01-23T15:34:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.