WheelPose: Data Synthesis Techniques to Improve Pose Estimation Performance on Wheelchair Users
- URL: http://arxiv.org/abs/2404.17063v1
- Date: Thu, 25 Apr 2024 22:17:32 GMT
- Title: WheelPose: Data Synthesis Techniques to Improve Pose Estimation Performance on Wheelchair Users
- Authors: William Huang, Sam Ghahremani, Siyou Pei, Yang Zhang
- Abstract summary: Existing pose estimation models perform poorly on wheelchair users due to a lack of representation in training data.
We present a data synthesis pipeline to address this disparity in data collection.
Our pipeline generates synthetic data of wheelchair users using motion capture data and motion generation outputs simulated in the Unity game engine.
- Score: 5.057643544417776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing pose estimation models perform poorly on wheelchair users due to a lack of representation in training data. We present a data synthesis pipeline to address this disparity in data collection and subsequently improve pose estimation performance for wheelchair users. Our configurable pipeline generates synthetic data of wheelchair users using motion capture data and motion generation outputs simulated in the Unity game engine. We validated our pipeline by conducting a human evaluation, investigating perceived realism, diversity, and an AI performance evaluation on a set of synthetic datasets from our pipeline that synthesized different backgrounds, models, and postures. We found our generated datasets were perceived as realistic by human evaluators, had more diversity than existing image datasets, and had improved person detection and pose estimation performance when fine-tuned on existing pose estimation models. Through this work, we hope to create a foothold for future efforts in tackling the inclusiveness of AI in a data-centric and human-centric manner with the data synthesis techniques demonstrated in this work. Finally, for future works to extend upon, we open source all code in this research and provide a fully configurable Unity Environment used to generate our datasets. In the case of any models we are unable to share due to redistribution and licensing policies, we provide detailed instructions on how to source and replace said models.
Related papers
- Massively Annotated Datasets for Assessment of Synthetic and Real Data in Face Recognition [0.2775636978045794]
We study the drift between the performance of models trained on real and synthetic datasets.
We conduct studies on the differences between real and synthetic datasets on the attribute set.
Interestingly enough, we have verified that while real samples suffice to explain the synthetic distribution, the opposite could not be further from being true.
arXiv Detail & Related papers (2024-04-23T17:10:49Z) - LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment [59.320414108383055]
We present LiveHPS, a novel single-LiDAR-based approach for scene-level human pose and shape estimation.
We propose a huge human motion dataset, named FreeMotion, which is collected in various scenarios with diverse human poses.
arXiv Detail & Related papers (2024-02-27T03:08:44Z) - Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z) - Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z) - Improving 2D Human Pose Estimation in Rare Camera Views with Synthetic Data [24.63316659365843]
We introduce RePoGen, an SMPL-based method for generating synthetic humans with comprehensive control over pose and view.
Experiments on top-view datasets and a new dataset of real images with diverse poses show that adding the RePoGen data to the COCO dataset outperforms previous approaches.
arXiv Detail & Related papers (2023-07-13T13:17:50Z) - Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - Learning from synthetic data generated with GRADE [0.6982738885923204]
We present a framework for generating realistic animated dynamic environments (GRADE) for robotics research.
GRADE supports full simulation control, ROS integration, and realistic physics, while being in an engine that produces high visual fidelity images and ground truth data.
We show that models, even when trained using only synthetic data, can generalize well to real-world images in the same application domain.
arXiv Detail & Related papers (2023-05-07T14:13:04Z) - Development of a Realistic Crowd Simulation Environment for Fine-grained Validation of People Tracking Methods [0.7223361655030193]
This work develops an extension of crowd simulation (named CrowdSim2) and proves its usability in the application of people-tracking algorithms.
The simulator is developed using the very popular Unity 3D engine with particular emphasis on the aspects of realism in the environment.
Three tracking methods were used to validate the generated dataset: IOU-Tracker, Deep-Sort, and Deep-TAMA.
arXiv Detail & Related papers (2023-04-26T09:29:58Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - PeopleSansPeople: A Synthetic Data Generator for Human-Centric Computer Vision [3.5694949627557846]
We release a human-centric synthetic data generator PeopleSansPeople.
It contains simulation-ready 3D human assets, a parameterized lighting and camera system, and generates 2D and 3D bounding box, instance and semantic segmentation, and COCO pose labels.
arXiv Detail & Related papers (2021-12-17T02:33:31Z)