A Review of Deep Learning Techniques for Markerless Human Motion on
Synthetic Datasets
- URL: http://arxiv.org/abs/2201.02503v1
- Date: Fri, 7 Jan 2022 15:42:50 GMT
- Title: A Review of Deep Learning Techniques for Markerless Human Motion on
Synthetic Datasets
- Authors: Doan Duy Vo, Russell Butler
- Abstract summary: Estimating human posture has recently gained increasing attention in the computer vision community.
We present a model that can predict the skeleton of an animation based solely on 2D images.
The implementation applies DeepLabCut to its own dataset to perform the necessary preprocessing steps.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Markerless motion capture has become an active field of research
in computer vision in recent years. It has extensive applications in a wide
variety of fields, including computer animation, human motion analysis,
biomedical research, virtual reality, and sports science. Estimating human
posture has recently gained increasing attention in the computer vision
community, but it remains a challenging task due to depth ambiguity and the
scarcity of synthetic datasets. Various approaches have recently been
proposed to solve this problem, many of them based on deep learning. They
focus primarily on improving performance on existing benchmarks, with
significant advances especially on 2D images. Building on powerful deep
learning techniques and recently collected real-world datasets, we explore a
model that predicts the skeleton of an animation from 2D images alone.
Frames are generated from different real-world datasets, with poses
synthesized using body shapes ranging from simple to complex. The
implementation applies DeepLabCut to its own dataset to perform the
necessary preprocessing steps, then uses the input frames to train the
model. The output is an animated skeleton of human movement. The composite
dataset and associated results serve as the "ground truth" for the deep
model.
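For orientation, the pipeline the abstract describes maps closely onto the
standard DeepLabCut Python workflow. The following is a minimal sketch of
that workflow, not the authors' exact setup: the project name, experimenter,
video paths, and settings are hypothetical placeholders.

    import deeplabcut

    # Hypothetical project: names and video paths are placeholders.
    config = deeplabcut.create_new_project(
        "synthetic-skeleton",          # project name (hypothetical)
        "editor",                      # experimenter (hypothetical)
        ["videos/subject01.mp4"],      # frames rendered from the dataset
        copy_videos=True,
    )

    # Pick a diverse subset of frames for labeling (k-means clustering).
    deeplabcut.extract_frames(config, mode="automatic", algo="kmeans")

    # Labeling is interactive; afterwards, assemble the training set.
    deeplabcut.label_frames(config)
    deeplabcut.create_training_dataset(config)

    # Train the pose-estimation network and evaluate it on held-out frames.
    deeplabcut.train_network(config)
    deeplabcut.evaluate_network(config)

    # Run inference on new footage and export an animated-skeleton video.
    deeplabcut.analyze_videos(config, ["videos/subject02.mp4"])
    deeplabcut.create_labeled_video(
        config, ["videos/subject02.mp4"], draw_skeleton=True
    )

The skeleton drawn by create_labeled_video corresponds to the animated
skeleton output the abstract mentions, provided the skeleton edges are
defined in the project's config.yaml.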
Related papers
- DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity
Human-centric Rendering [126.00165445599764]
We present DNA-Rendering, a large-scale, high-fidelity repository of human performance data for neural actor rendering.
Our dataset contains over 1500 human subjects, 5000 motion sequences, and a data volume of 67.5M frames.
We construct a professional multi-view capture system of 60 synchronized cameras with a maximum resolution of 4096 x 3000, a capture speed of 15 fps, and rigorous camera calibration.
arXiv Detail & Related papers (2023-07-19T17:58:03Z)
- HabitatDyn Dataset: Dynamic Object Detection to Kinematics Estimation [16.36110033895749]
We propose the dataset HabitatDyn, which contains synthetic RGB videos, semantic labels, and depth information, as well as kinematics information.
HabitatDyn was created from the perspective of a mobile robot with a moving camera, and contains 30 scenes featuring six different types of moving objects with varying velocities.
arXiv Detail & Related papers (2023-04-21T09:57:35Z)
- Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats [80.12253291709673]
We propose a novel affine-combining autoencoder (ACAE) method to reduce the dimensionality of the landmark set.
Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model (see the affine-combination sketch after this list).
arXiv Detail & Related papers (2022-12-29T22:22:49Z)
- Adversarial Attention for Human Motion Synthesis [3.9378507882929563]
We present a novel method for controllable human motion synthesis by applying attention-based probabilistic deep adversarial models with end-to-end training.
We show that we can generate synthetic human motion over both short and long time horizons through the use of adversarial attention.
arXiv Detail & Related papers (2022-04-25T16:12:42Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via neural features (see the RGBD back-projection sketch after this list).
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
- Recovering 3D Human Mesh from Monocular Images: A Survey [49.00136388529404]
Estimating human pose and shape from monocular images is a long-standing problem in computer vision.
This survey focuses on the task of monocular 3D human mesh recovery.
arXiv Detail & Related papers (2022-03-03T18:56:08Z)
- HSPACE: Synthetic Parametric Humans Animated in Complex Environments [67.8628917474705]
We build a large-scale photo-realistic dataset, Human-SPACE, of animated humans placed in complex indoor and outdoor environments.
We combine a hundred diverse individuals of varying ages, genders, proportions, and ethnicities with hundreds of motions and scenes to generate an initial dataset of over 1 million frames.
Assets are generated automatically, at scale, and are compatible with existing real time rendering and game engines.
arXiv Detail & Related papers (2021-12-23T22:27:55Z)
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset, named REGRAD, to support the modeling of relationships among objects and grasps.
Our dataset is collected in both 2D image and 3D point cloud form.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z)
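As referenced in the ACAE entry above, the following is a minimal sketch of
the affine-combination idea, assuming only the summary's high-level
description: each latent keypoint is an affine combination (weights summing
to 1) of the input joints, and the decoder maps latent points back to joints
the same way. The sizes, random weights, and loss below are illustrative,
not the paper's.

    import numpy as np

    rng = np.random.default_rng(0)
    n_joints, n_latent = 28, 12            # hypothetical sizes

    def affine(w):
        # Normalize so each row sums to 1: every output point is an
        # affine combination of the input points.
        return w / w.sum(axis=-1, keepdims=True)

    W_enc = affine(rng.uniform(0.1, 1.0, size=(n_latent, n_joints)))
    W_dec = affine(rng.uniform(0.1, 1.0, size=(n_joints, n_latent)))

    pose = rng.normal(size=(n_joints, 3))  # one 3D skeleton, J x 3
    latent = W_enc @ pose                  # L x 3 latent keypoints
    recon = W_dec @ latent                 # J x 3 reconstruction

    # Training would fit the weights to minimize reconstruction error
    # across poses drawn from many skeleton formats.
    print(np.mean(np.abs(recon - pose)))

Because the weights sum to 1, the latent keypoints translate rigidly with
the skeleton, which is what makes this reduction geometry-aware across
skeleton formats.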
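Likewise, the first step in the RGBD view-synthesis entry, turning an RGBD
frame into a point cloud, is standard pinhole back-projection. Below is a
self-contained sketch with hypothetical camera intrinsics; the neural
rendering and depth-inpainting stages of that paper are outside its scope.

    import numpy as np

    def rgbd_to_pointcloud(depth, rgb, fx, fy, cx, cy):
        # Back-project every pixel with valid depth into camera coordinates.
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
        colors = rgb.reshape(-1, 3)
        valid = points[:, 2] > 0           # drop pixels with missing depth
        return points[valid], colors[valid]

    # Hypothetical 640x480 frame and intrinsics, for illustration only.
    depth = np.random.uniform(0.0, 5.0, size=(480, 640))
    rgb = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
    pts, cols = rgbd_to_pointcloud(depth, rgb, 525.0, 525.0, 319.5, 239.5)
    print(pts.shape, cols.shape)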
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.