Reducing Training Demands for 3D Gait Recognition with Deep Koopman
Operator Constraints
- URL: http://arxiv.org/abs/2308.07468v1
- Date: Mon, 14 Aug 2023 21:39:33 GMT
- Title: Reducing Training Demands for 3D Gait Recognition with Deep Koopman
Operator Constraints
- Authors: Cole Hill, Mauricio Pamplona Segundo, Sudeep Sarkar
- Abstract summary: We introduce a new Linear Dynamical Systems (LDS) module and loss based on Koopman operator theory, which provides an unsupervised motion regularization for the periodic nature of gait.
We also show that our 3D modeling approach is much better than other 3D gait approaches in overcoming viewpoint variation under normal, bag-carrying and clothing change conditions.
- Score: 8.382355998881879
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep learning research has made many biometric recognition
solutions viable, but they require vast amounts of training data to achieve
real-world generalization. Unlike
other biometric traits, such as face and ear, gait samples cannot be easily
crawled from the web to form massive unconstrained datasets. As the human body
has been extensively studied for different digital applications, one can rely
on prior shape knowledge to overcome data scarcity. This work follows the
recent trend of fitting a 3D deformable body model into gait videos using deep
neural networks to obtain disentangled shape and pose representations for each
frame. To enforce temporal consistency in the network, we introduce a new
Linear Dynamical Systems (LDS) module and loss based on Koopman operator
theory, which provides an unsupervised motion regularization for the periodic
nature of gait, as well as a predictive capacity for extending gait sequences.
We compare LDS to the traditional adversarial training approach and use the USF
HumanID and CASIA-B datasets to show that LDS can obtain better accuracy with
less training data. Finally, we also show that our 3D modeling approach is much
better than other 3D gait approaches in overcoming viewpoint variation under
normal, bag-carrying and clothing change conditions.
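The LDS module can be pictured as learning a single linear operator K that advances per-frame pose embeddings one time step, in the spirit of Koopman operator theory: for a periodic signal like gait, a well-chosen embedding should evolve (near-)linearly, so the one-step prediction error doubles as an unsupervised motion regularizer. The sketch below is a minimal, hypothetical PyTorch rendition based only on the abstract, not the authors' code; the module name KoopmanLDS, the learned linear operator, and the rollout helper are all assumptions.

```python
import torch
import torch.nn as nn

class KoopmanLDS(nn.Module):
    """Hypothetical sketch of a Koopman-style LDS regularizer (not the paper's code).

    Given per-frame pose embeddings z_1..z_T, learn a linear operator K so
    that z_{t+1} ~= K z_t, and penalize the one-step prediction error.
    """

    def __init__(self, dim: int):
        super().__init__()
        # Learnable finite-dimensional approximation of the Koopman operator.
        self.K = nn.Linear(dim, dim, bias=False)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, T, dim) sequence of pose embeddings from the video encoder.
        pred = self.K(z[:, :-1])                 # predict z_{t+1} from z_t
        return ((pred - z[:, 1:]) ** 2).mean()   # unsupervised LDS loss

    @torch.no_grad()
    def extend(self, z_last: torch.Tensor, steps: int) -> torch.Tensor:
        # Roll K forward to extrapolate future frames, illustrating the
        # predictive capacity for extending gait sequences noted above.
        out, z = [], z_last
        for _ in range(steps):
            z = self.K(z)
            out.append(z)
        return torch.stack(out, dim=1)           # (batch, steps, dim)
```

In training, such a loss would simply be added to the body-model fitting objective; because it needs no labels, it acts as the unsupervised temporal-consistency term the abstract describes, playing the role the paper otherwise assigns to an adversarial critic.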
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments consistently demonstrates our method's superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation [42.2398858786125]
Deep learning in computer vision has achieved great success at the price of large-scale labeled training data.
The uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist.
To circumvent these issues, an alternative is to generate synthetic data via 3D rendering with domain randomization.
arXiv Detail & Related papers (2023-03-16T09:03:52Z)
- Recovering 3D Human Mesh from Monocular Images: A Survey [49.00136388529404]
Estimating human pose and shape from monocular images is a long-standing problem in computer vision.
This survey focuses on the task of monocular 3D human mesh recovery.
arXiv Detail & Related papers (2022-03-03T18:56:08Z)
- Advancing 3D Medical Image Analysis with Variable Dimension Transform based Supervised 3D Pre-training [45.90045513731704]
This paper revisits an innovative yet simple fully-supervised 3D network pre-training framework.
With a redesigned 3D network architecture, reformulated natural images are used to address the problem of data scarcity.
Comprehensive experiments on four benchmark datasets demonstrate that the proposed pre-trained models can effectively accelerate convergence.
arXiv Detail & Related papers (2022-01-05T03:11:21Z)
- LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Cascaded deep monocular 3D human pose estimation with evolutionary training data [76.3478675752847]
Deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation.
This paper proposes a novel data augmentation method that is scalable for massive amount of training data.
Our method synthesizes unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge.
arXiv Detail & Related papers (2020-06-14T03:09:52Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.