Body Knowledge and Uncertainty Modeling for Monocular 3D Human Body
Reconstruction
- URL: http://arxiv.org/abs/2308.00799v1
- Date: Tue, 1 Aug 2023 19:29:10 GMT
- Title: Body Knowledge and Uncertainty Modeling for Monocular 3D Human Body
Reconstruction
- Authors: Yufei Zhang, Hanjing Wang, Jeffrey O. Kephart, Qiang Ji
- Abstract summary: KNOWN exploits a comprehensive set of generic body constraints derived from well-established body knowledge.
KNOWN's body reconstruction outperforms prior weakly-supervised approaches.
- Score: 37.167714468046924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While 3D body reconstruction methods have made remarkable progress recently,
it remains difficult to acquire the sufficiently accurate and numerous 3D
supervisions required for training. In this paper, we propose KNOWN, a
framework that effectively utilizes body KNOWledge and
uNcertainty modeling to compensate for insufficient 3D supervisions.
KNOWN exploits a comprehensive set of generic body constraints derived from
well-established body knowledge. These generic constraints precisely and
explicitly characterize the reconstruction plausibility and enable 3D
reconstruction models to be trained without any 3D data. Moreover, existing
methods typically use images from multiple datasets during training, which can
result in data noise (e.g., inconsistent joint annotation) and data
imbalance (e.g., minority images representing unusual poses or
captured from challenging camera views). KNOWN solves these problems through a
novel probabilistic framework that models both aleatoric and epistemic
uncertainty. Aleatoric uncertainty is encoded in a robust Negative
Log-Likelihood (NLL) training loss, while epistemic uncertainty is used to
guide model refinement. Experiments demonstrate that KNOWN's body
reconstruction outperforms prior weakly-supervised approaches, particularly on
the challenging minority images.
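The two kinds of uncertainty named in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes an isotropic Gaussian over each 2D keypoint with a predicted log-variance for the aleatoric term, and uses disagreement across several hypothetical model predictions as a stand-in for epistemic uncertainty. The function names `gaussian_nll` and `epistemic_variance` are illustrative only.

```python
import numpy as np

def gaussian_nll(pred, target, log_var):
    """Per-joint Gaussian negative log-likelihood (constant term dropped).

    Assumption: one isotropic variance per joint, predicted as log-variance.
    A large predicted variance down-weights the squared error on that joint,
    which is what makes the loss robust to noisy annotations.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    dim = pred.shape[-1]                              # 2 for image keypoints
    sq_err = np.sum((pred - target) ** 2, axis=-1)    # shape (J,)
    return 0.5 * (np.exp(-log_var) * sq_err + dim * log_var)

def epistemic_variance(member_preds):
    """Per-joint variance across K predictions of shape (K, J, D).

    Assumption: the K predictions come from an ensemble or from repeated
    stochastic forward passes; high variance flags inputs the model is
    unsure about, such as rare poses.
    """
    member_preds = np.asarray(member_preds, dtype=float)
    return member_preds.var(axis=0).sum(axis=-1)

# Two joints; the second annotation is badly off (simulated label noise).
target = np.array([[0.0, 0.0], [1.0, 1.0]])
pred = np.array([[0.1, 0.0], [11.0, 1.0]])

# Same predictions, with and without a large predicted variance on joint 2.
confident = gaussian_nll(pred, target, log_var=np.zeros(2))
hedged = gaussian_nll(pred, target, log_var=np.array([0.0, np.log(50.0)]))
```

In this sketch the noisy joint contributes far less to the loss once the model assigns it a large variance, while the `dim * log_var` term keeps the model from inflating the variance everywhere.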
Related papers
- DeProPose: Deficiency-Proof 3D Human Pose Estimation via Adaptive Multi-View Fusion [57.83515140886807]
We introduce the task of Deficiency-Aware 3D Pose Estimation.
DeProPose is a flexible method that simplifies the network architecture to reduce training complexity.
We have developed a novel 3D human pose estimation dataset.
arXiv Detail & Related papers (2025-02-23T03:22:54Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- STRIDE: Single-video based Temporally Continuous Occlusion Robust 3D Pose Estimation [27.854074900345314]
We propose STRIDE, a novel Test-Time Training (TTT) approach to fit a human motion prior to each video.
Our framework demonstrates flexibility by being model-agnostic, allowing us to use any off-the-shelf 3D pose estimation method for improving robustness and temporal consistency.
We validate STRIDE's efficacy through comprehensive experiments on challenging datasets like Occluded Human3.6M, Human3.6M, and OCMotion.
arXiv Detail & Related papers (2023-12-24T11:05:10Z)
- Multi-view 3D Object Reconstruction and Uncertainty Modelling with Neural Shape Prior [9.716201630968433]
3D object reconstruction is important for semantic scene understanding.
It is challenging to reconstruct detailed 3D shapes from monocular images directly due to a lack of depth information, occlusion and noise.
We tackle this problem by leveraging a neural object representation which learns an object shape distribution from a large dataset of 3D object models and maps it into a latent space.
We propose a method to model uncertainty as part of the representation and define an uncertainty-aware encoder which generates latent codes with uncertainty directly from individual input images.
arXiv Detail & Related papers (2023-06-17T03:25:13Z)
- FIND: An Unsupervised Implicit 3D Model of Articulated Human Feet [27.85606375080643]
We present a high fidelity and articulated 3D human foot model.
The model is parameterised by a disentangled latent code in terms of shape, texture and articulated pose.
arXiv Detail & Related papers (2022-10-21T20:47:16Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Advancing 3D Medical Image Analysis with Variable Dimension Transform based Supervised 3D Pre-training [45.90045513731704]
This paper revisits an innovative yet simple fully-supervised 3D network pre-training framework.
With a redesigned 3D network architecture, reformulated natural images are used to address the problem of data scarcity.
Comprehensive experiments on four benchmark datasets demonstrate that the proposed pre-trained models can effectively accelerate convergence.
arXiv Detail & Related papers (2022-01-05T03:11:21Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people, given an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- PaMIR: Parametric Model-Conditioned Implicit Representation for Image-based Human Reconstruction [67.08350202974434]
We propose Parametric Model-Conditioned Implicit Representation (PaMIR), which combines the parametric body model with the free-form deep implicit function.
We show that our method achieves state-of-the-art performance for image-based 3D human reconstruction in the cases of challenging poses and clothing types.
arXiv Detail & Related papers (2020-07-08T02:26:19Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions.
Our proposed model employs three consecutive differentiable transformations: forward-kinematics, camera-projection, and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)