FIND: An Unsupervised Implicit 3D Model of Articulated Human Feet
- URL: http://arxiv.org/abs/2210.12241v1
- Date: Fri, 21 Oct 2022 20:47:16 GMT
- Title: FIND: An Unsupervised Implicit 3D Model of Articulated Human Feet
- Authors: Oliver Boyne, James Charles, Roberto Cipolla
- Abstract summary: We present a high fidelity and articulated 3D human foot model.
The model is parameterised by a disentangled latent code in terms of shape, texture and articulated pose.
- Score: 27.85606375080643
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we present a high fidelity and articulated 3D human foot model.
The model is parameterised by a disentangled latent code in terms of shape,
texture and articulated pose. While high fidelity models are typically created
with strong supervision such as 3D keypoint correspondences or
pre-registration, we focus on the difficult case of little to no annotation. To
this end, we make the following contributions: (i) we develop a Foot Implicit
Neural Deformation field model, named FIND, capable of tailoring explicit
meshes at any resolution, i.e. suitable for both low- and high-powered devices; (ii) an approach
for training our model in various modes of weak supervision with progressively
better disentanglement as more labels, such as pose categories, are provided;
(iii) a novel unsupervised part-based loss for fitting our model to 2D images
which is better than traditional photometric or silhouette losses; (iv)
finally, we release a new dataset of high resolution 3D human foot scans,
Foot3D. On this dataset, we show our model outperforms a strong PCA
implementation trained on the same data in terms of shape quality and part
correspondences, and that our novel unsupervised part-based loss improves
inference on images.
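The abstract describes an implicit neural deformation field conditioned on disentangled shape, texture and pose codes. As a rough illustration only (the paper's actual FIND architecture, dimensions and texture head are not specified here), a field of this kind can be sketched as a small MLP that maps a template surface point plus the latent codes to a displaced 3D point; all names, sizes and parameters below are hypothetical:

```python
def dense(x, weights, biases, relu=True):
    """One fully connected layer; `weights` is a list of rows."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

def deformation_field(point, shape_code, pose_code, params):
    """Hypothetical FIND-style field: map a point on a template foot
    surface, plus disentangled shape and pose codes, to a displaced
    3D point. The real model also predicts texture; omitted here."""
    x = list(point) + list(shape_code) + list(pose_code)
    h = dense(x, params["W1"], params["b1"], relu=True)
    offset = dense(h, params["W2"], params["b2"], relu=False)
    return [p + d for p, d in zip(point, offset)]

# Toy parameters: all-zero weights make the field the identity, so the
# output mesh stays at the template; training would fit these weights.
D_IN, D_HID = 3 + 2 + 2, 4   # xyz + 2-D shape code + 2-D pose code
params = {
    "W1": [[0.0] * D_IN for _ in range(D_HID)], "b1": [0.0] * D_HID,
    "W2": [[0.0] * D_HID for _ in range(3)],    "b2": [0.0] * 3,
}
print(deformation_field([0.1, 0.2, 0.3], [0.5, -0.5], [1.0, 0.0], params))
# -> [0.1, 0.2, 0.3]
```

Because the field is queried per point, an explicit mesh can be extracted at any chosen resolution by evaluating it on a denser or sparser set of template points, which is what makes the "any resolution" claim above possible.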
Related papers
- Personalized 3D Human Pose and Shape Refinement [19.082329060985455]
Regression-based methods have dominated the field of 3D human pose and shape estimation.
We propose to construct dense correspondences between initial human model estimates and the corresponding images.
We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.
arXiv Detail & Related papers (2024-03-18T10:13:53Z)
- ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation [54.86887812687023]
Most 3D-HPE methods rely on regression models, which assume a one-to-one mapping between inputs and outputs.
We propose ManiPose, a novel manifold-constrained multi-hypothesis model capable of proposing multiple candidate 3D poses for each 2D input.
Unlike previous multi-hypothesis approaches, our solution is completely supervised and does not rely on complex generative models.
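Multi-hypothesis pose estimators like the one summarized above are commonly scored with a best-of-K protocol: the reported error is that of the candidate pose closest to the ground truth. This is a sketch of that evaluation idea, not of ManiPose's architecture; the toy skeleton and values are invented:

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error between two 3D poses."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def best_of_k(hypotheses, gt):
    """Best-of-K evaluation: score a multi-hypothesis predictor by its
    hypothesis closest to the ground-truth pose."""
    return min(mpjpe(h, gt) for h in hypotheses)

gt = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]        # toy 2-joint skeleton
hyps = [
    [(0.0, 0.0, 0.5), (0.0, 1.0, 0.5)],        # off by 0.5 everywhere
    [(0.0, 0.0, 0.1), (0.0, 1.0, 0.0)],        # nearly correct
]
print(best_of_k(hyps, gt))  # -> 0.05 (the closer hypothesis wins)
```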
arXiv Detail & Related papers (2023-12-11T13:50:10Z)
- IT3D: Improved Text-to-3D Generation with Explicit View Synthesis [71.68595192524843]
This study presents a novel strategy that leverages explicitly synthesized multi-view images to address these issues.
Our approach involves the utilization of image-to-image pipelines, empowered by LDMs, to generate posed high-quality images.
For the incorporated discriminator, the synthesized multi-view images are considered real data, while the renderings of the optimized 3D models function as fake data.
arXiv Detail & Related papers (2023-08-22T14:39:17Z)
- Deformable Model-Driven Neural Rendering for High-Fidelity 3D Reconstruction of Human Heads Under Low-View Settings [20.07788905506271]
Reconstructing 3D human heads in low-view settings presents technical challenges.
We propose geometry decomposition and adopt a two-stage, coarse-to-fine training strategy.
Our method outperforms existing neural rendering approaches in terms of reconstruction accuracy and novel view synthesis under low-view settings.
arXiv Detail & Related papers (2023-03-24T08:32:00Z)
- 3D Equivariant Molecular Graph Pretraining [42.957880677779556]
We tackle 3D molecular pretraining in a complete and novel sense.
We first propose to adopt an equivariant energy-based model as the backbone for pretraining, which enjoys the merit of fulfilling the symmetry of 3D space.
We evaluate our model pretrained from a large-scale 3D dataset GEOM-QM9 on two challenging 3D benchmarks: MD17 and QM9.
arXiv Detail & Related papers (2022-07-18T16:26:24Z)
- Learned Vertex Descent: A New Direction for 3D Human Model Fitting [64.04726230507258]
We propose a novel optimization-based paradigm for 3D human model fitting on images and scans.
Our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement compared to state-of-the-art.
LVD is also applicable to 3D model fitting of humans and hands, for which we show a significant improvement to the SOTA with a much simpler and faster method.
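The summary above contrasts learned vertex descent with classical optimization-based fitting: instead of minimizing an energy by gradient descent, a network predicts per-vertex displacements toward the target surface and the vertices are updated iteratively. This is a hedged sketch of that update loop only; the `predict_step` stand-in below is a toy function, not LVD's learned predictor:

```python
def fit_vertices(vertices, predict_step, iters=10):
    """Hypothetical LVD-style loop: `predict_step` stands in for a
    network that predicts, per vertex, a displacement toward the
    target surface; vertices are updated iteratively."""
    for _ in range(iters):
        vertices = [[v + d for v, d in zip(vert, predict_step(vert))]
                    for vert in vertices]
    return vertices

# Toy "network": move each vertex a fixed fraction toward a target
# point, standing in for the learned per-vertex direction predictor.
target = [1.0, 2.0, 3.0]
step = lambda v: [0.5 * (t - vi) for t, vi in zip(target, v)]
print(fit_vertices([[0.0, 0.0, 0.0]], step, iters=20))  # converges toward target
```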
arXiv Detail & Related papers (2022-05-12T17:55:51Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people from an input RGB image.
We rely on GHUM, a recently introduced, expressive full-body statistical 3D human model, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions.
Our proposed model employs three consecutive differentiable transformations: forward kinematics, camera projection and spatial-map transformation.
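Of the three differentiable transformations named in this summary, camera projection is the easiest to make concrete. Below is a minimal pinhole-projection sketch under assumed intrinsics (`f`, `cx`, `cy` are placeholder values, and the forward-kinematics and spatial-map stages are omitted); it is not the paper's exact formulation:

```python
def project(point_3d, f=1.0, cx=0.0, cy=0.0):
    """Differentiable pinhole camera projection: maps a 3D point in
    camera coordinates to 2D image coordinates. One stage of the
    forward-kinematics -> camera-projection -> spatial-map pipeline."""
    x, y, z = point_3d
    return (f * x / z + cx, f * y / z + cy)

print(project((2.0, 4.0, 2.0)))  # -> (1.0, 2.0)
```

Because the mapping is a simple quotient of coordinates, gradients flow through it, which is what lets the 2D reprojection supervise the 3D pose without paired labels.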
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.