HandFlow: Quantifying View-Dependent 3D Ambiguity in Two-Hand
Reconstruction with Normalizing Flow
- URL: http://arxiv.org/abs/2210.01692v1
- Date: Tue, 4 Oct 2022 15:42:22 GMT
- Title: HandFlow: Quantifying View-Dependent 3D Ambiguity in Two-Hand
Reconstruction with Normalizing Flow
- Authors: Jiayi Wang and Diogo Luvizon and Franziska Mueller and Florian Bernard
and Adam Kortylewski and Dan Casas and Christian Theobalt
- Abstract summary: We explicitly model the distribution of plausible reconstructions in a conditional normalizing flow framework.
We show that explicit ambiguity modeling is better suited to this challenging problem.
- Score: 73.7895717883622
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reconstructing two-hand interactions from a single image is a challenging
problem due to ambiguities that stem from projective geometry and heavy
occlusions. Existing methods are designed to estimate only a single pose,
despite the fact that there exist other valid reconstructions that fit the
image evidence equally well. In this paper we propose to address this issue by
explicitly modeling the distribution of plausible reconstructions in a
conditional normalizing flow framework. This allows us to directly supervise
the posterior distribution through a novel determinant magnitude
regularization, which is key to obtaining varied 3D hand pose samples that project well
into the input image. We also demonstrate that metrics commonly used to assess
reconstruction quality are insufficient to evaluate pose predictions under such
severe ambiguity. To address this, we release the first dataset with multiple
plausible annotations per image, called MultiHands. The additional annotations
enable us to evaluate the estimated distribution using the maximum mean
discrepancy metric. Through this, we demonstrate the quality of our
probabilistic reconstruction and show that explicit ambiguity modeling is
better suited to this challenging problem.
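The abstract describes training a conditional normalizing flow over hand poses with a determinant magnitude regularization term. The following is a minimal sketch of how such a loss might look, assuming a standard-normal base distribution and a flow that returns the log-determinant of its pose-to-latent mapping; the function names, the interface, and the exact regularizer form are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a conditional normalizing-flow
# training loss with a determinant-magnitude regularizer, as described in
# the abstract. The flow interface, the standard-normal base distribution,
# and the regularizer form/weight are assumptions for illustration.
import math
import torch

def flow_loss(flow, pose_gt, image_feat, lambda_reg=0.01):
    # Assumed interface: flow maps a ground-truth pose to a latent code z,
    # conditioned on image features, and returns log|det J| of that mapping.
    z, log_det = flow(pose_gt, image_feat)

    # Negative log-likelihood under a standard-normal base distribution:
    # log p(pose | image) = log N(z; 0, I) + log|det J|.
    log_pz = -0.5 * (z ** 2).sum(dim=-1) - 0.5 * z.shape[-1] * math.log(2 * math.pi)
    nll = -(log_pz + log_det)

    # Determinant-magnitude regularization (assumed form): penalizing a large
    # |log det J| discourages the flow from concentrating all density on a
    # single pose, which is what keeps the sampled hand poses varied.
    reg = log_det.abs()

    return (nll + lambda_reg * reg).mean()
```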
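The evaluation compares the estimated distribution against the multiple plausible annotations per image in MultiHands using maximum mean discrepancy (MMD). Below is a rough sketch of an MMD computation between pose samples and annotations; the RBF kernel and its bandwidth are illustrative choices, not the paper's exact evaluation protocol.

```python
# Minimal sketch (not the paper's evaluation code): maximum mean discrepancy
# between poses sampled from the model and the multiple plausible annotations
# available per image. Kernel and bandwidth are illustrative assumptions.
import numpy as np

def mmd_rbf(pred_samples, gt_annotations, bandwidth=1.0):
    # pred_samples: (n, d) poses sampled from the predicted distribution
    # gt_annotations: (m, d) plausible ground-truth poses for the same image
    def rbf(a, b):
        sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    # Biased (V-statistic) estimate of MMD^2.
    return (rbf(pred_samples, pred_samples).mean()
            + rbf(gt_annotations, gt_annotations).mean()
            - 2.0 * rbf(pred_samples, gt_annotations).mean())

# Example with random data: 100 model samples vs. 10 annotations of a 126-D
# two-hand pose vector (2 hands x 21 joints x 3 coordinates).
print(mmd_rbf(np.random.randn(100, 126), np.random.randn(10, 126)))
```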
Related papers
- OFER: Occluded Face Expression Reconstruction [16.06622406877353]
We introduce OFER, a novel approach for single image 3D face reconstruction that can generate plausible, diverse, and expressive 3D faces.
We propose a novel ranking mechanism that sorts the outputs of the shape diffusion network based on the predicted shape accuracy scores to select the best match.
arXiv Detail & Related papers (2024-10-29T00:21:26Z)
- DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models [5.908471365011943]
We propose DiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image.
We show that DiffPose slightly improves upon the state of the art for multi-hypothesis pose estimation for simple poses and outperforms it by a large margin for highly ambiguous poses.
arXiv Detail & Related papers (2022-11-29T18:55:13Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net, a common deep network backbone with two output heads corresponding to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold [47.31074799708132]
We introduce a method to estimate arbitrary, non-parametric distributions on SO(3).
Our key idea is to represent the distributions implicitly, with a neural network that estimates the probability given the input image and a candidate pose.
We achieve state-of-the-art performance on Pascal3D+ and ModelNet10-SO(3) benchmarks.
arXiv Detail & Related papers (2021-06-10T17:57:23Z)
- Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose Estimation [74.76155168705975]
Deep Bingham Networks (DBN) can handle pose-related uncertainties and ambiguities arising in almost all real-life applications concerning 3D data.
DBN extends state-of-the-art direct pose regression networks with a multi-hypothesis prediction head that can yield different distribution modes.
We propose new training strategies so as to avoid mode or posterior collapse during training and to improve numerical stability.
arXiv Detail & Related papers (2020-12-20T19:20:26Z)
- Weakly Supervised Generative Network for Multiple 3D Human Pose Hypotheses [74.48263583706712]
3D human pose estimation from a single image is an inverse problem due to the inherent ambiguity of the missing depth.
We propose a weakly supervised deep generative network to address the inverse problem.
arXiv Detail & Related papers (2020-08-13T09:26:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.