EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for
Monocular Object Pose Estimation
- URL: http://arxiv.org/abs/2203.13254v1
- Date: Thu, 24 Mar 2022 17:59:49 GMT
- Title: EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for
Monocular Object Pose Estimation
- Authors: Hansheng Chen, Pichao Wang, Fan Wang, Wei Tian, Lu Xiong, Hao Li
- Abstract summary: Locating 3D objects from a single RGB image via Perspective-n-Points is a long-standing problem in computer vision.
Recent studies suggest interpreting PnP as a differentiable layer so that 2D-3D point
correspondences can be partly learned by backpropagating the gradient w.r.t. the object pose.
Yet learning the entire set of unrestricted 2D-3D points from scratch fails to converge with existing approaches.
- Score: 22.672080094222082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Locating 3D objects from a single RGB image via Perspective-n-Points (PnP) is
a long-standing problem in computer vision. Driven by end-to-end deep learning,
recent studies suggest interpreting PnP as a differentiable layer, so that
2D-3D point correspondences can be partly learned by backpropagating the
gradient w.r.t. object pose. Yet, learning the entire set of unrestricted 2D-3D
points from scratch fails to converge with existing approaches, since the
deterministic pose is inherently non-differentiable. In this paper, we propose
the EPro-PnP, a probabilistic PnP layer for general end-to-end pose estimation,
which outputs a distribution of pose on the SE(3) manifold, essentially
bringing categorical Softmax to the continuous domain. The 2D-3D coordinates
and corresponding weights are treated as intermediate variables learned by
minimizing the KL divergence between the predicted and target pose
distribution. The underlying principle unifies the existing approaches and
resembles the attention mechanism. EPro-PnP significantly outperforms
competitive baselines, closing the gap between PnP-based methods and the
task-specific leaders on the LineMOD 6DoF pose estimation and nuScenes 3D
object detection benchmarks.
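The loss described in the abstract can be made concrete with a small sketch. The following is a minimal, hypothetical Python/PyTorch illustration (not the authors' implementation): the pose distribution is taken as p(y|X) proportional to exp(-f(y)), with f(y) the weighted reprojection cost, so the KL loss against a (near-)Dirac target pose reduces to f(y_gt) plus a log-normalizer, approximated here by importance sampling over pose hypotheses. All function names, shapes, and the proposal distribution are assumptions.

```python
# Minimal, hypothetical sketch (not the authors' code) of a Monte Carlo
# approximation of the KL pose loss: loss = f(y_gt) + log Z, where
# Z = integral of exp(-f(y)) dy is estimated from K sampled pose hypotheses.
import math
import torch

def monte_carlo_kl_pose_loss(cost_gt, cost_samples, log_q_samples):
    """
    cost_gt:       () tensor, reprojection cost f(y_gt) at the ground-truth pose
    cost_samples:  (K,) tensor, costs f(y_k) at K sampled pose hypotheses
    log_q_samples: (K,) tensor, log proposal densities log q(y_k) of the samples
    """
    k = cost_samples.shape[0]
    # log Z ~= log( (1/K) * sum_k exp(-f(y_k)) / q(y_k) )
    log_z = torch.logsumexp(-cost_samples - log_q_samples, dim=0) - math.log(k)
    return cost_gt + log_z

# Toy usage with made-up costs; in EPro-PnP the costs would depend on the
# predicted 2D-3D coordinates and weights, so gradients reach those variables.
cost_gt = torch.tensor(1.2, requires_grad=True)
cost_samples = (torch.rand(64) * 5.0).requires_grad_()
loss = monte_carlo_kl_pose_loss(cost_gt, cost_samples, torch.zeros(64))
loss.backward()
```

In this reading, minimizing the loss lowers the cost at the ground-truth pose while raising it elsewhere, which is what makes the layer behave like a Softmax carried over to the continuous SE(3) domain.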
Related papers
- EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for
Monocular Object Pose Estimation [30.212903535850874]
Locating 3D objects from a single RGB image via Perspective-n-Point is a long-standing problem in computer vision.
EPro-PnP can enhance existing correspondence networks, closing the gap between PnP-based methods and the task-specific leaders on the LineMOD 6DoF pose estimation benchmark.
arXiv Detail & Related papers (2023-03-22T17:57:36Z) - Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis
Aggregation [64.874000550443]
A Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed.
The proposed JPMA assembles multiple hypotheses generated by D3DP into a single 3D pose for practical use.
Our method outperforms the state-of-the-art deterministic and probabilistic approaches by 1.5% and 8.9%, respectively.
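The summary only names the aggregation step, so the following is a rough, hypothetical sketch of one plausible reading of joint-wise reprojection-based aggregation: each 3D hypothesis is projected to the image and, for every joint independently, the hypothesis whose reprojection lies closest to the detected 2D keypoint is kept. The pinhole projection and all names are assumptions, not the paper's code.

```python
# Rough, hypothetical sketch of joint-wise reprojection-based aggregation:
# project every hypothesis to 2D and keep, per joint, the hypothesis closest
# to the detected 2D keypoint. Names and the projection model are assumptions.
import numpy as np

def aggregate_hypotheses(hyps_3d, kpts_2d, K):
    """
    hyps_3d: (H, J, 3) camera-space 3D joint hypotheses
    kpts_2d: (J, 2) detected 2D keypoints
    K:       (3, 3) camera intrinsics
    Returns a (J, 3) pose assembled joint-by-joint from the best hypotheses.
    """
    proj = hyps_3d @ K.T                              # (H, J, 3)
    proj = proj[..., :2] / proj[..., 2:3]             # perspective divide -> (H, J, 2)
    err = np.linalg.norm(proj - kpts_2d, axis=-1)     # (H, J) reprojection errors
    best = err.argmin(axis=0)                         # best hypothesis index per joint
    return hyps_3d[best, np.arange(hyps_3d.shape[1])]
```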
arXiv Detail & Related papers (2023-03-21T04:00:47Z) - Linear-Covariance Loss for End-to-End Learning of 6D Pose Estimation [64.12149365530624]
Most modern image-based 6D object pose estimation methods learn to predict 2D-3D correspondences, from which the pose can be obtained using a solver.
Here, we argue that this conflicts with the averaging nature of the problem, leading to gradients that may encourage the network to degrade accuracy.
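For context, the correspondence-then-solver pipeline referred to above can be sketched in a few lines with a generic PnP solver such as OpenCV's cv2.solvePnP; the correspondences and intrinsics below are placeholders rather than network predictions.

```python
# Minimal sketch of the correspondence-then-solver pipeline: a network would
# predict 2D-3D correspondences, and a PnP solver recovers the 6D pose.
# The points and intrinsics here are placeholders for illustration only.
import cv2
import numpy as np

obj_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                    [1, 1, 0], [1, 0, 1]], dtype=np.float64)    # 3D model points
img_pts = np.array([[320, 240], [400, 238], [318, 170], [322, 300],
                    [398, 168], [402, 302]], dtype=np.float64)  # predicted 2D points
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, distCoeffs=None)
R, _ = cv2.Rodrigues(rvec)   # rotation matrix; with tvec this is the 6D pose
```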
arXiv Detail & Related papers (2023-03-21T00:32:31Z) - Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose
Estimation [63.199549837604444]
3D human pose estimation approaches leverage different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision.
We cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target.
We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-05T03:52:57Z) - Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z) - Beyond Weak Perspective for Monocular 3D Human Pose Estimation [6.883305568568084]
We consider the task of predicting 3D joint locations and orientations from a monocular video.
We first infer 2D joint locations with an off-the-shelf pose estimation algorithm.
We then apply the SMPLify algorithm, which takes those initial parameters as input.
arXiv Detail & Related papers (2020-09-14T16:23:14Z) - Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [52.94078950641959]
We present a deployment-friendly, fast bottom-up framework for multi-person 3D human pose estimation.
We adopt a novel neural representation of multi-person 3D pose which unifies the position of person instances with their corresponding 3D pose representation.
We propose a practical deployment paradigm where paired 2D or 3D pose annotations are unavailable.
arXiv Detail & Related papers (2020-08-04T07:54:25Z) - Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point
Problem [98.92148855291363]
This paper proposes a deep CNN model which simultaneously solves for both the 6-DoF absolute camera pose and the 2D-3D correspondences.
Tests on both real and simulated data have shown that our method substantially outperforms existing approaches.
arXiv Detail & Related papers (2020-03-15T04:17:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.