EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for
Monocular Object Pose Estimation
- URL: http://arxiv.org/abs/2303.12787v3
- Date: Sun, 17 Dec 2023 08:30:49 GMT
- Authors: Hansheng Chen, Wei Tian, Pichao Wang, Fan Wang, Lu Xiong, Hao Li
- Abstract summary: Locating 3D objects from a single RGB image via Perspective-n-Point (PnP) is a long-standing problem in computer vision.
EPro-PnP can enhance existing correspondence networks, closing the gap between PnP-based methods and the task-specific leaders on the LineMOD 6DoF pose estimation benchmark.
- Score: 30.212903535850874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Locating 3D objects from a single RGB image via Perspective-n-Point (PnP) is
a long-standing problem in computer vision. Driven by end-to-end deep learning,
recent studies suggest interpreting PnP as a differentiable layer, allowing for
partial learning of 2D-3D point correspondences by backpropagating the
gradients of pose loss. Yet, learning the entire correspondences from scratch
is highly challenging, particularly for ambiguous pose solutions, where the
globally optimal pose is theoretically non-differentiable w.r.t. the points. In
this paper, we propose the EPro-PnP, a probabilistic PnP layer for general
end-to-end pose estimation, which outputs a distribution of pose with
differentiable probability density on the SE(3) manifold. The 2D-3D coordinates
and corresponding weights are treated as intermediate variables learned by
minimizing the KL divergence between the predicted and target pose
distribution. The underlying principle generalizes previous approaches, and
resembles the attention mechanism. EPro-PnP can enhance existing correspondence
networks, closing the gap between PnP-based method and the task-specific
leaders on the LineMOD 6DoF pose estimation benchmark. Furthermore, EPro-PnP
helps to explore new possibilities of network design, as we demonstrate a novel
deformable correspondence network with the state-of-the-art pose accuracy on
the nuScenes 3D object detection benchmark. Our code is available at
https://github.com/tjiiv-cprg/EPro-PnP-v2.
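The loss described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes the pose density is proportional to exp(-cost), where cost is half the sum of squared weighted reprojection errors, restricts the pose to translation with identity rotation, and estimates the normalizer by uniform Monte Carlo sampling over a box. All function names and numbers below are hypothetical.

```python
import math
import random


def project(point, translation, focal=1.0):
    """Pinhole projection of a translated 3D point (rotation fixed to identity)."""
    x, y, z = (p + t for p, t in zip(point, translation))
    return focal * x / z, focal * y / z


def cost(pts3d, pts2d, weights, translation):
    """Half the sum of squared weighted reprojection errors at one pose."""
    total = 0.0
    for X, x, w in zip(pts3d, pts2d, weights):
        u, v = project(X, translation)
        total += (w * (u - x[0])) ** 2 + (w * (v - x[1])) ** 2
    return 0.5 * total


def kl_style_loss(pts3d, pts2d, weights, t_gt, samples, volume):
    """cost(t_gt) + log of a Monte Carlo estimate of the integral of exp(-cost)."""
    neg_costs = [-cost(pts3d, pts2d, weights, t) for t in samples]
    m = max(neg_costs)  # log-sum-exp shift for numerical stability
    log_mean = m + math.log(sum(math.exp(c - m) for c in neg_costs) / len(samples))
    return cost(pts3d, pts2d, weights, t_gt) + math.log(volume) + log_mean


# Hypothetical toy data: four corners of a square, seen from a known translation.
rng = random.Random(0)
pts3d = [(-0.5, -0.5, 0.0), (0.5, -0.5, 0.0), (0.5, 0.5, 0.0), (-0.5, 0.5, 0.0)]
t_gt = (0.1, -0.2, 4.0)
pts2d = [project(X, t_gt) for X in pts3d]  # noise-free correspondences

# Uniform samples over a box of candidate translations (depth kept positive).
lo, hi = (-1.0, -1.0, 2.0), (1.0, 1.0, 6.0)
samples = [tuple(rng.uniform(a, b) for a, b in zip(lo, hi)) for _ in range(2000)]
volume = 2.0 * 2.0 * 4.0

loss_confident = kl_style_loss(pts3d, pts2d, [20.0] * 4, t_gt, samples, volume)
loss_vague = kl_style_loss(pts3d, pts2d, [1.0] * 4, t_gt, samples, volume)
print(loss_confident, loss_vague)
```

With correct correspondences, larger weights concentrate the density around the true pose, so the confident setting yields the lower loss. A trainable version would compute the same quantity in an automatic-differentiation framework so that gradients reach the 2D-3D coordinates and weights.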
Related papers
- CheckerPose: Progressive Dense Keypoint Localization for Object Pose
Estimation with Graph Neural Network [66.24726878647543]
Estimating the 6-DoF pose of a rigid object from a single RGB image is a crucial yet challenging task.
Recent studies have shown the great potential of dense correspondence-based solutions.
We propose a novel pose estimation algorithm named CheckerPose, which improves on three main aspects.
arXiv Detail & Related papers (2023-03-29T17:30:53Z)
- Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis
Aggregation [64.874000550443]
A Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed.
The proposed JPMA assembles multiple hypotheses generated by D3DP into a single 3D pose for practical use.
Our method outperforms the state-of-the-art deterministic and probabilistic approaches by 1.5% and 8.9%, respectively.
arXiv Detail & Related papers (2023-03-21T04:00:47Z)
- Linear-Covariance Loss for End-to-End Learning of 6D Pose Estimation [64.12149365530624]
Most modern image-based 6D object pose estimation methods learn to predict 2D-3D correspondences, from which the pose can be obtained using a solver.
Here, we argue that this conflicts with the averaging nature of the problem, leading to gradients that may encourage the network to degrade its accuracy.
arXiv Detail & Related papers (2023-03-21T00:32:31Z)
- EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for
Monocular Object Pose Estimation [22.672080094222082]
Locating 3D objects from a single RGB image via Perspective-n-Points is a long-standing problem in computer vision.
Recent studies suggest interpreting PnP as a differentiable layer, so that 2D-3D point correspondences can be partly learned by backpropagating the gradient w.r.t. the object pose.
Yet learning the entire set of 2D-3D points from scratch fails to converge with existing approaches.
arXiv Detail & Related papers (2022-03-24T17:59:49Z)
- Beyond Weak Perspective for Monocular 3D Human Pose Estimation [6.883305568568084]
We consider the task of predicting 3D joint locations and orientations from a monocular video.
We first infer 2D joint locations with an off-the-shelf pose estimation algorithm.
We then adhere to the SMPLify algorithm which receives those initial parameters.
arXiv Detail & Related papers (2020-09-14T16:23:14Z)
- Weakly Supervised Generative Network for Multiple 3D Human Pose
Hypotheses [74.48263583706712]
3D human pose estimation from a single image is an inverse problem due to the inherent ambiguity of the missing depth.
We propose a weakly supervised deep generative network to address the inverse problem.
arXiv Detail & Related papers (2020-08-13T09:26:01Z)
- Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [52.94078950641959]
We present a deployment-friendly, fast bottom-up framework for multi-person 3D human pose estimation.
We adopt a novel neural representation of multi-person 3D pose which unifies the position of person instances with their corresponding 3D pose representation.
We propose a practical deployment paradigm where paired 2D or 3D pose annotations are unavailable.
arXiv Detail & Related papers (2020-08-04T07:54:25Z)
- Solving the Blind Perspective-n-Point Problem End-To-End With Robust
Differentiable Geometric Optimization [44.85008070868851]
Blind Perspective-n-Point is the problem of estimating the position of a camera relative to a scene.
We propose the first fully end-to-end trainable network for solving the blind geometric problem efficiently and globally.
arXiv Detail & Related papers (2020-07-29T06:35:45Z)
- Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point
Problem [98.92148855291363]
This paper proposes a deep CNN model which simultaneously solves for both the 6-DoF absolute camera pose and the 2D-3D correspondences.
Tests on both real and simulated data have shown that our method substantially outperforms existing approaches.
arXiv Detail & Related papers (2020-03-15T04:17:30Z)