Mathematical Foundation and Corrections for Full Range Head Pose Estimation
- URL: http://arxiv.org/abs/2403.18104v2
- Date: Fri, 3 May 2024 22:50:41 GMT
- Title: Mathematical Foundation and Corrections for Full Range Head Pose Estimation
- Authors: Huei-Chung Hu, Xuyang Wu, Yuan Wang, Yi Fang, Hsin-Tai Wu
- Abstract summary: It is well-known that rotation matrices depend on coordinate systems, and yaw, roll, and pitch angles are sensitive to their application order.
In this paper, we thoroughly examine the Euler angles defined in the 300W-LP dataset and in head pose estimation methods such as 3DDFA-v2, 6D-RepNet, and WHENet, as well as the validity of their routines for drawing the Euler angles.
When necessary, we infer their coordinate system and the sequence of yaw, roll, and pitch from the provided code.
- Score: 16.697345330120744
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Numerous works concerning head pose estimation (HPE) offer algorithms or propose neural network-based approaches for extracting Euler angles either from facial key points or directly from images of the head region. However, many works fail to provide clear definitions of the coordinate systems and the Euler or Tait-Bryan angle orders in use. It is a well-known fact that rotation matrices depend on coordinate systems, and yaw, roll, and pitch angles are sensitive to their application order. Without precise definitions, it becomes challenging to validate the correctness of the output head pose and of the drawing routines employed in prior works. In this paper, we thoroughly examine the Euler angles defined in the 300W-LP dataset and in head pose estimation methods such as 3DDFA-v2, 6D-RepNet, and WHENet, as well as the validity of their routines for drawing the Euler angles. When necessary, we infer their coordinate system and the sequence of yaw, roll, and pitch from the provided code. This paper presents (1) code and algorithms for inferring the coordinate system from provided source code, code for the Euler angle application order, and code for extracting precise rotation matrices and Euler angles, (2) code and algorithms for converting poses from one rotation system to another, (3) novel formulae for 2D augmentations of the rotation matrices, and (4) derivations and code for the correct drawing routines for rotation matrices and poses. This paper also addresses the feasibility of defining rotations with the right-handed coordinate system used by Wikipedia and SciPy, which makes Euler angle extraction much easier for full-range head pose research.
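The abstract's point that yaw, roll, and pitch are sensitive to their application order can be illustrated with a minimal sketch. It is not the paper's code. It assumes a right-handed coordinate system, active rotations on column vectors, and a hypothetical axis assignment (pitch about x, yaw about y, roll about z); the function names are illustrative only.

```python
import math

# Elementary right-handed rotations (active, column-vector convention).
def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# ASSUMPTION: pitch about x, yaw about y, roll about z.
def compose_zyx(pitch, yaw, roll):
    # R = Rz(roll) * Ry(yaw) * Rx(pitch)
    return matmul(rot_z(roll), matmul(rot_y(yaw), rot_x(pitch)))

def compose_xyz(pitch, yaw, roll):
    # R = Rx(pitch) * Ry(yaw) * Rz(roll): same angles, reversed order
    return matmul(rot_x(pitch), matmul(rot_y(yaw), rot_z(roll)))

def extract_zyx(r):
    # Inverse of compose_zyx. The asin branch limits yaw to [-90, 90] degrees;
    # full-range handling is not attempted in this sketch.
    yaw = math.asin(max(-1.0, min(1.0, -r[2][0])))
    pitch = math.atan2(r[2][1], r[2][2])
    roll = math.atan2(r[1][0], r[0][0])
    return pitch, yaw, roll

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

pitch, yaw, roll = (math.radians(d) for d in (20.0, 30.0, 10.0))
r_zyx = compose_zyx(pitch, yaw, roll)
r_xyz = compose_xyz(pitch, yaw, roll)

# The same three angles give different matrices under different orders.
order_gap = max_abs_diff(r_zyx, r_xyz)

# Extraction recovers the angles only when the assumed order matches.
recovered = extract_zyx(r_zyx)
mismatched = extract_zyx(r_xyz)

print("largest matrix entry gap:", order_gap)
print("recovered (deg):", [math.degrees(a) for a in recovered])
print("mismatched (deg):", [math.degrees(a) for a in mismatched])
```

Applying `extract_zyx` to a matrix built with a different order returns plausible-looking but wrong angles, which is why the coordinate system and angle sequence need to be stated explicitly before poses from different sources are compared.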
Related papers
- Full-range Head Pose Geometric Data Augmentations [2.8358100463599722]
Many head pose estimation (HPE) methods promise the ability to create full-range datasets.
These methods are accurate only within a limited range of head angles; exceeding this range leads to significant inaccuracies.
Here, we present methods that accurately infer the correct coordinate system and Euler angles in the correct axis-sequence.
arXiv Detail & Related papers (2024-08-02T20:41:18Z)
- CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network [66.24726878647543]
Estimating the 6-DoF pose of a rigid object from a single RGB image is a crucial yet challenging task.
Recent studies have shown the great potential of dense correspondence-based solutions.
We propose a novel pose estimation algorithm named CheckerPose, which improves on three main aspects.
arXiv Detail & Related papers (2023-03-29T17:30:53Z)
- Category-Level 6D Object Pose Estimation with Flexible Vector-Based Rotation Representation [51.67545893892129]
We propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images.
We first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning.
Then, to efficiently decode the rotation information from the latent feature, we design a novel flexible vector-based decomposable rotation representation.
arXiv Detail & Related papers (2022-12-09T02:13:43Z)
- An Intuitive and Unconstrained 2D Cube Representation for Simultaneous Head Detection and Pose Estimation [24.04477340811483]
We present a novel single-stage key-based method via an intuitive and unconstrained 2D cube representation for joint head detection and pose estimation.
Our method achieves comparable results with other representative methods on the AFLW2000 and BIWI datasets.
arXiv Detail & Related papers (2022-12-07T13:28:50Z)
- A tutorial on $\mathbf{SE}(3)$ transformation parameterizations and on-manifold optimization [0.0]
An arbitrary rigid transformation in $\mathbf{SE}(3)$ can be separated into two parts, namely, a translation and a rigid rotation.
This report reviews, under a unifying viewpoint, three common alternatives to representing the rotation part.
It describes (i) the equivalence between these representations and the formulas for transforming one into another.
arXiv Detail & Related papers (2021-03-29T22:43:49Z)
- GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation [71.83992173720311]
6D pose estimation from a single RGB image is a fundamental task in computer vision.
We propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner.
Our approach remarkably outperforms state-of-the-art methods on LM, LM-O and YCB-V datasets.
arXiv Detail & Related papers (2021-02-24T09:11:31Z)
- An Analysis of SVD for Deep Rotation Estimation [63.97835949897361]
We present a theoretical analysis that shows SVD is the natural choice for projecting onto the rotation group.
Our analysis shows that simply replacing existing representations with the SVD orthogonalization procedure obtains state-of-the-art performance in many deep learning applications.
arXiv Detail & Related papers (2020-06-25T17:58:28Z)
- Calculating Pose with Vanishing Points of Visual-Sphere Perspective Model [0.0]
The goal of the proposed method is to directly obtain a pose matrix of a known rectangular target, without estimation.
This method is specifically tailored for real-time, extreme imaging setups exceeding 180° field of view, such as a fish-eye camera view.
arXiv Detail & Related papers (2020-04-19T18:39:08Z)
- HP2IFS: Head Pose estimation exploiting Partitioned Iterated Function Systems [18.402636415604373]
Estimating the actual head orientation from 2D images is a well-known problem.
We use fractal coding theory and Partitioned Iterated Function Systems (PIFS) to extract the fractal code from the input head image.
The proposed PIFS-based head pose estimation method provides accurate yaw/pitch/roll angular values.
arXiv Detail & Related papers (2020-03-25T17:56:45Z)
- SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting 1D Occupancy Segments From 2D Coordinates [61.04823927283092]
We propose to represent 3D shapes using 2D functions, where the output of the function at each 2D location is a sequence of line segments inside the shape.
We implement this approach using a Seq2Seq model with attention, called SeqXY2SeqZ, which learns the mapping from a sequence of 2D coordinates along two arbitrary axes to a sequence of 1D locations along the third axis.
Our experiments show that SeqXY2SeqZ outperforms the state-of-the-art methods under widely used benchmarks.
arXiv Detail & Related papers (2020-03-12T00:24:36Z)
- PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling [103.09504572409449]
We propose a novel deep neural network based method, called PUGeo-Net, to generate uniform dense point clouds.
Thanks to its geometry-centric nature, PUGeo-Net works well for both CAD models with sharp features and scanned models with rich geometric details.
arXiv Detail & Related papers (2020-02-24T14:13:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.