Full-range Head Pose Geometric Data Augmentations
- URL: http://arxiv.org/abs/2408.01566v1
- Date: Fri, 2 Aug 2024 20:41:18 GMT
- Title: Full-range Head Pose Geometric Data Augmentations
- Authors: Huei-Chung Hu, Xuyang Wu, Haowei Liu, Ting-Ruen Wei, Hsin-Tai Wu,
- Abstract summary: Many head pose estimation (HPE) methods promise the ability to create full-range datasets.
These methods are only accurate within a limited range of head angles; exceeding this range leads to significant inaccuracies.
Here, we present methods that accurately infer the correct coordinate system and Euler angles in the correct axis-sequence.
- Score: 2.8358100463599722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many head pose estimation (HPE) methods promise the ability to create full-range datasets, theoretically allowing the estimation of the rotation and positioning of the head from various angles. However, these methods are only accurate within a limited range of head angles; exceeding this range leads to significant inaccuracies. This is mainly explained by the unclear specification of the coordinate systems and Euler angles used in the foundational rotation matrix calculations. Here, we address these limitations by presenting (1) methods that accurately infer the correct coordinate system and Euler angles in the correct axis-sequence, (2) novel formulae for 2D geometric augmentations of the rotation matrices under the (SPECIFIC) coordinate system, (3) derivations for the correct drawing routines for rotation matrices and poses, and (4) mathematical experimentation and verification that allow proper pitch-yaw coverage for full-range head pose dataset generation. Applying our augmentation techniques to existing head pose estimation methods demonstrated a significant improvement in model performance. Code will be released upon paper acceptance.
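The abstract's point about axis-sequence can be illustrated with a minimal sketch. It is not the paper's method: the axis assignments (pitch about x, yaw about y, roll about z), the two sequences compared, and all function names below are assumptions chosen only for illustration. It shows that the same three angles produce different rotation matrices when the elementary rotations are composed in a different order.

```python
import math

def rot_x(a):
    """Elementary rotation about the x-axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    """Elementary rotation about the y-axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    """Elementary rotation about the z-axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# Assumed angles (degrees); the yaw is deliberately outside the frontal range.
pitch, yaw, roll = (math.radians(d) for d in (30.0, 100.0, 20.0))

# Two hypothetical conventions built from the same three angles.
r_xyz = matmul(matmul(rot_x(pitch), rot_y(yaw)), rot_z(roll))
r_zyx = matmul(matmul(rot_z(roll), rot_y(yaw)), rot_x(pitch))

# The two matrices differ, so labels are only meaningful with a stated sequence.
print(max_abs_diff(r_xyz, r_zyx) > 1e-3)
```

Under these assumptions, decoding a matrix built with one sequence using the formulas of the other would generally return different pitch, yaw, and roll values, which is the kind of ambiguity the abstract describes.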
Related papers
- 3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction [50.07071392673984]
Existing methods learn 3D rotations parametrized in the spatial domain using angles or quaternions.
We propose a frequency-domain approach that directly predicts Wigner-D coefficients for 3D rotation regression.
Our method achieves state-of-the-art results on benchmarks such as ModelNet10-SO(3) and PASCAL3D+.
arXiv Detail & Related papers (2024-11-01T12:50:38Z)
- Mathematical Foundation and Corrections for Full Range Head Pose Estimation [16.697345330120744]
It is well-known that rotation matrices depend on coordinate systems, and yaw, roll, and pitch angles are sensitive to their application order.
In this paper, we thoroughly examine the Euler angles defined in the 300W-LP dataset and in head pose estimation methods such as 3DDFA-v2, 6D-RepNet, and WHENet, as well as the validity of their drawing routines for the Euler angles.
When necessary, we infer their coordinate system and sequence of yaw, roll, pitch from provided code.
arXiv Detail & Related papers (2024-03-26T21:04:18Z)
- Towards Robust and Unconstrained Full Range of Rotation Head Pose Estimation [2.915868985330569]
We present a novel method for unconstrained end-to-end head pose estimation.
We propose a continuous 6D rotation matrix representation for efficient and robust direct regression.
Our method significantly outperforms other state-of-the-art methods in an efficient and robust manner.
arXiv Detail & Related papers (2023-09-14T12:17:38Z)
- Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction [82.72686460985297]
We tackle the problem of estimating a Manhattan frame.
We derive two new 2-line solvers, one of which does not suffer from singularities affecting existing solvers.
We also design a new non-minimal method, running on an arbitrary number of lines, to boost the performance in local optimization.
arXiv Detail & Related papers (2023-08-21T13:03:25Z)
- Category-Level 6D Object Pose Estimation with Flexible Vector-Based Rotation Representation [51.67545893892129]
We propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images.
We first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning.
Then, to efficiently decode the rotation information from the latent feature, we design a novel flexible vector-based decomposable rotation representation.
arXiv Detail & Related papers (2022-12-09T02:13:43Z)
- Detecting Rotated Objects as Gaussian Distributions and Its 3-D Generalization [81.29406957201458]
Existing detection methods commonly use a parameterized bounding box (BBox) to model and detect (horizontal) objects.
We argue that such a mechanism has fundamental limitations in building an effective regression loss for rotation detection.
We propose to model the rotated objects as Gaussian distributions.
We extend our approach from 2-D to 3-D with a tailored algorithm design to handle the heading estimation.
arXiv Detail & Related papers (2022-09-22T07:50:48Z)
- 6D Rotation Representation For Unconstrained Head Pose Estimation [2.1485350418225244]
We address the problem of ambiguous rotation labels by introducing the rotation matrix formalism for our ground truth data.
This way, our method can learn the full rotation appearance, in contrast to previous approaches that restrict the pose prediction to a narrow range of angles.
Experiments on the public AFLW2000 and BIWI datasets demonstrate that our proposed method significantly outperforms other state-of-the-art methods by up to 20%.
arXiv Detail & Related papers (2022-02-25T08:41:13Z)
- Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels.
We show that SGP achieves state-of-the-art performance that is on par with or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z)
- GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation [71.83992173720311]
6D pose estimation from a single RGB image is a fundamental task in computer vision.
We propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner.
Our approach remarkably outperforms state-of-the-art methods on LM, LM-O and YCB-V datasets.
arXiv Detail & Related papers (2021-02-24T09:11:31Z)
- Calculating Pose with Vanishing Points of Visual-Sphere Perspective Model [0.0]
The goal of the proposed method is to directly obtain a pose matrix of a known rectangular target, without estimation.
This method is specifically tailored for real-time, extreme imaging setups exceeding a 180° field of view, such as a fish-eye camera view.
arXiv Detail & Related papers (2020-04-19T18:39:08Z)
- HP2IFS: Head Pose estimation exploiting Partitioned Iterated Function Systems [18.402636415604373]
Estimating the actual head orientation from 2D images is a well-known problem.
We use fractal coding theory and Partitioned Iterated Function Systems (PIFS) to extract the fractal code from the input head image.
The proposed PIFS-based head pose estimation method provides accurate yaw/pitch/roll angular values.
arXiv Detail & Related papers (2020-03-25T17:56:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.