A Vector-based Representation to Enhance Head Pose Estimation
- URL: http://arxiv.org/abs/2010.07184v2
- Date: Tue, 8 Dec 2020 21:44:34 GMT
- Title: A Vector-based Representation to Enhance Head Pose Estimation
- Authors: Zhiwen Cao, Zongcheng Chu, Dongfang Liu, Yingjie Chen
- Abstract summary: This paper proposes to use the three vectors in a rotation matrix as the representation in head pose estimation.
We develop a new neural network based on the characteristic of such representation.
- Score: 4.329951775163721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes to use the three vectors in a rotation matrix as the
representation in head pose estimation and develops a new neural network based
on the characteristic of such representation. We address two potential issues
in current head pose estimation work: 1. Public datasets for head pose
estimation use either Euler angles or quaternions to annotate data samples.
However, both of these annotations have the issue of discontinuity and thus
could result in some performance issues in neural network training. 2. Most
research works report Mean Absolute Error (MAE) of Euler angles as the
measurement of performance. We show that MAE may not reflect the actual
behavior especially for the cases of profile views. To solve these two
problems, we propose a new annotation method which uses three vectors to
describe head poses and a new measurement Mean Absolute Error of Vectors (MAEV)
to assess the performance. We also train a new neural network to predict the
three vectors with the constraints of orthogonality. Our proposed method
achieves state-of-the-art results on both AFLW2000 and BIWI datasets.
Experiments show our vector-based annotation method can effectively reduce
prediction errors for large pose angles.
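The contrast the abstract draws between Euler-angle MAE and a vector-based error can be illustrated with a small NumPy sketch. This is not the authors' code. The Euler convention (`R = Rz(roll) @ Ry(yaw) @ Rx(pitch)`), the choice of the matrix columns as the "three vectors", the definition of MAEV as the mean angle between corresponding vectors, and the SVD-based `orthonormalize` helper are all assumptions made here for illustration; the abstract does not specify any of them.

```python
import numpy as np


def euler_to_matrix(pitch, yaw, roll):
    """Rotation matrix from Euler angles in degrees.

    Assumed convention (not taken from the paper): R = Rz(roll) @ Ry(yaw) @ Rx(pitch).
    """
    p, y, r = np.radians([pitch, yaw, roll])
    rx = np.array([[1, 0, 0],
                   [0, np.cos(p), -np.sin(p)],
                   [0, np.sin(p), np.cos(p)]])
    ry = np.array([[np.cos(y), 0, np.sin(y)],
                   [0, 1, 0],
                   [-np.sin(y), 0, np.cos(y)]])
    rz = np.array([[np.cos(r), -np.sin(r), 0],
                   [np.sin(r), np.cos(r), 0],
                   [0, 0, 1]])
    return rz @ ry @ rx


def euler_mae(e_pred, e_true):
    """Mean absolute error over the three Euler angles, in degrees."""
    return float(np.mean(np.abs(np.subtract(e_pred, e_true))))


def maev(r_pred, r_true):
    """Assumed MAEV: mean angle (degrees) between corresponding column vectors."""
    cos = np.sum(r_pred * r_true, axis=0)  # column-wise dot products
    cos = np.clip(cos, -1.0, 1.0)          # guard arccos against rounding
    return float(np.degrees(np.arccos(cos)).mean())


def orthonormalize(m):
    """Project a 3x3 matrix onto the nearest rotation matrix via SVD.

    A generic way to restore orthogonality; the paper enforces it during
    training, and how it does so is not described in the abstract.
    """
    u, _, vt = np.linalg.svd(m)
    d = np.sign(np.linalg.det(u @ vt))     # keep a right-handed frame
    return u @ np.diag([1.0, 1.0, d]) @ vt


# Two different Euler triples (pitch, yaw, roll) at a profile view (yaw = 90).
# Under the assumed convention they describe the same rotation.
e_a = (0.0, 90.0, 0.0)
e_b = (40.0, 90.0, 40.0)
R_a = euler_to_matrix(*e_a)
R_b = euler_to_matrix(*e_b)

print("Euler MAE (deg):", euler_mae(e_a, e_b))
print("MAEV (deg):     ", maev(R_a, R_b))

# A noisy prediction is generally not orthogonal; project it back.
rng = np.random.default_rng(0)
noisy = R_a + 0.05 * rng.standard_normal((3, 3))
R_fix = orthonormalize(noisy)
print("MAEV after projection (deg):", maev(R_fix, R_a))
```

At yaw = 90 degrees under this convention, the rotation depends only on the difference between pitch and roll, so the two triples yield the same matrix even though their Euler angles differ by 40 degrees in two components. The Euler MAE is large while the vector-based error is essentially zero, which is the kind of mismatch at profile views that the abstract points to.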
Related papers
- Occlusion Handling in 3D Human Pose Estimation with Perturbed Positional Encoding [15.834419910916933]
We propose a novel positional encoding technique, PerturbPE, that extracts consistent and regular components from the eigenbasis.
Our results support our theoretical findings, e.g. our experimental analysis observed a performance enhancement of up to 12% on the Human3.6M dataset.
Our novel approach significantly enhances performance in scenarios where two edges are missing, setting a new benchmark for state-of-the-art.
arXiv Detail & Related papers (2024-05-27T17:48:54Z)
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier in an alternative optimization manner to shift the bias decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Category-Level 6D Object Pose Estimation with Flexible Vector-Based Rotation Representation [51.67545893892129]
We propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images.
We first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning.
Then, to efficiently decode the rotation information from the latent feature, we design a novel flexible vector-based decomposable rotation representation.
arXiv Detail & Related papers (2022-12-09T02:13:43Z)
- An Intuitive and Unconstrained 2D Cube Representation for Simultaneous Head Detection and Pose Estimation [24.04477340811483]
We present a novel single-stage key-based method via an intuitive and unconstrained 2D cube representation for joint head detection and pose estimation.
Our method achieves comparable results with other representative methods on the AFLW2000 and BIWI datasets.
arXiv Detail & Related papers (2022-12-07T13:28:50Z)
- Perspective-1-Ellipsoid: Formulation, Analysis and Solutions of the Camera Pose Estimation Problem from One Ellipse-Ellipsoid Correspondence [1.7188280334580193]
We introduce an ellipsoid-specific theoretical framework and demonstrate its beneficial properties in the context of pose estimation.
We show that the proposed formalism makes it possible to reduce the pose estimation problem to a position-only or orientation-only estimation problem.
arXiv Detail & Related papers (2022-08-26T09:15:20Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- On Triangulation as a Form of Self-Supervision for 3D Human Pose Estimation [57.766049538913926]
Supervised approaches to 3D pose estimation from single images are remarkably effective when labeled data is abundant.
Much of the recent attention has shifted towards semi-supervised and/or weakly supervised learning.
We propose to impose multi-view geometrical constraints by means of a differentiable triangulation and to use it as a form of self-supervision during training when no labels are available.
arXiv Detail & Related papers (2022-03-29T19:11:54Z)
- 6D Rotation Representation For Unconstrained Head Pose Estimation [2.1485350418225244]
We address the problem of ambiguous rotation labels by introducing the rotation matrix formalism for our ground truth data.
This way, our method can learn the full rotation appearance, in contrast to previous approaches that restrict the pose prediction to a narrow angle range.
Experiments on the public AFLW2000 and BIWI datasets demonstrate that our proposed method significantly outperforms other state-of-the-art methods by up to 20%.
arXiv Detail & Related papers (2022-02-25T08:41:13Z)
- HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty [2.064612766965483]
We introduce a novel method to estimate the head pose of people in single images starting from a small set of head keypoints.
Our model is simple to implement and more efficient with respect to the state of the art.
arXiv Detail & Related papers (2021-11-02T08:55:45Z)
- Improving Aspect-based Sentiment Analysis with Gated Graph Convolutional Networks and Syntax-based Regulation [89.38054401427173]
Aspect-based Sentiment Analysis (ABSA) seeks to predict the sentiment polarity of a sentence toward a specific aspect.
Dependency trees can be integrated into deep learning models to produce state-of-the-art performance for ABSA.
We propose a novel graph-based deep learning model to overcome these two issues.
arXiv Detail & Related papers (2020-10-26T07:36:24Z)