An Effective Deep Network for Head Pose Estimation without Keypoints
- URL: http://arxiv.org/abs/2210.13705v1
- Date: Tue, 25 Oct 2022 01:57:04 GMT
- Title: An Effective Deep Network for Head Pose Estimation without Keypoints
- Authors: Chien Thai and Viet Tran and Minh Bui and Huong Ninh and Hai Tran
- Abstract summary: We propose a lightweight model that effectively addresses the head pose estimation problem.
Our proposed model significantly improves the accuracy in comparison with the state-of-the-art head pose estimation methods.
Our model has the real-time speed of $\sim$300 FPS when inferring on Tesla V100.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Human head pose estimation has become an essential problem in facial
analysis in recent years, with many computer vision applications such as gaze
estimation, virtual reality, and driver assistance. Given the importance of
head pose estimation, a compact model is needed that reduces the computational
cost of deployment in facial analysis applications, such as large camera
surveillance systems and AI cameras, while maintaining accuracy. In this work, we propose a
lightweight model that effectively addresses the head pose estimation problem.
Our approach has two main steps. 1) We first train many teacher models on the
synthetic dataset 300W-LPA to get the head pose pseudo labels. 2) We design
an architecture with the ResNet18 backbone and train our proposed model with
the ensemble of these pseudo labels via the knowledge distillation process. To
evaluate the effectiveness of our model, we use AFLW-2000 and BIWI - two
real-world head pose datasets. Experimental results show that our proposed
model significantly improves the accuracy in comparison with the
state-of-the-art head pose estimation methods. Furthermore, our model has the
real-time speed of $\sim$300 FPS when inferring on Tesla V100.
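The two-step recipe in the abstract (teacher ensemble, then distillation into a student) can be illustrated with a minimal sketch. The abstract does not say how the teachers' pseudo labels are combined or which loss the student uses, so the plain averaging and mean squared error below are assumptions, and the function names `ensemble_pseudo_label` and `distillation_loss` are hypothetical. Averaging angles this way is only reasonable when predictions stay away from the wrap-around point.

```python
from statistics import fmean

def ensemble_pseudo_label(teacher_preds):
    """Average per-teacher (yaw, pitch, roll) predictions, in degrees.

    Assumption: a simple mean; the paper's actual ensembling rule may differ.
    """
    yaws, pitches, rolls = zip(*teacher_preds)
    return (fmean(yaws), fmean(pitches), fmean(rolls))

def distillation_loss(student_pred, pseudo_label):
    """Mean squared error between the student's angles and the pseudo label.

    Assumption: MSE stands in for whatever loss the authors actually use.
    """
    return fmean((s - t) ** 2 for s, t in zip(student_pred, pseudo_label))

# Three hypothetical teachers predicting (yaw, pitch, roll) for one image.
teacher_preds = [(10.0, -5.0, 2.0), (12.0, -7.0, 4.0), (14.0, -3.0, 0.0)]
pseudo = ensemble_pseudo_label(teacher_preds)

# A hypothetical student prediction for the same image.
student = (11.0, -5.0, 2.0)
loss = distillation_loss(student, pseudo)
```

In a real training loop the student (a ResNet18-based network, per the abstract) would be updated to minimize this loss over the whole dataset; the sketch only shows the label construction and the per-sample objective.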
Related papers
- Efficient Verification-Based Face Identification [50.616875565173274]
We study the problem of performing face verification with an efficient neural model $f$.
Our model leads to a substantially small $f$ requiring only 23k parameters and 5M floating point operations (FLOPS).
We use six face verification datasets to demonstrate that our method is on par or better than state-of-the-art models.
arXiv Detail & Related papers (2023-12-20T18:08:02Z)
- A Simple and Efficient Baseline for Data Attribution on Images [107.12337511216228]
Current state-of-the-art approaches require a large ensemble of as many as 300,000 models to accurately attribute model predictions.
In this work, we focus on a minimalist baseline, utilizing the feature space of a backbone pretrained via self-supervised learning to perform data attribution.
Our method is model-agnostic and scales easily to large datasets.
arXiv Detail & Related papers (2023-11-03T17:29:46Z)
- Robust Category-Level 3D Pose Estimation from Synthetic Data [17.247607850702558]
We introduce SyntheticP3D, a new synthetic dataset for object pose estimation generated from CAD models.
We propose a novel approach (CC3D) for training neural mesh models that perform pose estimation via inverse rendering.
arXiv Detail & Related papers (2023-05-25T14:56:03Z)
- ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation [76.35955924137986]
We show that a plain vision transformer with MAE pretraining can obtain superior performance after finetuning on human pose estimation datasets.
Our biggest ViTPose model based on the ViTAE-G backbone with 1 billion parameters obtains the best 80.9 mAP on the MS COCO test-dev set.
arXiv Detail & Related papers (2022-04-26T17:55:04Z)
- Zero-Shot Category-Level Object Pose Estimation [24.822189326540105]
We tackle the problem of estimating the pose of novel object categories in a zero-shot manner.
This extends much of the existing literature by removing the need for pose-labelled datasets or category-specific CAD models.
Our method provides a six-fold improvement in average rotation accuracy at 30 degrees.
arXiv Detail & Related papers (2022-04-07T17:58:39Z)
- Simultaneous face detection and 360 degree headpose estimation [0.0]
We propose the Multitask-Net model to leverage the features extracted from the face detection model.
Applying the multitask learning method, the Multitask-Net model can simultaneously predict the position and direction of the human head.
arXiv Detail & Related papers (2021-11-23T01:56:10Z)
- HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty [2.064612766965483]
We introduce a novel method to estimate the head pose of people in single images starting from a small set of head keypoints.
Our model is simple to implement and more efficient with respect to the state of the art.
arXiv Detail & Related papers (2021-11-02T08:55:45Z)
- When Liebig's Barrel Meets Facial Landmark Detection: A Practical Model [87.25037167380522]
We propose a model that is accurate, robust, efficient, generalizable, and end-to-end trainable.
To achieve better accuracy, we propose two lightweight modules.
DQInit dynamically initializes the queries of the decoder from the inputs, enabling the model to achieve accuracy as good as that of models with multiple decoder layers.
QAMem is designed to enhance the discriminative ability of queries on low-resolution feature maps by assigning separate memory values to each query rather than a shared one.
arXiv Detail & Related papers (2021-05-27T13:51:42Z)
- EfficientPose: Efficient Human Pose Estimation with Neural Architecture Search [47.30243595690131]
We propose an efficient framework targeted at human pose estimation including two parts, the efficient backbone and the efficient head.
Our smallest model has only 0.65 GFLOPs with 88.1% PCKh@0.5 on MPII and our large model has only 2 GFLOPs while its accuracy is competitive with the state-of-the-art large model.
arXiv Detail & Related papers (2020-12-13T15:38:38Z)
- Fast Uncertainty Quantification for Deep Object Pose Estimation [91.09217713805337]
Deep learning-based object pose estimators are often unreliable and overconfident.
In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose estimation.
arXiv Detail & Related papers (2020-11-16T06:51:55Z)