Simultaneous face detection and 360 degree headpose estimation
- URL: http://arxiv.org/abs/2111.11604v1
- Date: Tue, 23 Nov 2021 01:56:10 GMT
- Title: Simultaneous face detection and 360 degree headpose estimation
- Authors: Hoang Nguyen Viet, Linh Nguyen Viet, Tuan Nguyen Dinh, Duc Tran Minh,
Long Tran Quoc
- Abstract summary: We propose the Multitask-Net model to leverage the features extracted from the face detection model.
Applying the multitask learning method, the Multitask-Net model can simultaneously predict the position and direction of the human head.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face detection and head pose estimation on digital images have
many practical applications, including surveillance camera manufacturing and
customer behavior analysis, and they are attracting attention from many
researchers. Many proposed deep learning models achieve state-of-the-art
accuracy, such as YOLO, SSD, and MTCNN for face detection, or HopeNet,
FSA-Net, and RankPose for head pose estimation.
In many state-of-the-art methods, the pipeline for this task consists of two
stages: face detection followed by head pose estimation. These two stages are
completely independent and do not share information. This makes the pipeline
simple to set up but does not leverage most of the features extracted by each
model. In this paper, we propose the Multitask-Net model, motivated by the idea
of sharing the features extracted by the face detection model with the head
pose estimation branch to improve accuracy. Also, because the data is varied
and the range of Euler angles describing a face is large, our model can
predict poses over the full 360-degree Euler angle domain. Applying the
multitask learning method, the Multitask-Net model can simultaneously predict
the position and direction of the human head. To improve the model's ability to
predict head direction, we change the representation of the human face from
Euler angles to vectors of the rotation matrix.
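One way to see why a rotation-matrix representation can help over the full 360-degree range is the wrap-around of angles: yaws of 179 and -179 degrees describe nearly the same head direction, yet differ by 358 as raw numbers. The sketch below illustrates this for yaw only. The axis convention, the function names (`rot_y`, `frobenius_distance`, `yaw_from_matrix`), and the choice of Frobenius distance are assumptions made for illustration, not details taken from the paper.

```python
import math

def rot_y(deg):
    """Rotation matrix for a yaw of `deg` degrees about the vertical (y) axis.

    The axis convention here is an assumption for illustration only.
    """
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def frobenius_distance(A, B):
    """Frobenius norm of A - B for two 3x3 matrices given as nested lists."""
    return math.sqrt(sum((A[i][j] - B[i][j]) ** 2
                         for i in range(3) for j in range(3)))

def yaw_from_matrix(R):
    """Recover yaw in degrees over the full circle using atan2 on matrix entries."""
    return math.degrees(math.atan2(R[0][2], R[0][0]))

# Two nearly identical head directions on either side of the +/-180 boundary.
yaw_a, yaw_b = 179.0, -179.0

# As raw angles they look far apart ...
angle_gap = abs(yaw_a - yaw_b)

# ... but their rotation matrices are close, so a regression target built
# from matrix entries has no discontinuity at the boundary.
matrix_gap = frobenius_distance(rot_y(yaw_a), rot_y(yaw_b))

print("raw angle gap:", angle_gap)
print("rotation matrix gap:", matrix_gap)
print("recovered yaw for -135 degrees:", yaw_from_matrix(rot_y(-135.0)))
```

Because `atan2` uses both the sine and cosine entries, the yaw can be read back unambiguously across the whole circle, which a single `asin` or `acos` could not do.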
Related papers
- SPARK: Self-supervised Personalized Real-time Monocular Face Capture [6.093606972415841]
Current state of the art approaches have the ability to regress parametric 3D face models in real-time across a wide range of identities.
We propose a method for high-precision 3D face capture taking advantage of a collection of unconstrained videos of a subject as prior information.
arXiv Detail & Related papers (2024-09-12T12:30:04Z)
- HINT: Learning Complete Human Neural Representations from Limited Viewpoints [69.76947323932107]
We propose a NeRF-based algorithm able to learn a detailed and complete human model from limited viewing angles.
As a result, our method can reconstruct complete humans even from a few viewing angles, increasing performance by more than 15% PSNR.
arXiv Detail & Related papers (2024-05-30T05:43:09Z)
- Task-adaptive Q-Face [75.15668556061772]
We propose a novel task-adaptive multi-task face analysis method named Q-Face.
Q-Face simultaneously performs multiple face analysis tasks with a unified model.
Our method achieves state-of-the-art performance on face expression recognition, action unit detection, face attribute analysis, age estimation, and face pose estimation.
arXiv Detail & Related papers (2024-05-15T03:13:11Z)
- FaceXFormer: A Unified Transformer for Facial Analysis [59.94066615853198]
FaceXFormer is an end-to-end unified transformer model for a range of facial analysis tasks.
Our model effectively handles images "in-the-wild," demonstrating its robustness and generalizability across eight different tasks.
arXiv Detail & Related papers (2024-03-19T17:58:04Z)
- SwinFace: A Multi-task Transformer for Face Recognition, Expression Recognition, Age Estimation and Attribute Estimation [60.94239810407917]
This paper presents a multi-purpose algorithm for simultaneous face recognition, facial expression recognition, age estimation, and face attribute estimation based on a single Swin Transformer.
To address the conflicts among multiple tasks, a Multi-Level Channel Attention (MLCA) module is integrated into each task-specific analysis.
Experiments show that the proposed model has a better understanding of the face and achieves excellent performance for all tasks.
arXiv Detail & Related papers (2023-08-22T15:38:39Z)
- An Effective Deep Network for Head Pose Estimation without Keypoints [0.0]
We propose a lightweight model that effectively addresses the head pose estimation problem.
Our proposed model significantly improves the accuracy in comparison with the state-of-the-art head pose estimation methods.
Our model has the real-time speed of $\sim$300 FPS when inferring on Tesla V100.
arXiv Detail & Related papers (2022-10-25T01:57:04Z)
- Weakly-Supervised Multi-Face 3D Reconstruction [45.864415499303405]
We propose an effective end-to-end framework for multi-face 3D reconstruction.
We employ the same global camera model for the reconstructed faces in each image, which makes it possible to recover the relative head positions and orientations in the 3D scene.
arXiv Detail & Related papers (2021-01-06T13:15:21Z)
- Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement [63.853412753242615]
Learning a good 3D human pose representation is important for human pose related tasks.
We propose a novel Siamese denoising autoencoder to learn a 3D pose representation.
Our approach achieves state-of-the-art performance on two inherently different tasks.
arXiv Detail & Related papers (2020-07-14T14:25:22Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.