A Real-Time Multi-Task Learning System for Joint Detection of Face,
Facial Landmark and Head Pose
- URL: http://arxiv.org/abs/2309.11773v1
- Date: Thu, 21 Sep 2023 04:15:26 GMT
- Authors: Qingtian Wu and Liming Zhang
- Abstract summary: Extreme head postures pose a common challenge across a spectrum of facial analysis tasks.
This paper focuses on the integration of these tasks, particularly when addressing the complexities posed by large-angle face poses.
The primary contribution of this study is the proposal of a real-time multi-task detection system.
- Score: 3.661587008381534
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Extreme head postures pose a common challenge across a spectrum of facial
analysis tasks, including face detection, facial landmark detection (FLD), and
head pose estimation (HPE). These tasks are interdependent, where accurate FLD
relies on robust face detection, and HPE is intricately associated with these
key points. This paper focuses on the integration of these tasks, particularly
when addressing the complexities posed by large-angle face poses. The primary
contribution of this study is the proposal of a real-time multi-task detection
system capable of simultaneously performing joint detection of faces, facial
landmarks, and head poses. This system builds upon the widely adopted YOLOv8
detection framework. It extends the original object detection head by
incorporating an additional landmark regression head, enabling efficient
localization of crucial facial landmarks. Furthermore, we conduct optimizations
and enhancements on various modules within the original YOLOv8 framework. To
validate the effectiveness and real-time performance of our proposed model, we
conduct extensive experiments on 300W-LP and AFLW2000-3D datasets. The results
obtained verify the capability of our model to tackle large-angle face pose
challenges while delivering real-time performance across these interconnected
tasks.
Related papers
- Task-adaptive Q-Face [75.15668556061772]
We propose a novel task-adaptive multi-task face analysis method named Q-Face.
Q-Face simultaneously performs multiple face analysis tasks with a unified model.
Our method achieves state-of-the-art performance on face expression recognition, action unit detection, face attribute analysis, age estimation, and face pose estimation.
arXiv Detail & Related papers (2024-05-15T03:13:11Z)
- Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation [4.779050216649159]
This paper introduces a novel approach to address these challenges through the development of a knowledge distillation method.
Our goal is to design models capable of accurately locating facial landmarks under varying conditions.
This method was successfully implemented and placed 6th out of 165 participants in the IEEE ICME 2024 PAIR competition.
arXiv Detail & Related papers (2024-04-09T05:30:58Z)
- FaceXFormer: A Unified Transformer for Facial Analysis [59.94066615853198]
FaceXFormer is an end-to-end unified transformer model for a range of facial analysis tasks.
Our model effectively handles images "in-the-wild," demonstrating its robustness and generalizability across eight different tasks.
arXiv Detail & Related papers (2024-03-19T17:58:04Z)
- Faceptor: A Generalist Model for Face Perception [52.8066001012464]
Faceptor adopts a well-designed single-encoder dual-decoder architecture.
Introducing Layer-Attention into Faceptor enables the model to adaptively select features from optimal layers to perform the desired tasks.
Our training framework can also be applied to auxiliary supervised learning, significantly improving performance in data-sparse tasks such as age estimation and expression recognition.
arXiv Detail & Related papers (2024-03-14T15:42:31Z)
- SHIELD : An Evaluation Benchmark for Face Spoofing and Forgery Detection
with Multimodal Large Language Models [63.946809247201905]
We introduce a new benchmark, namely SHIELD, to evaluate the ability of MLLMs on face spoofing and forgery detection.
We design true/false and multiple-choice questions to evaluate multimodal face data in these two face security tasks.
The results indicate that MLLMs hold substantial potential in the face security domain.
arXiv Detail & Related papers (2024-02-06T17:31:36Z)
- CLERA: A Unified Model for Joint Cognitive Load and Eye Region Analysis
in the Wild [18.79132232751083]
Real-time analysis of the dynamics of the eye region allows us to monitor humans' visual attention allocation and estimate their mental state.
We propose CLERA, which achieves precise keypoint detection and temporal tracking in a joint-learning framework.
We also introduce a large-scale dataset of 30k human faces with joint pupil, eye-openness, and landmark annotation.
arXiv Detail & Related papers (2023-06-26T21:20:23Z)
- The Devil is in the Task: Exploiting Reciprocal Appearance-Localization
Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all the monocular 3D object detectors in the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z)
- Robust and Precise Facial Landmark Detection by Self-Calibrated Pose
Attention Network [73.56802915291917]
We propose a semi-supervised framework to achieve more robust and precise facial landmark detection.
A Boundary-Aware Landmark Intensity (BALI) field is proposed to model more effective facial shape constraints.
A Self-Calibrated Pose Attention (SCPA) model is designed to provide a self-learned objective function that enforces intermediate supervision.
arXiv Detail & Related papers (2021-12-23T02:51:08Z)
- Towards a Real-Time Facial Analysis System [13.649384403827359]
We present a system-level design of a real-time facial analysis system.
With a collection of deep neural networks for object detection, classification, and regression, the system recognizes age, gender, facial expression, and facial similarity for each person that appears in the camera view.
Results on a common off-the-shelf architecture show that the system's accuracy is comparable to state-of-the-art methods, and that its recognition speed satisfies real-time requirements.
arXiv Detail & Related papers (2021-09-21T18:27:15Z)
- An Efficient Multitask Neural Network for Face Alignment, Head Pose
Estimation and Face Tracking [9.39854778804018]
We propose an efficient multitask face alignment, face tracking and head pose estimation network (ATPN).
ATPN achieves improved performance compared to previous state-of-the-art methods while having fewer parameters and FLOPs.
arXiv Detail & Related papers (2021-03-13T04:41:15Z)
- Deep Active Shape Model for Face Alignment and Pose Estimation [0.2148535041822524]
The Active Shape Model (ASM) is a statistical model of object shapes that represents a target structure.
This paper presents a lightweight Convolutional Neural Network (CNN) architecture with a loss function regularized by ASM for face alignment and estimating head pose in the wild.
arXiv Detail & Related papers (2021-02-27T03:46:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.