Task-adaptive Q-Face
- URL: http://arxiv.org/abs/2405.09059v1
- Date: Wed, 15 May 2024 03:13:11 GMT
- Title: Task-adaptive Q-Face
- Authors: Haomiao Sun, Mingjie He, Shiguang Shan, Hu Han, Xilin Chen,
- Abstract summary: We propose a novel task-adaptive multi-task face analysis method named as Q-Face.
Q-Face simultaneously performs multiple face analysis tasks with a unified model.
Our method achieves state-of-the-art performance on face expression recognition, action unit detection, face attribute analysis, age estimation, and face pose estimation.
- Score: 75.15668556061772
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although face analysis has achieved remarkable improvements in the past few years, designing a multi-task face analysis model is still challenging. Most face analysis tasks are studied as separate problems and do not benefit from the synergy among related tasks. In this work, we propose a novel task-adaptive multi-task face analysis method named as Q-Face, which simultaneously performs multiple face analysis tasks with a unified model. We fuse the features from multiple layers of a large-scale pre-trained model so that the whole model can use both local and global facial information to support multiple tasks. Furthermore, we design a task-adaptive module that performs cross-attention between a set of query vectors and the fused multi-stage features and finally adaptively extracts desired features for each face analysis task. Extensive experiments show that our method can perform multiple tasks simultaneously and achieves state-of-the-art performance on face expression recognition, action unit detection, face attribute analysis, age estimation, and face pose estimation. Compared to conventional methods, our method opens up new possibilities for multi-task face analysis and shows the potential for both accuracy and efficiency.
Related papers
- FaceXFormer: A Unified Transformer for Facial Analysis [59.94066615853198]
FaceXformer is an end-to-end unified transformer model for a range of facial analysis tasks.
Our model effectively handles images "in-the-wild," demonstrating its robustness and generalizability across eight different tasks.
arXiv Detail & Related papers (2024-03-19T17:58:04Z) - Faceptor: A Generalist Model for Face Perception [52.8066001012464]
Faceptor is proposed to adopt a well-designed single-encoder dual-decoder architecture.
Layer-Attention into Faceptor enables the model to adaptively select features from optimal layers to perform the desired tasks.
Our training framework can also be applied to auxiliary supervised learning, significantly improving performance in data-sparse tasks such as age estimation and expression recognition.
arXiv Detail & Related papers (2024-03-14T15:42:31Z) - SwinFace: A Multi-task Transformer for Face Recognition, Expression
Recognition, Age Estimation and Attribute Estimation [60.94239810407917]
This paper presents a multi-purpose algorithm for simultaneous face recognition, facial expression recognition, age estimation, and face attribute estimation based on a single Swin Transformer.
To address the conflicts among multiple tasks, a Multi-Level Channel Attention (MLCA) module is integrated into each task-specific analysis.
Experiments show that the proposed model has a better understanding of the face and achieves excellent performance for all tasks.
arXiv Detail & Related papers (2023-08-22T15:38:39Z) - Simultaneous face detection and 360 degree headpose estimation [0.0]
We propose the Multitask-Net model to leverage the features extracted from the face detection model.
Applying the multitask learning method, the Multitask-Net model can simultaneously predict the position and direction of the human head.
arXiv Detail & Related papers (2021-11-23T01:56:10Z) - Towards a Real-Time Facial Analysis System [13.649384403827359]
We present a system-level design of a real-time facial analysis system.
With a collection of deep neural networks for object detection, classification, and regression, the system recognizes age, gender, facial expression, and facial similarity for each person that appears in the camera view.
Results on common off-the-shelf architecture show that the system's accuracy is comparable to the state-of-the-art methods, and the recognition speed satisfies real-time requirements.
arXiv Detail & Related papers (2021-09-21T18:27:15Z) - Distribution Matching for Heterogeneous Multi-Task Learning: a
Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z) - MAFER: a Multi-resolution Approach to Facial Expression Recognition [9.878384185493623]
We propose a two-step learning procedure, named MAFER, to train Deep Learning models tasked with recognizing facial expressions.
A relevant feature of MAFER is that it is task-agnostic, i.e., it can be used complementarily to other objective-related techniques.
arXiv Detail & Related papers (2021-05-06T07:26:58Z) - Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.