MOS: A Low Latency and Lightweight Framework for Face Detection,
Landmark Localization, and Head Pose Estimation
- URL: http://arxiv.org/abs/2110.10953v2
- Date: Fri, 22 Oct 2021 02:58:19 GMT
- Authors: Yepeng Liu, Zaiwang Gu, Shenghua Gao, Dong Wang, Yusheng Zeng, Jun
Cheng
- Abstract summary: We propose a low latency and lightweight network for simultaneous face detection, landmark localization and head pose estimation.
Inspired by the observation that it is more challenging to locate the facial landmarks for faces with large angles, a pose loss is proposed to constrain the learning.
We also propose an uncertainty multi-task loss to learn the weights of individual tasks automatically.
- Score: 37.537102697992395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the emergence of service robots and surveillance cameras, dynamic face
recognition (DFR) in the wild has received much attention in recent years. Face
detection and head pose estimation are two important steps for DFR. Very often,
the pose is estimated after face detection. However, such sequential
computation leads to higher latency. In this paper, we propose a low latency
and lightweight network for simultaneous face detection, landmark localization
and head pose estimation. Inspired by the observation that it is more
challenging to locate the facial landmarks for faces with large angles, a pose
loss is proposed to constrain the learning. Moreover, we also propose an
uncertainty multi-task loss to learn the weights of individual tasks
automatically. Another challenge is that robots often use low-compute units
such as ARM-based computing cores, so lightweight networks often have to be used
instead of heavy ones, which leads to a performance drop, especially
for small and hard faces. To address this, we propose online feedback sampling to
augment the training samples across different scales, which increases the
diversity of training data automatically. Validation on the commonly used
WIDER FACE, AFLW and AFLW2000 datasets shows that the proposed
method achieves state-of-the-art performance under low computational
resources.
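The abstract does not spell out the form of the uncertainty multi-task loss. As a rough illustration only, the sketch below uses a widely used homoscedastic-uncertainty weighting, in which each task loss L_i is scaled by exp(-s_i) and a penalty s_i is added, with s_i a learnable log-variance. This is an assumed formulation and may differ from the paper's actual loss. The function names, the three example loss values (standing in for detection, landmark and pose), and the plain gradient-descent loop are all hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    Assumed formulation (not taken from the paper): s_i is a learnable
    log-variance, so exp(-s_i) acts as the weight of task i.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("task_losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def log_var_gradients(task_losses, log_vars):
    """Gradient of the combined loss with respect to each s_i."""
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


def fit_log_vars(task_losses, steps=2000, lr=0.05):
    """Gradient descent on the log-variances with the task losses held fixed.

    In real training the task losses change every step; they are frozen
    here only to show where the weights settle.
    """
    log_vars = [0.0] * len(task_losses)
    for _ in range(steps):
        grads = log_var_gradients(task_losses, log_vars)
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars


if __name__ == "__main__":
    # Made-up loss values for detection, landmark and pose.
    losses = [2.0, 0.5, 1.0]
    fitted = fit_log_vars(losses)
    weights = [math.exp(-s) for s in fitted]
    print("learned task weights:", weights)
    print("combined loss:", uncertainty_weighted_loss(losses, fitted))
```

With the losses held fixed, each s_i settles at log(L_i), so a task with a larger loss receives a smaller weight. This is the sense in which the weights are learned rather than hand-tuned.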
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- EfficientSRFace: An Efficient Network with Super-Resolution Enhancement for Accurate Face Detection [18.977044046941813]
In face detection, low-resolution faces, such as numerous small faces of a human group in a crowded scene, are common in dense face prediction tasks.
We develop an efficient detector termed EfficientSRFace by introducing a feature-level super-resolution reconstruction network.
This module plays an auxiliary role during training and can be removed at inference, so it does not increase inference time.
arXiv Detail & Related papers (2023-06-04T06:49:44Z)
- Multi-Agent Semi-Siamese Training for Long-tail and Shallow Face Learning [54.13876727413492]
In many real-world face recognition scenarios, the training dataset is shallow, meaning that only two face images are available for each ID.
With the non-uniform increase of samples, this issue becomes a more general case, a.k.a. long-tail face learning.
Based on Semi-Siamese Training (SST), we introduce an advanced solution named Multi-Agent Semi-Siamese Training (MASST).
MASST includes a probe network and multiple gallery agents, the former aims to encode the probe features, and the latter constitutes a stack of
arXiv Detail & Related papers (2021-05-10T04:57:32Z)
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
- Facial Masks and Soft-Biometrics: Leveraging Face Recognition CNNs for Age and Gender Prediction on Mobile Ocular Images [53.913598771836924]
We address the use of selfie ocular images captured with smartphones to estimate age and gender.
We adapt two existing lightweight CNNs proposed in the context of the ImageNet Challenge.
Some networks are further pre-trained for face recognition, for which very large training databases are available.
arXiv Detail & Related papers (2021-03-31T01:48:29Z)
- An Efficient Multitask Neural Network for Face Alignment, Head Pose Estimation and Face Tracking [9.39854778804018]
We propose an efficient multitask face alignment, face tracking and head pose estimation network (ATPN).
ATPN achieves improved performance compared to previous state-of-the-art methods while having fewer parameters and FLOPs.
arXiv Detail & Related papers (2021-03-13T04:41:15Z)
- Deep Active Shape Model for Face Alignment and Pose Estimation [0.2148535041822524]
Active Shape Model (ASM) is a statistical model of object shapes that represents a target structure.
This paper presents a lightweight Convolutional Neural Network (CNN) architecture with a loss function regularized by ASM for face alignment and head pose estimation in the wild.
arXiv Detail & Related papers (2021-02-27T03:46:54Z)
- The FaceChannel: A Fast & Furious Deep Neural Network for Facial Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a light-weight neural network that has far fewer parameters than common deep neural networks.
We demonstrate how our model achieves performance comparable to, if not better than, the current state of the art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z)
- An Improved Person Re-identification Method by light-weight convolutional neural network [0.0]
Person Re-identification is faced with challenges such as low resolution, varying poses, illumination, background clutter, and occlusion.
The present paper aims to improve Person Re-identification using transfer learning and a verification loss function.
Experiments showed that the proposed model performs better than state-of-the-art methods on the CUHK01 dataset.
arXiv Detail & Related papers (2020-08-21T12:34:15Z)