Machine vision detection to daily facial fatigue with a nonlocal 3D
attention network
- URL: http://arxiv.org/abs/2104.10420v1
- Date: Wed, 21 Apr 2021 08:58:46 GMT
- Title: Machine vision detection to daily facial fatigue with a nonlocal 3D
attention network
- Authors: Zeyu Chen, Xinhang Zhang, Juan Li, Jingxuan Ni, Gang Chen, Shaohua
Wang, Fangfang Fan, Changfeng Charles Wang, Xiaotao Li
- Abstract summary: This paper provides a dataset named DLFD (daily-life fatigue dataset) that reflects people's facial fatigue state in the wild.
A framework using 3D-ResNet with a non-local attention mechanism was trained to extract local and long-range features in the spatial and temporal dimensions.
The proposed framework reached an average accuracy of 90.8% on the validation set and 72.5% on the test set for binary classification, which compares well with other state-of-the-art methods.
- Score: 10.483447243772128
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Fatigue detection helps people maintain mental health and prevent
safety accidents. However, detecting facial fatigue, especially mild fatigue, in
the real world via machine vision remains challenging owing to the lack of
non-laboratory datasets and well-defined algorithms. To improve the detection
of facial fatigue in a way that can be used widely in daily life, this paper
provides an audiovisual dataset named DLFD (daily-life fatigue dataset) that
reflects people's facial fatigue state in the wild. A framework using
3D-ResNet with a non-local attention mechanism was trained to extract
local and long-range features in the spatial and temporal dimensions. Then, a
compact loss function combining mean squared error and cross-entropy was
designed to predict both continuous and categorical fatigue degrees. The
proposed framework reached an average accuracy of 90.8% on the validation set
and 72.5% on the test set for binary classification, which compares well
with other state-of-the-art methods. Feature map visualization
revealed that the framework captured facial dynamics and
attempted to connect them with the fatigue state. Experimental results on
multiple metrics showed that the framework captured some typical, micro and
dynamic facial features along the spatiotemporal dimensions, contributing to
mild fatigue detection in the wild.
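The abstract describes a loss that combines mean squared error (for the continuous fatigue degree) with cross-entropy (for the categorical fatigue class). A minimal sketch of one way such a combination could look is given below. The weighting factor `alpha`, the function names, and the per-sample tuple layout are assumptions made for illustration; the abstract does not state how the two terms are weighted or implemented.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


def squared_error(pred_score, true_score):
    """Squared error between predicted and annotated fatigue degree."""
    return (pred_score - true_score) ** 2


def combined_loss(batch, alpha=0.5):
    """Weighted sum of MSE and cross-entropy, averaged over the batch.

    batch: list of (pred_score, true_score, logits, true_label) tuples.
    alpha: hypothetical weight on the regression term (not from the paper).
    """
    total = 0.0
    for pred_score, true_score, logits, label in batch:
        regression = squared_error(pred_score, true_score)
        classification = cross_entropy(logits, label)
        total += alpha * regression + (1.0 - alpha) * classification
    return total / len(batch)


# One sample: perfect regression, uninformative two-class logits.
example = [(0.5, 0.5, [0.0, 0.0], 0)]
print(combined_loss(example, alpha=0.5))
```

With `alpha=1.0` the sketch reduces to plain mean squared error, and with `alpha=0.0` to plain cross-entropy, so the single parameter trades off the continuous and categorical objectives.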
Related papers
- Uncertainty Estimation for 3D Object Detection via Evidential Learning [63.61283174146648]
We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector.
We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections.
arXiv Detail & Related papers (2024-10-31T13:13:32Z)
- Exploring Decision-based Black-box Attacks on Face Forgery Detection [53.181920529225906]
Face forgery generation technologies generate vivid faces, which have raised public concerns about security and privacy.
Although face forgery detection has successfully distinguished fake faces, recent studies have demonstrated that face forgery detectors are very vulnerable to adversarial examples.
arXiv Detail & Related papers (2023-10-18T14:49:54Z)
- SleepyWheels: An Ensemble Model for Drowsiness Detection leading to
Accident Prevention [0.0]
SleepyWheels is a revolutionary method that uses a lightweight neural network in conjunction with facial landmark identification.
The model is trained on a specially created dataset on driver sleepiness and it achieves an accuracy of 97 percent.
arXiv Detail & Related papers (2022-11-01T19:36:47Z)
- MOS: A Low Latency and Lightweight Framework for Face Detection,
Landmark Localization, and Head Pose Estimation [37.537102697992395]
We propose a low latency and lightweight network for simultaneous face detection, landmark localization and head pose estimation.
Inspired by the observation that it is more challenging to locate the facial landmarks for faces with large angles, a pose loss is proposed to constrain the learning.
We also propose an uncertainty multi-task loss to learn the weights of individual tasks automatically.
arXiv Detail & Related papers (2021-10-21T08:05:53Z)
- Understanding Cognitive Fatigue from fMRI Scans with Self-supervised
Learning [0.0]
This paper proposes dividing the state of cognitive fatigue into six different levels, ranging from no-fatigue to extreme fatigue conditions.
We built a spatio-temporal model that uses convolutional neural networks (CNN) for spatial feature extraction and a long short-term memory (LSTM) network for temporal modeling of 4D fMRI scans.
This method establishes a state-of-the-art technique to analyze cognitive fatigue from fMRI data and beats previous approaches to solve this problem.
arXiv Detail & Related papers (2021-06-28T22:38:51Z)
- Progressive Spatio-Temporal Bilinear Network with Monte Carlo Dropout
for Landmark-based Facial Expression Recognition with Uncertainty Estimation [93.73198973454944]
The performance of our method is evaluated on three widely used datasets.
It is comparable to that of video-based state-of-the-art methods while it has much less complexity.
arXiv Detail & Related papers (2021-06-08T13:40:30Z)
- Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in
Frequency Domain [88.7339322596758]
We present a novel Spatial-Phase Shallow Learning (SPSL) method, which combines spatial image and phase spectrum to capture the up-sampling artifacts of face forgery.
SPSL can achieve the state-of-the-art performance on cross-datasets evaluation as well as multi-class classification and obtain comparable results on single dataset evaluation.
arXiv Detail & Related papers (2021-03-02T16:45:08Z)
- AutoHR: A Strong End-to-end Baseline for Remote Heart Rate Measurement
with Neural Searching [76.4844593082362]
We investigate the reason why existing end-to-end networks perform poorly in challenging conditions and establish a strong baseline for remote HR measurement with neural architecture search (NAS).
Comprehensive experiments are performed on three benchmark datasets on both intra-dataset and cross-dataset testing.
arXiv Detail & Related papers (2020-04-26T05:43:21Z)
- 3D Human Pose Estimation using Spatio-Temporal Networks with Explicit
Occlusion Training [40.933783830017035]
Estimating 3D poses from a monocular video is still a challenging task, despite the significant progress that has been made in recent years.
We introduce a spatio-temporal video network for robust 3D human pose estimation.
We apply multi-scale spatial features for 2D joints or keypoints prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate 3D joints or keypoints.
arXiv Detail & Related papers (2020-04-07T09:12:12Z)
- TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for
Real-time Video Facial Expression Recognition [93.0013343535411]
This study explores a novel deep time windowed convolutional neural network design (TimeConvNets) for the purpose of real-time video facial expression recognition.
We show that TimeConvNets can better capture the transient nuances of facial expressions and boost classification accuracy while maintaining a low inference time.
arXiv Detail & Related papers (2020-03-03T20:58:52Z)
- Lossless Attention in Convolutional Networks for Facial Expression
Recognition in the Wild [26.10189921938026]
We propose a Lossless Attention Model (LLAM) for convolutional neural networks (CNN) to extract attention-aware features from faces.
We participate in the seven basic expression classification sub-challenges of FG-2020 Affective Behavior Analysis in-the-wild Challenge.
And we validate our method on the Aff-Wild2 datasets released by the Challenge.
arXiv Detail & Related papers (2020-01-31T14:38:35Z)