Exploring a Multimodal Fusion-based Deep Learning Network for Detecting Facial Palsy
- URL: http://arxiv.org/abs/2405.16496v1
- Date: Sun, 26 May 2024 09:16:34 GMT
- Title: Exploring a Multimodal Fusion-based Deep Learning Network for Detecting Facial Palsy
- Authors: Nicole Heng Yim Oo, Min Hun Lee, Jeong Hoon Lim
- Abstract summary: We present a multimodal fusion-based deep learning model that utilizes unstructured data and structured data to detect facial palsy.
Our model slightly improved the precision score to 77.05 at the expense of a decrease in the recall score.
- Score: 3.2381492754749632
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Algorithmic detection of facial palsy offers the potential to improve current practices, which usually involve labor-intensive and subjective assessment by clinicians. In this paper, we present a multimodal fusion-based deep learning model that utilizes unstructured data (i.e., an image frame with facial line segments) and structured data (i.e., features of facial expressions) to detect facial palsy. We then contribute a study analyzing the effect of different data modalities and the benefits of a multimodal fusion-based approach using videos of 21 facial palsy patients. Our experimental results show that among the various data modalities (i.e., unstructured data such as RGB images and images of facial line segments, and structured data such as coordinates of facial landmarks and features of facial expressions), the feed-forward neural network using features of facial expressions achieved the highest precision of 76.22, while the ResNet-based model using images of facial line segments achieved the highest recall of 83.47. When we leveraged both images of facial line segments and features of facial expressions, our multimodal fusion-based deep learning model slightly improved the precision score to 77.05 at the expense of a decrease in the recall score.
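To make the fusion approach concrete, here is a minimal late-fusion sketch in PyTorch: a ResNet branch encodes the line-segment image, a small feed-forward branch encodes the structured expression features, and the concatenated embeddings feed a classifier. The backbone choice, layer sizes, and the feature dimension (n_expr_features=20) are illustrative assumptions, not the authors' exact configuration.
```python
import torch
import torch.nn as nn
from torchvision import models

class LateFusionPalsyNet(nn.Module):
    """Two-branch late fusion: a ResNet encodes the facial line-segment
    image, an MLP encodes structured expression features, and the
    concatenated embeddings are classified."""

    def __init__(self, n_expr_features=20, n_classes=2):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()           # expose the 512-d embedding
        self.image_branch = backbone
        self.feature_branch = nn.Sequential(
            nn.Linear(n_expr_features, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
        )
        self.classifier = nn.Linear(512 + 32, n_classes)

    def forward(self, line_segment_img, expr_features):
        z_img = self.image_branch(line_segment_img)   # (B, 512)
        z_feat = self.feature_branch(expr_features)   # (B, 32)
        return self.classifier(torch.cat([z_img, z_feat], dim=1))

# Example forward pass with dummy inputs.
model = LateFusionPalsyNet()
logits = model(torch.randn(4, 3, 224, 224), torch.randn(4, 20))
```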
Related papers
- CFCPalsy: Facial Image Synthesis with Cross-Fusion Cycle Diffusion Model for Facial Paralysis Individuals [3.2688425993442696]
This study aims to synthesize a high-quality facial paralysis dataset to address the scarcity of such data.
A novel Cross-Fusion Cycle Palsy Expression Generative Model (PalsyCFC), based on a diffusion model, is proposed.
We have qualitatively and quantitatively evaluated the proposed method on commonly used public clinical datasets of facial paralysis.
arXiv Detail & Related papers (2024-09-11T13:46:35Z)
- CIAO! A Contrastive Adaptation Mechanism for Non-Universal Facial Expression Recognition [80.07590100872548]
We propose Contrastive Inhibitory Adaptation (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets.
CIAO improves facial expression recognition performance across six datasets with distinct affective representations.
arXiv Detail & Related papers (2022-08-10T15:46:05Z)
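CIAO's specific inhibitory adaptation mechanism is not detailed in the summary above; as a loose, hedged analogue, the sketch below fine-tunes only a final projection layer on top of a frozen encoder with a supervised contrastive loss, one generic way to adapt an encoder's last layer to a dataset's affective label structure. The encoder here is a hypothetical stand-in, not CIAO's actual backbone.
```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Pull same-label embeddings together and push others apart
    (a Khosla et al.-style supervised contrastive objective)."""
    z = F.normalize(z, dim=1)
    sim = z @ z.T / temperature
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))   # ignore self-pairs
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    # Average log-probability over positives; anchors without positives contribute 0.
    loss = -(log_prob.masked_fill(~pos, 0).sum(1) / pos.sum(1).clamp(min=1))
    return loss.mean()

# Adapt only the last layer: freeze the encoder, train a new projection head.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 512))
for p in encoder.parameters():
    p.requires_grad = False
head = torch.nn.Linear(512, 128)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

images, labels = torch.randn(16, 3, 64, 64), torch.randint(0, 7, (16,))
loss = supervised_contrastive_loss(head(encoder(images)), labels)
opt.zero_grad(); loss.backward(); opt.step()
```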
- Hybrid Facial Expression Recognition (FER2013) Model for Real-Time Emotion Classification and Prediction [0.0]
This paper proposes a hybrid model for facial expression recognition that combines a Deep Convolutional Neural Network (DCNN) with a Haar Cascade architecture.
The objective is to classify real-time and digital facial images into one of the seven facial emotion categories considered.
The experimental results show significantly improved classification performance compared to state-of-the-art results.
arXiv Detail & Related papers (2022-06-19T23:43:41Z)
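A rough sketch of the detection-then-classification pipeline the entry above describes: an OpenCV Haar cascade localizes faces and each crop is passed to a CNN emotion classifier. The classifier below is a small placeholder assuming 48x48 grayscale crops and seven classes (FER2013-style), not the paper's actual DCNN.
```python
import cv2
import numpy as np
import torch
import torch.nn as nn

# Stage 1: Haar cascade face detection (the cascade file ships with OpenCV).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# Stage 2: placeholder CNN over 48x48 grayscale crops.
classifier = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(64 * 12 * 12, 7),  # 7 emotion classes
)

def classify_faces(frame_bgr):
    """Detect faces in a BGR frame and return (box, predicted class) pairs."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    preds = []
    for (x, y, w, h) in boxes:
        crop = cv2.resize(gray[y:y + h, x:x + w], (48, 48)).astype(np.float32)
        tensor = torch.from_numpy(crop / 255.0).view(1, 1, 48, 48)
        preds.append(classifier(tensor).argmax(dim=1).item())
    return list(zip(boxes, preds))
```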
- Research on facial expression recognition based on Multimodal data fusion and neural network [2.5431493111705943]
The algorithm is based on multimodal data: it takes the facial image, its histogram of oriented gradients (HOG), and the facial landmarks as input.
Experimental results show that, benefiting from the complementarity of the modalities, the algorithm achieves substantial improvements in accuracy, robustness, and detection speed.
arXiv Detail & Related papers (2021-09-26T23:45:40Z)
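A brief sketch of how the three inputs named in the entry above might be assembled, assuming scikit-image's HOG implementation and a generic (N, 2) landmark array from any detector; the paper's actual feature extractors and network are not specified here.
```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

def build_multimodal_inputs(gray_face, landmarks):
    """Assemble the three modalities the entry names: the raw face image,
    its HOG descriptor, and flattened facial landmark coordinates."""
    img = resize(gray_face, (96, 96))                         # image modality
    hog_vec = hog(img, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))                     # HOG modality
    lm_vec = np.asarray(landmarks, dtype=np.float32).ravel()  # landmark modality
    # Each modality would feed its own subnetwork before fusion.
    return img[None, ...], hog_vec, lm_vec

face = np.random.rand(128, 128)
pts = np.random.rand(68, 2)   # e.g. 68-point landmarks (an assumption)
img_in, hog_in, lm_in = build_multimodal_inputs(face, pts)
```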
- SynFace: Face Recognition with Synthetic Data [83.15838126703719]
We devise SynFace with identity mixup (IM) and domain mixup (DM) to mitigate the performance gap between models trained on synthetic and on real face data.
We also perform a systematic empirical analysis on synthetic face images to provide insights on how to effectively utilize synthetic data for face recognition.
arXiv Detail & Related papers (2021-08-18T03:41:54Z)
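The identity-mixup idea can be sketched generically as blending random pairs of face images together with their one-hot identity labels; SynFace's actual IM and DM operate inside its training pipeline and differ in detail, so the function below is only an illustration.
```python
import torch

def identity_mixup(images, labels, num_ids, alpha=0.2):
    """Blend random pairs of face images and their one-hot identity
    labels; a generic mixup in the spirit of SynFace's identity mixup."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed_x = lam * images + (1 - lam) * images[perm]
    y = torch.nn.functional.one_hot(labels, num_ids).float()
    mixed_y = lam * y + (1 - lam) * y[perm]
    return mixed_x, mixed_y  # train with soft-label cross-entropy

x, y = torch.randn(8, 3, 112, 112), torch.randint(0, 1000, (8,))
mx, my = identity_mixup(x, y, num_ids=1000)
```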
- Progressive Spatio-Temporal Bilinear Network with Monte Carlo Dropout for Landmark-based Facial Expression Recognition with Uncertainty Estimation [93.73198973454944]
The performance of our method is evaluated on three widely used datasets; it is comparable to that of video-based state-of-the-art methods while having much lower complexity.
arXiv Detail & Related papers (2021-06-08T13:40:30Z)
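A minimal sketch of the Monte Carlo dropout technique named in the title above: dropout is kept active at inference and the spread of repeated stochastic predictions is read as uncertainty. The landmark-based classifier below is a placeholder, not the paper's progressive spatio-temporal bilinear network.
```python
import torch
import torch.nn as nn

# Placeholder landmark-based classifier with a dropout layer.
model = nn.Sequential(
    nn.Linear(68 * 2, 128), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(128, 7),
)

@torch.no_grad()
def mc_dropout_predict(x, n_samples=30):
    """Keep dropout active at inference and average the softmax of
    repeated stochastic passes; the variance estimates uncertainty."""
    model.train()  # enables dropout (no parameters are updated here)
    probs = torch.stack([
        torch.softmax(model(x), dim=1) for _ in range(n_samples)
    ])
    return probs.mean(0), probs.var(0)

landmarks = torch.randn(4, 68 * 2)
mean_p, var_p = mc_dropout_predict(landmarks)
```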
- The FaceChannel: A Fast & Furious Deep Neural Network for Facial Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a light-weight neural network with far fewer parameters than common deep neural networks.
We demonstrate that our model achieves performance comparable to, if not better than, the current state-of-the-art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z)
- Ear2Face: Deep Biometric Modality Mapping [9.560980936110234]
We present an end-to-end deep neural network model that learns a mapping between two biometric modalities, ear and face.
We formulated the problem as a paired image-to-image translation task and collected datasets of ear and face image pairs.
We have achieved very promising results, especially on the FERET dataset, generating visually appealing face images from ear image inputs.
arXiv Detail & Related papers (2020-06-02T21:14:27Z)
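The paired image-to-image formulation in the entry above can be sketched as a tiny encoder-decoder trained with an L1 reconstruction loss on (ear, face) pairs; the actual Ear2Face model is a deeper GAN-style translator, so this is only a minimal supervised stand-in.
```python
import torch
import torch.nn as nn

# Tiny encoder-decoder standing in for a pix2pix-style translator.
translator = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
)
opt = torch.optim.Adam(translator.parameters(), lr=2e-4)

def train_step(ear_batch, face_batch):
    """One supervised step on paired (ear, face) images; real training
    would add an adversarial term on top of this reconstruction loss."""
    pred_face = translator(ear_batch)
    loss = nn.functional.l1_loss(pred_face, face_batch)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

ears = torch.randn(4, 3, 64, 64)
faces = torch.randn(4, 3, 64, 64).clamp(-1, 1)
train_step(ears, faces)
```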
- Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset, whose images were captured with different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
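The RDBP algorithm itself is not specified in the summary above; as a loose illustration of an intra-class objective over real and synthetic samples, the sketch below pulls each synthetic feature toward the (detached) centroid of real features with the same expression label, so that real data guides the synthetic features rather than the reverse.
```python
import torch

def intra_class_alignment_loss(real_feats, real_labels, syn_feats, syn_labels):
    """Pull each synthetic feature toward the centroid of real features
    sharing its class; a loose stand-in for FESGAN's intra-class loss."""
    loss, count = real_feats.new_zeros(()), 0
    for c in syn_labels.unique():
        real_c = real_feats[real_labels == c]
        syn_c = syn_feats[syn_labels == c]
        if len(real_c) == 0 or len(syn_c) == 0:
            continue  # class missing on one side of the batch
        center = real_c.mean(0).detach()  # real features guide, not follow
        loss = loss + ((syn_c - center) ** 2).sum(1).mean()
        count += 1
    return loss / max(count, 1)

rf, rl = torch.randn(16, 128), torch.randint(0, 7, (16,))
sf = torch.randn(16, 128, requires_grad=True)
sl = torch.randint(0, 7, (16,))
intra_class_alignment_loss(rf, rl, sf, sl).backward()
```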
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.