Multimodal Stress Detection Using Facial Landmarks and Biometric Signals
- URL: http://arxiv.org/abs/2311.03606v1
- Date: Mon, 6 Nov 2023 23:20:30 GMT
- Title: Multimodal Stress Detection Using Facial Landmarks and Biometric Signals
- Authors: Majid Hosseini, Morteza Bodaghi, Ravi Teja Bhupatiraju, Anthony Maida,
Raju Gottumukkala
- Abstract summary: Multi-modal learning aims to capitalize on the strength of each modality rather than relying on a single signal.
This paper proposes a multi-modal learning approach for stress detection that integrates facial landmarks and biometric signals.
- Score: 1.0124625066746595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The development of various sensing technologies is improving measurements of
stress and the well-being of individuals. Although progress has been made with
single signal modalities like wearables and facial emotion recognition,
integrating multiple modalities provides a more comprehensive understanding of
stress, given that stress manifests differently across individuals.
Multi-modal learning aims to capitalize on the strength of each modality rather
than relying on a single signal. Given the complexity of processing and
integrating high-dimensional data from limited subjects, more research is
needed. Numerous research efforts have focused on fusing stress and emotion
signals at an early stage, e.g., feature-level fusion using basic machine
learning methods and 1D-CNN methods. This paper proposes a multi-modal
learning approach for stress detection that integrates facial landmarks and
biometric signals. We test this multi-modal integration with various
early-fusion and late-fusion techniques that combine a 1D-CNN model operating
on biometric signals with a 2D-CNN model operating on facial landmarks. We
evaluate these architectures with a rigorous test of generalizability based on
the leave-one-subject-out protocol, i.e., all samples from a single subject
are withheld from training and used only for testing. Our findings show that
late-fusion achieved 94.39% accuracy, and early-fusion surpassed it with a
98.38% accuracy rate.
This research contributes valuable insights into enhancing stress detection
through a multi-modal approach.
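The abstract describes a 1D-CNN over biometric signals, a 2D-CNN over facial landmarks, early (feature-level) versus late (decision-level) fusion, and a leave-one-subject-out evaluation. The PyTorch sketch below is only an illustration of those ideas under assumed layer widths, kernel sizes, input shapes, and a two-class head; it is not the authors' released architecture.

```python
# Illustrative sketch only: branch widths, kernel sizes, and the two-class head
# are assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn


class BioBranch1D(nn.Module):
    """1D-CNN over a window of biometric signals (batch, channels, time)."""
    def __init__(self, in_channels=4, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, feat_dim, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # (batch, feat_dim, 1)
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)        # (batch, feat_dim)


class FaceBranch2D(nn.Module):
    """2D-CNN over facial landmarks arranged as a 2D map (batch, 1, H, W)."""
    def __init__(self, in_channels=1, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, feat_dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # (batch, feat_dim, 1, 1)
        )

    def forward(self, x):
        return self.net(x).flatten(1)         # (batch, feat_dim)


class EarlyFusion(nn.Module):
    """Feature-level fusion: concatenate branch features, classify jointly."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.bio, self.face = BioBranch1D(), FaceBranch2D()
        self.head = nn.Linear(64 + 64, num_classes)

    def forward(self, bio_x, face_x):
        return self.head(torch.cat([self.bio(bio_x), self.face(face_x)], dim=1))


class LateFusion(nn.Module):
    """Decision-level fusion: classify each modality separately, average logits."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.bio, self.face = BioBranch1D(), FaceBranch2D()
        self.bio_head = nn.Linear(64, num_classes)
        self.face_head = nn.Linear(64, num_classes)

    def forward(self, bio_x, face_x):
        return 0.5 * (self.bio_head(self.bio(bio_x))
                      + self.face_head(self.face(face_x)))


# Leave-one-subject-out evaluation: every sample from one subject is withheld
# from training and used only for testing, e.g. via
# sklearn.model_selection.LeaveOneGroupOut with per-sample subject IDs as groups.
```

In this sketch, early fusion lets a single classifier learn cross-modal interactions from the concatenated features, while late fusion keeps the modality branches independent until their decisions are combined, mirroring the early-/late-fusion comparison reported in the abstract.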
Related papers
- Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques [0.5172964916120903]
This survey reviews the most relevant methodologies that use deep learning techniques to automate the cognitive decline estimation task.
We discuss the key features and advantages of each modality and methodology, including state-of-the-art approaches like Transformer architecture and foundation models.
In most cases, the textual modality achieves the best results and is the most relevant for detecting cognitive decline.
arXiv Detail & Related papers (2024-10-24T17:59:21Z)
- Promoting cross-modal representations to improve multimodal foundation models for physiological signals [3.630706646160043]
We use a masked autoencoding objective to pretrain a multimodal model.
We show that the model learns representations that can be linearly probed for a diverse set of downstream tasks.
We argue that explicit methods for inducing cross-modality may enhance multimodal pretraining strategies.
arXiv Detail & Related papers (2024-10-21T18:47:36Z)
- RADAR: Robust Two-stage Modality-incomplete Industrial Anomaly Detection [61.71770293720491]
We propose a novel two-stage Robust modAlity-incomplete fusing and Detecting frAmewoRk, abbreviated as RADAR.
Our bootstrapping philosophy is to enhance two stages in MIIAD, improving the robustness of the Multimodal Transformer.
Our experimental results demonstrate that the proposed RADAR significantly surpasses conventional MIAD methods in terms of effectiveness and robustness.
arXiv Detail & Related papers (2024-10-02T16:47:55Z)
- Advancing Automated Deception Detection: A Multimodal Approach to Feature Extraction and Analysis [0.0]
This research focuses on the extraction and combination of various features to enhance the accuracy of deception detection models.
By systematically extracting features from visual, audio, and text data, and experimenting with different combinations, we developed a robust model that achieved an impressive 99% accuracy.
arXiv Detail & Related papers (2024-07-08T14:59:10Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Diagnosing Alzheimer's Disease using Early-Late Multimodal Data Fusion with Jacobian Maps [1.5501208213584152]
Alzheimer's disease (AD) is a prevalent and debilitating neurodegenerative disorder impacting a large aging population.
We propose an efficient early-late fusion (ELF) approach, which leverages a convolutional neural network for automated feature extraction and random forests.
To tackle the challenge of detecting subtle changes in brain volume, we transform images into the Jacobian domain (JD).
arXiv Detail & Related papers (2023-10-25T19:02:57Z)
- Employing Multimodal Machine Learning for Stress Detection [8.430502131775722]
Mental wellness is one of the most neglected but crucial aspects of today's world.
In this work, a multimodal AI-based framework is proposed to monitor a person's working behavior and stress levels.
arXiv Detail & Related papers (2023-06-15T14:34:16Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- Multimodal foundation models are better simulators of the human brain [65.10501322822881]
We present a newly-designed multimodal foundation model pre-trained on 15 million image-text pairs.
We find that both visual and lingual encoders trained multimodally are more brain-like compared with unimodal ones.
arXiv Detail & Related papers (2022-08-17T12:36:26Z)
- Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction [125.18248926508045]
We propose Channel-Exchanging-Network (CEN) which is self-adaptive, parameter-free, and more importantly, applicable for both multimodal fusion and multitask learning.
CEN dynamically exchanges channels between subnetworks of different modalities.
For the application of dense image prediction, the validity of CEN is tested by four different scenarios.
arXiv Detail & Related papers (2021-12-04T05:47:54Z)
- Multimodal Categorization of Crisis Events in Social Media [81.07061295887172]
We present a new multimodal fusion method that leverages both images and texts as input.
In particular, we introduce a cross-attention module that can filter uninformative and misleading components from weak modalities.
We show that our method outperforms the unimodal approaches and strong multimodal baselines by a large margin on three crisis-related tasks.
arXiv Detail & Related papers (2020-04-10T06:31:30Z)