Related papers: Multimodal Stress Detection Using Facial Landmarks and Biometric Signals

URL: http://arxiv.org/abs/2311.03606v1
Date: Mon, 6 Nov 2023 23:20:30 GMT
Title: Multimodal Stress Detection Using Facial Landmarks and Biometric Signals
Authors: Majid Hosseini, Morteza Bodaghi, Ravi Teja Bhupatiraju, Anthony Maida, Raju Gottumukkala
Abstract summary: Multi-modal learning aims to capitalize on the strength of each modality rather than relying on a single signal. This paper proposes a multi-modal learning approach for stress detection that integrates facial landmarks and biometric signals.
Score: 1.0124625066746595
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The development of various sensing technologies is improving measurements of stress and the well-being of individuals. Although progress has been made with single signal modalities like wearables and facial emotion recognition, integrating multiple modalities provides a more comprehensive understanding of stress, given that stress manifests differently across different people. Multi-modal learning aims to capitalize on the strength of each modality rather than relying on a single signal. Given the complexity of processing and integrating high-dimensional data from limited subjects, more research is needed. Numerous research efforts have focused on fusing stress and emotion signals at an early stage, e.g., feature-level fusion using basic machine learning methods and 1D-CNN methods. This paper proposes a multi-modal learning approach for stress detection that integrates facial landmarks and biometric signals. We test this multi-modal integration with various early-fusion and late-fusion techniques to integrate the 1D-CNN model from biometric signals and the 2D-CNN using facial landmarks. We evaluate these architectures using a rigorous test of models' generalizability, the leave-one-subject-out mechanism, i.e., all samples related to a single subject are held out from training and used only for testing. Our findings show that late-fusion achieved 94.39% accuracy, and early-fusion surpassed it with a 98.38% accuracy rate. This research contributes valuable insights into enhancing stress detection through a multi-modal approach.
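
The evaluation protocol and the two fusion styles named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy array shapes, and the use of a weighted probability average for late fusion are all assumptions made for illustration, and the paper's CNN models are replaced by plain NumPy arrays.

```python
import numpy as np


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx) for each unique subject.

    All samples of the held-out subject go to the test split, so the model
    is never trained on data from the person it is evaluated on.
    """
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield subject, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def early_fusion(face_features, bio_features):
    """Feature-level fusion: concatenate per-sample feature vectors.

    A single downstream classifier would then be trained on the result.
    """
    return np.concatenate([face_features, bio_features], axis=1)


def late_fusion(face_probs, bio_probs, face_weight=0.5):
    """Decision-level fusion: weighted average of class probabilities.

    Averaging is one common choice (assumed here); each modality has its
    own model, and only their outputs are combined.
    """
    return face_weight * face_probs + (1.0 - face_weight) * bio_probs


if __name__ == "__main__":
    # Hypothetical toy data: 6 samples from 3 subjects.
    subjects = [0, 0, 1, 1, 2, 2]
    face = np.zeros((6, 4))   # e.g. flattened facial-landmark features
    bio = np.ones((6, 3))     # e.g. biometric-signal features

    fused = early_fusion(face, bio)
    for subject, train_idx, test_idx in leave_one_subject_out(subjects):
        print(subject, fused[train_idx].shape, fused[test_idx].shape)
```

In practice the splitting step is also available as `LeaveOneGroupOut` in scikit-learn, with the subject identifiers passed as `groups`.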

Related papers

A Few-Shot Metric Learning Method with Dual-Channel Attention for Cross-Modal Same-Neuron Identification [1.3472715366596661]
We propose a few-shot metric learning method with a dual-channel attention mechanism and a pretrained vision transformer to enable robust cross-modal neuron identification. Experiments on two-photon and fMOST datasets demonstrate superior Top-K accuracy and recall compared to existing methods.
arXiv Detail & Related papers (2025-04-23T08:45:23Z)
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM [51.91311158085973]
Methods for detecting AI-generated media have evolved rapidly. General-purpose detectors based on MLLMs integrate authenticity verification, explainability, and localization capabilities. Ethical and security considerations have emerged as critical global concerns.
arXiv Detail & Related papers (2025-02-07T12:18:20Z)
Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks [49.84182981950623]
Multispectral object detection, utilizing RGB and TIR (thermal infrared) modalities, is widely recognized as a challenging task. It requires not only the effective extraction of features from both modalities and robust fusion strategies, but also the ability to address issues such as spectral discrepancies. We introduce an efficient and easily deployable multispectral object detection framework that can seamlessly optimize high-performing single-modality models.
arXiv Detail & Related papers (2024-11-27T12:18:39Z)
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques [0.5172964916120903]
This survey reviews the most relevant methodologies that use deep learning techniques to automate the cognitive decline estimation task. We discuss the key features and advantages of each modality and methodology, including state-of-the-art approaches like Transformer architecture and foundation models. In most cases, the textual modality achieves the best results and is the most relevant for detecting cognitive decline.
arXiv Detail & Related papers (2024-10-24T17:59:21Z)
Promoting cross-modal representations to improve multimodal foundation models for physiological signals [3.630706646160043]
We use a masked autoencoding objective to pretrain a multimodal model. We show that the model learns representations that can be linearly probed for a diverse set of downstream tasks. We argue that explicit methods for inducing cross-modality may enhance multimodal pretraining strategies.
arXiv Detail & Related papers (2024-10-21T18:47:36Z)
RADAR: Robust Two-stage Modality-incomplete Industrial Anomaly Detection [61.71770293720491]
We propose a novel two-stage Robust modAlity-incomplete fusing and Detecting frAmewoRk, abbreviated as RADAR. Our bootstrapping philosophy is to enhance two stages in MIIAD, improving the robustness of the Multimodal Transformer. Our experimental results demonstrate that the proposed RADAR significantly surpasses conventional MIAD methods in terms of effectiveness and robustness.
arXiv Detail & Related papers (2024-10-02T16:47:55Z)
Advancing Automated Deception Detection: A Multimodal Approach to Feature Extraction and Analysis [0.0]
This research focuses on the extraction and combination of various features to enhance the accuracy of deception detection models. By systematically extracting features from visual, audio, and text data, and experimenting with different combinations, we developed a robust model that achieved an impressive 99% accuracy.
arXiv Detail & Related papers (2024-07-08T14:59:10Z)
Evolving from Single-modal to Multi-modal Facial Deepfake Detection: Progress and Challenges [40.11614155244292]
This survey traces the evolution of deepfake detection from early single-modal methods to sophisticated multi-modal approaches. We present a structured taxonomy of detection techniques and analyze the transition from GAN-based to diffusion model-driven deepfakes.
arXiv Detail & Related papers (2024-06-11T05:48:04Z)
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications. Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders. We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
Diagnosing Alzheimer's Disease using Early-Late Multimodal Data Fusion with Jacobian Maps [1.5501208213584152]
Alzheimer's disease (AD) is a prevalent and debilitating neurodegenerative disorder impacting a large aging population. We propose an efficient early-late fusion (ELF) approach, which leverages a convolutional neural network for automated feature extraction and random forests. To tackle the challenge of detecting subtle changes in brain volume, we transform images into the Jacobian domain (JD).
arXiv Detail & Related papers (2023-10-25T19:02:57Z)
Employing Multimodal Machine Learning for Stress Detection [8.430502131775722]
Mental wellness is one of the most neglected but crucial aspects of today's world. In this work, a multimodal AI-based framework is proposed to monitor a person's working behavior and stress levels.
arXiv Detail & Related papers (2023-06-15T14:34:16Z)
Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks. We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
Multimodal foundation models are better simulators of the human brain [65.10501322822881]
We present a newly-designed multimodal foundation model pre-trained on 15 million image-text pairs. We find that both visual and lingual encoders trained multimodally are more brain-like compared with unimodal ones.
arXiv Detail & Related papers (2022-08-17T12:36:26Z)
Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction [125.18248926508045]
We propose Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and, more importantly, applicable to both multimodal fusion and multitask learning. CEN dynamically exchanges channels between sub-networks of different modalities. For the application of dense image prediction, the validity of CEN is tested in four different scenarios.
arXiv Detail & Related papers (2021-12-04T05:47:54Z)
Multimodal Categorization of Crisis Events in Social Media [81.07061295887172]
We present a new multimodal fusion method that leverages both images and texts as input. In particular, we introduce a cross-attention module that can filter uninformative and misleading components from weak modalities. We show that our method outperforms the unimodal approaches and strong multimodal baselines by a large margin on three crisis-related tasks.
arXiv Detail & Related papers (2020-04-10T06:31:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.