Interpretable Multimodal Emotion Recognition using Facial Features and Physiological Signals
- URL: http://arxiv.org/abs/2306.02845v1
- Date: Mon, 5 Jun 2023 12:57:07 GMT
- Title: Interpretable Multimodal Emotion Recognition using Facial Features and Physiological Signals
- Authors: Puneet Kumar and Xiaobai Li
- Abstract summary: It introduces a multimodal framework for emotion understanding by fusing information from visual facial features and rPPG signals extracted from the input videos.
An interpretability technique based on permutation feature importance analysis has also been implemented.
- Score: 16.549488750320336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper aims to demonstrate the importance and feasibility of fusing
multimodal information for emotion recognition. It introduces a multimodal
framework for emotion understanding by fusing the information from visual
facial features and rPPG signals extracted from the input videos. An
interpretability technique based on permutation feature importance analysis has
also been implemented to compute the contributions of rPPG and visual
modalities toward classifying a given input video into a particular emotion
class. The experiments on the IEMOCAP dataset demonstrate that the emotion
classification performance improves by combining the complementary information
from multiple modalities.
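To make the interpretability step concrete, below is a minimal sketch of permutation-based modality importance. The data, feature dimensions, and stand-in classifier are all illustrative assumptions; the paper's actual rPPG/visual features and fusion model are not reproduced here.

```python
# Minimal sketch of permutation-based modality importance.
# Stand-in data and classifier; the paper's actual rPPG/visual
# features and fusion model are NOT reproduced here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n, d_vis, d_rppg = 400, 32, 8                # hypothetical sizes
X_vis = rng.normal(size=(n, d_vis))          # visual facial features
X_rppg = rng.normal(size=(n, d_rppg))        # rPPG features
y = (X_vis[:, 0] + 0.5 * X_rppg[:, 0] > 0).astype(int)  # toy labels

X = np.hstack([X_vis, X_rppg])
clf = LogisticRegression().fit(X, y)         # stand-in fusion classifier
base = accuracy_score(y, clf.predict(X))

def modality_importance(cols, repeats=20):
    """Mean accuracy drop when one modality's columns are shuffled."""
    drops = []
    for _ in range(repeats):
        Xp = X.copy()
        Xp[:, cols] = Xp[rng.permutation(n)][:, cols]  # break link to y
        drops.append(base - accuracy_score(y, clf.predict(Xp)))
    return float(np.mean(drops))

vis_cols = np.arange(d_vis)
rppg_cols = np.arange(d_vis, d_vis + d_rppg)
print("visual importance:", modality_importance(vis_cols))
print("rPPG importance:  ", modality_importance(rppg_cols))
```

Shuffling one modality's features across samples severs their relationship to the labels, so the resulting accuracy drop estimates that modality's contribution to the prediction.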
Related papers
- Milmer: a Framework for Multiple Instance Learning based Multimodal Emotion Recognition [16.616341358877243]
This study addresses the challenges of emotion recognition by integrating facial expression analysis with electroencephalogram (EEG) signals.
The proposed framework employs a transformer-based fusion approach to effectively integrate visual and physiological modalities.
A key innovation of this work is the adoption of a multiple instance learning (MIL) approach, which extracts meaningful information from multiple facial expression images.
arXiv Detail & Related papers (2025-02-01T20:32:57Z)
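As a rough illustration of the multiple instance learning idea in the Milmer entry above, the sketch below pools a bag of facial-frame embeddings with attention and concatenates the result with an EEG feature vector. All dimensions and the final linear head are hypothetical; Milmer's transformer-based fusion is not reproduced.

```python
# Minimal sketch of attention-based MIL pooling over a bag of facial-frame
# embeddings, followed by late concatenation with an EEG feature vector.
# Sizes and the classifier head are hypothetical stand-ins.
import torch
import torch.nn as nn

class MILPooling(nn.Module):
    def __init__(self, d_frame=512, d_attn=128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(d_frame, d_attn), nn.Tanh(), nn.Linear(d_attn, 1))

    def forward(self, bag):                        # bag: (n_frames, d_frame)
        w = torch.softmax(self.score(bag), dim=0)  # attention over instances
        return (w * bag).sum(dim=0)                # bag-level embedding

frames = torch.randn(16, 512)   # 16 facial-frame embeddings (stand-in)
eeg = torch.randn(64)           # one EEG feature vector (stand-in)
pooled = MILPooling()(frames)
logits = nn.Linear(512 + 64, 7)(torch.cat([pooled, eeg]))  # 7 emotion classes
```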
- Enriching Multimodal Sentiment Analysis through Textual Emotional Descriptions of Visual-Audio Content [56.62027582702816]
Multimodal Sentiment Analysis seeks to unravel human emotions by amalgamating text, audio, and visual data.
Yet, discerning subtle emotional nuances within audio and video expressions poses a formidable challenge.
We introduce DEVA, a progressive fusion framework founded on textual sentiment descriptions.
arXiv Detail & Related papers (2024-12-12T11:30:41Z)
- PSVMA+: Exploring Multi-granularity Semantic-visual Adaption for Generalized Zero-shot Learning [116.33775552866476]
Generalized zero-shot learning (GZSL) endeavors to identify the unseen using knowledge from the seen domain.
GZSL suffers from insufficient visual-semantic correspondences due to attribute diversity and instance diversity.
We propose a multi-granularity progressive semantic-visual adaption network, where sufficient visual elements can be gathered to remedy the inconsistency.
arXiv Detail & Related papers (2024-10-15T12:49:33Z)
- Adversarial Representation with Intra-Modal and Inter-Modal Graph Contrastive Learning for Multimodal Emotion Recognition [14.639340916340801]
We propose a novel Adversarial Representation with Intra-Modal and Inter-Modal Graph Contrastive Learning (AR-IIGCN) method for multimodal emotion recognition.
Firstly, we input video, audio, and text features into a multi-layer perceptron (MLP) to map them into separate feature spaces.
Secondly, we build a generator and a discriminator for the three modal features through adversarial representation.
Thirdly, we introduce contrastive graph representation learning to capture intra-modal and inter-modal complementary semantic information.
arXiv Detail & Related papers (2023-12-28T01:57:26Z)
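The inter-modal contrastive step in the AR-IIGCN entry above can be illustrated with a standard InfoNCE loss that pulls paired embeddings from two modalities together and pushes mismatched pairs apart. The graph construction and the adversarial generator/discriminator are omitted, and all sizes are hypothetical.

```python
# Minimal sketch of an inter-modal contrastive (InfoNCE) loss: the i-th
# video embedding and the i-th audio embedding form the positive pair.
# AR-IIGCN's graph and adversarial components are NOT reproduced.
import torch
import torch.nn.functional as F

def inter_modal_nce(za, zb, tau=0.1):
    """za, zb: (batch, dim) paired embeddings from two modalities."""
    za, zb = F.normalize(za, dim=1), F.normalize(zb, dim=1)
    logits = za @ zb.t() / tau             # cosine similarity matrix
    targets = torch.arange(za.size(0))     # diagonal entries are positives
    return F.cross_entropy(logits, targets)

video = torch.randn(8, 256)   # per-utterance embeddings after the MLPs
audio = torch.randn(8, 256)
loss = inter_modal_nce(video, audio)
```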
- EMERSK -- Explainable Multimodal Emotion Recognition with Situational Knowledge [0.0]
We present Explainable Multimodal Emotion Recognition with Situational Knowledge (EMERSK).
EMERSK is a general system for human emotion recognition and explanation using visual information.
Our system can handle multiple modalities, including facial expressions, posture, and gait, in a flexible and modular manner.
arXiv Detail & Related papers (2023-06-14T17:52:37Z)
- Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data [15.676632465869346]
A new interpretability technique has been developed to identify the important speech & image features leading to the prediction of particular emotion classes.
The proposed system has achieved 83.29% accuracy for emotion recognition.
arXiv Detail & Related papers (2022-08-25T04:43:34Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
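A minimal sketch of the late-fusion idea from the entry above: class probabilities produced by separately fine-tuned speech and text models are concatenated and passed to a small learned fusion head. Random tensors stand in for the backbone outputs, and the fusion head is a hypothetical choice.

```python
# Minimal sketch of late fusion: per-modality class probabilities are
# concatenated and re-scored by a small learned head. The speaker-recognition
# and BERT backbones are stood in by random tensors.
import torch
import torch.nn as nn

n_classes = 4                                    # e.g. a 4-class IEMOCAP setup
speech_probs = torch.softmax(torch.randn(8, n_classes), dim=1)  # stand-in
text_probs = torch.softmax(torch.randn(8, n_classes), dim=1)    # stand-in

fusion = nn.Linear(2 * n_classes, n_classes)     # learned late-fusion head
logits = fusion(torch.cat([speech_probs, text_probs], dim=1))
pred = logits.argmax(dim=1)                      # fused emotion predictions
```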
- Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features [11.48760300147023]
This paper introduces a novel method called the Multi-modAl Text Recognition Network (MATRN).
MATRN identifies visual and semantic feature pairs and encodes spatial information into semantic features.
Our experiments demonstrate that MATRN achieves state-of-the-art performance on seven benchmarks by large margins.
arXiv Detail & Related papers (2021-11-30T10:22:11Z)
- MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition [118.73025093045652]
We propose a pre-training model, MEmoBERT, for multimodal emotion recognition.
Unlike the conventional "pre-train, fine-tune" paradigm, we propose a prompt-based method that reformulates the downstream emotion classification task as masked text prediction.
Our proposed MEmoBERT significantly enhances emotion recognition performance.
arXiv Detail & Related papers (2021-10-27T09:57:00Z)
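The prompt-based reformulation in the MEmoBERT entry above can be sketched with a plain text-only masked language model: emotion classification becomes predicting an emotion word at a mask position. The prompt wording and emotion vocabulary are hypothetical, and generic bert-base-uncased stands in for the multimodal MEmoBERT.

```python
# Minimal sketch of prompt-based emotion classification as masked-word
# prediction. bert-base-uncased is a text-only stand-in for MEmoBERT;
# the prompt and emotion word list are hypothetical.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

utterance = "I can't believe we finally won the game!"
prompt = f"{utterance} I am {tok.mask_token}."
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    logits = mlm(**inputs).logits
mask_pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero().item()

# Score only a small emotion vocabulary at the mask position.
emotions = ["happy", "sad", "angry", "neutral"]
ids = [tok.convert_tokens_to_ids(w) for w in emotions]
scores = logits[0, mask_pos, ids].softmax(dim=0)
print(dict(zip(emotions, scores.tolist())))
```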
- Emotion pattern detection on facial videos using functional statistics [62.997667081978825]
We propose a technique based on Functional ANOVA to extract significant patterns of facial muscle movements.
We determine whether there are time-related differences in expressions among emotional groups by using a functional F-test.
arXiv Detail & Related papers (2021-03-01T08:31:08Z)
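To illustrate the functional-statistics entry above, the sketch below runs a pointwise one-way ANOVA over synthetic face-movement trajectories. A true functional F-test aggregates evidence over the whole curve, so this pointwise version is only an approximation on made-up data.

```python
# Minimal sketch: pointwise one-way ANOVA across time on synthetic
# face-movement curves for three emotion groups. A proper functional
# F-test pools evidence over the entire curve; this is an approximation.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 100)                       # normalized time
happy = np.sin(2 * np.pi * t) + rng.normal(0, 0.3, (20, 100))
sad = 0.3 * np.sin(2 * np.pi * t) + rng.normal(0, 0.3, (20, 100))
angry = np.cos(2 * np.pi * t) + rng.normal(0, 0.3, (20, 100))

p = np.array([f_oneway(happy[:, i], sad[:, i], angry[:, i]).pvalue
              for i in range(len(t))])
print("fraction of time points with p < 0.01:", float((p < 0.01).mean()))
```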
- Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that a machine be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
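A minimal sketch of the two-stage pipeline in the entry above: bottleneck features, here random stand-ins for the convolutional autoencoder's embeddings, feed a Support Vector Regressor that predicts a continuous affect value such as valence.

```python
# Minimal sketch of the two-stage idea: autoencoder bottleneck features
# feed an SVR predicting a continuous affect value (e.g. valence).
# Random vectors stand in for the learned convolutional embeddings.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
Z = rng.normal(size=(300, 64))          # stand-in bottleneck embeddings
valence = np.tanh(Z[:, 0] + 0.5 * Z[:, 1]) + rng.normal(0, 0.05, 300)

svr = SVR(kernel="rbf").fit(Z[:250], valence[:250])   # train on 250 samples
print("held-out R^2:", svr.score(Z[250:], valence[250:]))
```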
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.