Attentive Cross-modal Connections for Deep Multimodal Wearable-based
Emotion Recognition
- URL: http://arxiv.org/abs/2108.02241v1
- Date: Wed, 4 Aug 2021 18:40:32 GMT
- Title: Attentive Cross-modal Connections for Deep Multimodal Wearable-based
Emotion Recognition
- Authors: Anubhav Bhatti, Behnam Behinaein, Dirk Rodenburg, Paul Hungler, Ali
Etemad
- Abstract summary: We present a novel attentive cross-modal connection to share information between convolutional neural networks.
Specifically, these connections improve emotion classification by sharing intermediate representations between EDA and ECG.
Our experiments show that the proposed approach is capable of learning strong multimodal representations and outperforms a number of baseline methods.
- Score: 7.559720049837459
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Classification of human emotions can play an essential role in the design and
improvement of human-machine systems. While individual biological signals such
as Electrocardiogram (ECG) and Electrodermal Activity (EDA) have been widely
used for emotion recognition with machine learning methods, multimodal
approaches generally fuse extracted features or final classification/regression
results to boost performance. To enhance multimodal learning, we present a
novel attentive cross-modal connection to share information between
convolutional neural networks responsible for learning individual modalities.
Specifically, these connections improve emotion classification by sharing
intermediate representations between EDA and ECG and applying attention weights
to the shared information, thus learning more effective multimodal embeddings. We
perform experiments on the WESAD dataset to identify the best configuration of
the proposed method for emotion classification. Our experiments show that the
proposed approach is capable of learning strong multimodal representations and
outperforms a number of baseline methods.
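The mechanism described in the abstract is essentially a gated exchange of intermediate feature maps between two signal-specific CNN branches. Below is a minimal PyTorch sketch of that idea; the layer sizes, the channel-wise sigmoid attention gate, and the additive fusion are illustrative assumptions rather than the authors' exact configuration.
```python
# Minimal sketch (PyTorch) of an attentive cross-modal connection between two
# 1-D CNN branches, one per physiological signal. Layer sizes, the sigmoid
# attention gate, and the fusion-by-addition step are illustrative assumptions.
import torch
import torch.nn as nn


class AttentiveCrossModalConnection(nn.Module):
    """Shares an intermediate feature map with the other branch,
    scaled by learned attention weights."""

    def __init__(self, channels: int):
        super().__init__()
        # Channel-wise attention over the features being shared (assumption).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, src: torch.Tensor, dst: torch.Tensor) -> torch.Tensor:
        # Weight the source branch's features, then add them to the destination.
        return dst + self.attn(src) * src


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=7, padding=3),
        nn.BatchNorm1d(out_ch),
        nn.ReLU(),
        nn.MaxPool1d(2),
    )


class TwoBranchEmotionNet(nn.Module):
    def __init__(self, num_classes: int = 3, channels: int = 32):
        super().__init__()
        self.ecg_block1, self.eda_block1 = conv_block(1, channels), conv_block(1, channels)
        self.ecg_block2, self.eda_block2 = conv_block(channels, channels), conv_block(channels, channels)
        self.ecg_to_eda = AttentiveCrossModalConnection(channels)
        self.eda_to_ecg = AttentiveCrossModalConnection(channels)
        self.classifier = nn.Linear(2 * channels, num_classes)

    def forward(self, ecg: torch.Tensor, eda: torch.Tensor) -> torch.Tensor:
        ecg_f, eda_f = self.ecg_block1(ecg), self.eda_block1(eda)
        # Exchange attention-weighted intermediate representations.
        ecg_f, eda_f = self.eda_to_ecg(eda_f, ecg_f), self.ecg_to_eda(ecg_f, eda_f)
        ecg_f, eda_f = self.ecg_block2(ecg_f), self.eda_block2(eda_f)
        emb = torch.cat([ecg_f.mean(dim=-1), eda_f.mean(dim=-1)], dim=-1)
        return self.classifier(emb)


if __name__ == "__main__":
    ecg = torch.randn(8, 1, 700)  # batch of 8 single-channel windows (hypothetical length)
    eda = torch.randn(8, 1, 700)
    print(TwoBranchEmotionNet()(ecg, eda).shape)  # torch.Size([8, 3])
```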
Related papers
- Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition [23.505616142198487]
We develop the Multimodal Mood Reader, a pre-trained-model-based approach for cross-subject emotion recognition.
The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset.
Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks.
arXiv Detail & Related papers (2024-05-28T14:31:11Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate improving multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
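The key-based cross-attention fusion mentioned in the JMT entry above can be illustrated in general form as one modality's tokens attending over another's. The following PyTorch sketch is a generic cross-attention block, not the paper's exact architecture; the dimensions, head count, and residual wiring are assumptions.
```python
# Generic cross-attention fusion sketch (PyTorch): one modality provides the
# queries, the other provides keys and values. All sizes are illustrative.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_tokens: torch.Tensor, context_tokens: torch.Tensor) -> torch.Tensor:
        # Each query token attends over the other modality's token sequence.
        attended, _ = self.attn(query_tokens, context_tokens, context_tokens)
        return self.norm(query_tokens + attended)  # residual connection


audio = torch.randn(2, 50, 128)  # (batch, audio tokens, dim)
video = torch.randn(2, 16, 128)  # (batch, video tokens, dim)
fused_audio = CrossAttentionFusion()(audio, video)  # audio queries attend to video
print(fused_audio.shape)  # torch.Size([2, 50, 128])
```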
- Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training [0.850206009406913]
MViTac is a novel methodology that leverages contrastive learning to integrate vision and touch sensations in a self-supervised fashion.
By drawing on both sensory inputs, MViTac leverages intra- and inter-modality losses to learn representations, resulting in enhanced material property classification and more accurate grasping prediction.
arXiv Detail & Related papers (2024-01-22T15:11:57Z)
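The intra- and inter-modality losses referenced in the MViTac entry above are typically InfoNCE-style contrastive objectives. The sketch below is a generic illustration under that assumption; the temperature, embedding sizes, and batch layout are hypothetical.
```python
# Sketch of InfoNCE-style intra- and inter-modality contrastive losses, in the
# spirit of the vision-touch pre-training described above (assumed form).
import torch
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """a[i] and b[i] are embeddings of the same sample (positives);
    all other pairs in the batch act as negatives."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature   # cosine similarities
    targets = torch.arange(a.size(0))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)


# Hypothetical encoder outputs: two augmented views per modality.
vision_v1, vision_v2 = torch.randn(16, 256), torch.randn(16, 256)
touch_v1, touch_v2 = torch.randn(16, 256), torch.randn(16, 256)

intra = info_nce(vision_v1, vision_v2) + info_nce(touch_v1, touch_v2)
inter = info_nce(vision_v1, touch_v1)  # align modalities of the same sample
loss = intra + inter
```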
- Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals [7.293063257956068]
We propose a hypercomplex multimodal network equipped with a novel fusion module comprising parameterized hypercomplex multiplications.
We perform classification of valence and arousal from electroencephalogram (EEG) and peripheral physiological signals, employing the publicly available database MAHNOB-HCI.
arXiv Detail & Related papers (2023-10-11T16:45:44Z)
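A parameterized hypercomplex multiplication (PHM) layer, the building block named in the entry above, replaces a dense weight matrix with a sum of Kronecker products. The sketch below shows the general construction; n = 4, the layer sizes, and the EEG/peripheral fusion example are illustrative assumptions, not the paper's exact module.
```python
# Sketch of a parameterized hypercomplex multiplication (PHM) layer: the weight
# matrix is built as W = sum_i kron(A_i, S_i), using roughly 1/n of the
# parameters of a dense layer. Sizes below are illustrative assumptions.
import torch
import torch.nn as nn


class PHMLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, n: int = 4):
        super().__init__()
        assert in_features % n == 0 and out_features % n == 0
        self.n = n
        # Learned "algebra" matrices and reduced weight blocks.
        self.A = nn.Parameter(torch.randn(n, n, n) * 0.1)
        self.S = nn.Parameter(torch.randn(n, out_features // n, in_features // n) * 0.1)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W has shape (out_features, in_features).
        W = sum(torch.kron(self.A[i], self.S[i]) for i in range(self.n))
        return x @ W.t() + self.bias


# Hypothetical fusion: concatenate EEG and peripheral-signal embeddings,
# then mix them with a PHM layer instead of a dense layer.
eeg, peripheral = torch.randn(8, 64), torch.randn(8, 64)
fused = PHMLinear(128, 64)(torch.cat([eeg, peripheral], dim=-1))
print(fused.shape)  # torch.Size([8, 64])
```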
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
- MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition [118.73025093045652]
We propose a pre-training model, MEmoBERT, for multimodal emotion recognition.
Unlike the conventional "pre-train, finetune" paradigm, we propose a prompt-based method that reformulates the downstream emotion classification task as a masked text prediction.
Our proposed MEmoBERT significantly enhances emotion recognition performance.
arXiv Detail & Related papers (2021-10-27T09:57:00Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online multi-modal graph network (MRG-Net) that dynamically integrates visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
- Cross-individual Recognition of Emotions by a Dynamic Entropy based on Pattern Learning with EEG features [2.863100352151122]
We propose a deep-learning framework denoted as a dynamic entropy-based pattern learning (DEPL) to abstract informative indicators pertaining to the neurophysiological features among multiple individuals.
DEPL enhanced the capability of representations generated by a deep convolutional neural network by modelling the interdependencies between the cortical locations of dynamical entropy-based features.
arXiv Detail & Related papers (2020-09-26T07:22:07Z)
- Multi-Scale Neural network for EEG Representation Learning in BCI [2.105172041656126]
We propose a novel deep multi-scale neural network that discovers feature representations in multiple frequency/time ranges.
By representing EEG signals with spectral-temporal information, the proposed method can be utilized for diverse paradigms.
arXiv Detail & Related papers (2020-03-02T04:06:47Z)
- Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
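The kernel-sharing idea in the entry above can be made concrete as a single convolution reused by both modality streams. In the sketch below, the shared kernels come straight from the summary, while the modality-specific BatchNorm split is an assumption added for illustration; the knowledge-distillation objective is omitted.
```python
# Sketch of cross-modality parameter sharing: CT and MRI streams reuse the same
# convolutional kernels, while normalization layers stay modality-specific
# (the per-modality BatchNorm split is an assumption for illustration).
import torch
import torch.nn as nn


class SharedKernelBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)  # shared kernels
        self.norm = nn.ModuleDict({                                      # modality-specific
            "ct": nn.BatchNorm2d(out_ch),
            "mri": nn.BatchNorm2d(out_ch),
        })

    def forward(self, x: torch.Tensor, modality: str) -> torch.Tensor:
        return torch.relu(self.norm[modality](self.conv(x)))


block = SharedKernelBlock(1, 16)
ct_feat = block(torch.randn(2, 1, 96, 96), "ct")
mri_feat = block(torch.randn(2, 1, 96, 96), "mri")
print(ct_feat.shape, mri_feat.shape)  # same shapes, produced by the same conv weights
```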
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.