Emotion Recognition with Pre-Trained Transformers Using Multimodal
Signals
- URL: http://arxiv.org/abs/2212.13885v1
- Date: Thu, 22 Dec 2022 14:32:52 GMT
- Title: Emotion Recognition with Pre-Trained Transformers Using Multimodal
Signals
- Authors: Juan Vazquez-Rodriguez (M-PSI), Grégoire Lefebvre, Julien Cumin,
James L Crowley (M-PSI)
- Abstract summary: We demonstrate that a Transformer-based approach is suitable for this task.
We present how such models may be pretrained in a multimodal scenario to improve emotion recognition performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the problem of multimodal emotion recognition from
multiple physiological signals. We demonstrate that a Transformer-based
approach is suitable for this task. In addition, we present how such models may
be pretrained in a multimodal scenario to improve emotion recognition
performance. We evaluate the benefits of using multimodal inputs and
pre-training with our approach on a state-of-the-art dataset.
Related papers
- Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition [52.522244807811894]
We propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities.
Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts.
Through prompt learning, we achieve a substantial reduction in the number of trainable parameters.
arXiv Detail & Related papers (2024-07-07T13:55:56Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
- Bridging Modalities: Knowledge Distillation and Masked Training for Translating Multi-Modal Emotion Recognition to Uni-Modal, Speech-Only Emotion Recognition [0.0]
This paper presents an innovative approach to address the challenges of translating multi-modal emotion recognition models to a more practical uni-modal counterpart.
Recognizing emotions from speech signals is a critical task with applications in human-computer interaction, affective computing, and mental health assessment.
arXiv Detail & Related papers (2024-01-04T22:42:14Z)
- Brain encoding models based on multimodal transformers can transfer across language and vision [60.72020004771044]
We used representations from multimodal transformers to train encoding models that can transfer across fMRI responses to stories and movies.
We found that encoding models trained on brain responses to one modality can successfully predict brain responses to the other modality.
arXiv Detail & Related papers (2023-05-20T17:38:44Z)
- Multi-scale Transformer-based Network for Emotion Recognition from Multi Physiological Signals [11.479653866646762]
This paper presents an efficient Multi-scale Transformer-based approach for the task of Emotion recognition from Physiological data.
Our approach involves applying a Multi-modal technique combined with scaling data to establish the relationship between internal body signals and human emotions.
Our model achieves decent results on the CASE dataset of the EPiC competition, with an RMSE score of 1.45.
arXiv Detail & Related papers (2023-05-01T11:10:48Z)
- Multilevel Transformer For Multimodal Emotion Recognition [6.0149102420697025]
We introduce a novel multi-granularity framework, which combines fine-grained representation with pre-trained utterance-level representation.
Inspired by Transformer TTS, we propose a multilevel transformer model to perform fine-grained multimodal emotion recognition.
arXiv Detail & Related papers (2022-10-26T10:31:24Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
- MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition [118.73025093045652]
We propose a pre-training model, MEmoBERT, for multimodal emotion recognition.
Unlike the conventional "pre-train, finetune" paradigm, we propose a prompt-based method that reformulates the downstream emotion classification task as a masked text prediction.
Our proposed MEmoBERT significantly enhances emotion recognition performance.
arXiv Detail & Related papers (2021-10-27T09:57:00Z)
- Low Rank Fusion based Transformers for Multimodal Sequences [9.507869508188266]
We present two methods for multimodal sentiment and emotion recognition, with results on the CMU-MOSEI, CMU-MOSI, and IEMOCAP datasets.
We show that our models have fewer parameters, train faster, and perform comparably to many larger fusion-based architectures.
arXiv Detail & Related papers (2020-07-04T08:05:40Z)