Transformer-Based Self-Supervised Learning for Emotion Recognition
- URL: http://arxiv.org/abs/2204.05103v1
- Date: Fri, 8 Apr 2022 07:14:55 GMT
- Title: Transformer-Based Self-Supervised Learning for Emotion Recognition
- Authors: Juan Vazquez-Rodriguez (M-PSI), Grégoire Lefebvre, Julien Cumin,
James L. Crowley (M-PSI)
- Abstract summary: We propose to use a Transformer-based model to process electrocardiograms (ECG) for emotion recognition.
To overcome the relatively small size of datasets with emotional labels, we employ self-supervised learning.
We show that our approach reaches state-of-the-art performance for emotion recognition using ECG signals on AMIGOS.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In order to exploit representations of time-series signals, such as
physiological signals, it is essential that these representations capture
relevant information from the whole signal. In this work, we propose to use a
Transformer-based model to process electrocardiograms (ECG) for emotion
recognition. Attention mechanisms of the Transformer can be used to build
contextualized representations for a signal, giving more importance to relevant
parts. These representations may then be processed with a fully-connected
network to predict emotions. To overcome the relatively small size of datasets
with emotional labels, we employ self-supervised learning. We gathered several
ECG datasets with no labels of emotion to pre-train our model, which we then
fine-tuned for emotion recognition on the AMIGOS dataset. We show that our
approach reaches state-of-the-art performance for emotion recognition using
ECG signals on AMIGOS. More generally, our experiments show that transformers
and pre-training are promising strategies for emotion recognition with
physiological signals.
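As a rough sketch of the pipeline the abstract describes (not the authors' released code), the PyTorch fragment below runs a small Transformer encoder over ECG segments and feeds a pooled, contextualized representation to a fully-connected emotion head; the segment length, model sizes, and the masked-segment pre-training mentioned in the comments are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ECGEmotionTransformer(nn.Module):
    """Sketch of the kind of model the abstract describes: a Transformer
    encoder over ECG segments followed by a fully-connected head.
    Hyperparameters are illustrative assumptions."""

    def __init__(self, segment_len=64, d_model=128, n_heads=4,
                 n_layers=4, n_emotions=2):
        super().__init__()
        # Each raw ECG segment (a short window of samples) becomes one token.
        self.embed = nn.Linear(segment_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Fully-connected network on top of the pooled representation.
        self.head = nn.Sequential(nn.Linear(d_model, 64), nn.ReLU(),
                                  nn.Linear(64, n_emotions))

    def forward(self, ecg_segments):
        # ecg_segments: (batch, n_segments, segment_len); positional
        # encodings are omitted here for brevity.
        tokens = self.embed(ecg_segments)
        context = self.encoder(tokens)   # attention builds contextualized tokens
        pooled = context.mean(dim=1)     # summary of the whole signal
        return self.head(pooled)         # emotion logits

# Illustrative use; self-supervised pre-training (e.g. reconstructing masked
# segments of unlabeled ECG) would reuse embed/encoder before fine-tuning
# the head on a labeled dataset such as AMIGOS.
logits = ECGEmotionTransformer()(torch.randn(8, 30, 64))
```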
Related papers
- Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors [63.194053817609024]
We introduce eye behaviors as important emotional cues for the creation of a new Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset.
For the first time, we provide annotations for both Emotion Recognition (ER) and Facial Expression Recognition (FER) in the EMER dataset.
We specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER.
arXiv Detail & Related papers (2024-11-08T04:53:55Z)
- Decoding Human Emotions: Analyzing Multi-Channel EEG Data using LSTM Networks [0.0]
This study aims to understand and improve the predictive accuracy of emotional state classification by applying a Long Short-Term Memory (LSTM) network to analyze EEG signals.
Using a popular dataset of multi-channel EEG recordings known as DEAP, we leverage the ability of LSTM networks to handle temporal dependencies within EEG signal data.
We obtain accuracies of 89.89%, 90.33%, 90.70%, and 90.54% for arousal, valence, dominance, and likeness, respectively, demonstrating significant improvements in emotion recognition model capabilities.
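A minimal sketch of this kind of LSTM classifier over multi-channel EEG windows (the channel count, window length, and binarized per-dimension labels below are assumptions for illustration, not details taken from the paper):

```python
import torch
import torch.nn as nn

class EEGEmotionLSTM(nn.Module):
    """Sketch: LSTM over multi-channel EEG windows, one binary output per
    dimension (arousal, valence, dominance, liking). Sizes are illustrative."""

    def __init__(self, n_channels=32, hidden=128, n_outputs=4):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, eeg):
        # eeg: (batch, time_steps, n_channels), e.g. 32 channels as in DEAP
        _, (h_n, _) = self.lstm(eeg)
        return self.head(h_n[-1])  # logits for the four emotion dimensions

logits = EEGEmotionLSTM()(torch.randn(4, 512, 32))  # 4 trials, 512 time steps
```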
arXiv Detail & Related papers (2024-08-19T18:10:47Z)
- A Unified Transformer-based Network for multimodal Emotion Recognition [4.07926531936425]
We present a Transformer-based method to classify emotions in an arousal-valence space by combining a 2D representation of an ECG signal with face information.
Our model produces results comparable to state-of-the-art techniques.
arXiv Detail & Related papers (2023-08-27T17:30:56Z)
- CIT-EmotionNet: CNN Interactive Transformer Network for EEG Emotion Recognition [6.208851183775046]
We propose a novel CNN Interactive Transformer Network for EEG Emotion Recognition, known as CIT-EmotionNet.
We convert raw EEG signals into spatial-frequency representations, which serve as inputs. Then, we integrate a Convolutional Neural Network (CNN) and a Transformer within a single framework in a parallel manner.
The proposed CIT-EmotionNet outperforms state-of-the-art methods, achieving an average recognition accuracy of 98.57% and 92.09% on two publicly available datasets.
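A loose sketch of the parallel CNN/Transformer idea on a spatial-frequency EEG input; how CIT-EmotionNet actually fuses the two branches is not stated in this summary, so the simple feature concatenation below is an assumption:

```python
import torch
import torch.nn as nn

class ParallelCNNTransformer(nn.Module):
    """Sketch: a CNN branch and a Transformer branch run in parallel over a
    spatial-frequency EEG map, then their features are combined."""

    def __init__(self, n_bands=5, grid=9, d_model=64, n_classes=3):
        super().__init__()
        # CNN branch over the (bands x height x width) spatial-frequency map.
        self.cnn = nn.Sequential(
            nn.Conv2d(n_bands, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())            # -> (batch, 32)
        # Transformer branch over the grid cells treated as tokens.
        self.proj = nn.Linear(n_bands, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(32 + d_model, n_classes)

    def forward(self, x):
        # x: (batch, n_bands, grid, grid) spatial-frequency representation
        cnn_feat = self.cnn(x)
        tokens = self.proj(x.flatten(2).transpose(1, 2))      # one token per cell
        trans_feat = self.transformer(tokens).mean(dim=1)
        return self.head(torch.cat([cnn_feat, trans_feat], dim=1))

logits = ParallelCNNTransformer()(torch.randn(2, 5, 9, 9))
```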
arXiv Detail & Related papers (2023-05-07T16:27:09Z)
- EEG2Vec: Learning Affective EEG Representations via Variational Autoencoders [27.3162026528455]
We explore whether representing neural data, recorded in response to emotional stimuli, in a latent vector space can serve both to predict emotional states and to generate synthetic EEG data.
We propose a conditional variational autoencoder-based framework, EEG2Vec, to learn generative-discriminative representations from EEG data.
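A compact sketch of a conditional VAE used both generatively and discriminatively, in the spirit of EEG2Vec; the feature dimension, latent size, conditioning scheme, and loss weighting are generic cVAE choices rather than the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalVAE(nn.Module):
    """Sketch: encode EEG features to a latent vector conditioned on the
    emotion label, decode back, and also predict the label from the latent."""

    def __init__(self, in_dim=310, latent=32, n_classes=3):
        super().__init__()
        self.n_classes = n_classes
        self.enc = nn.Linear(in_dim + n_classes, 128)
        self.mu, self.logvar = nn.Linear(128, latent), nn.Linear(128, latent)
        self.dec = nn.Sequential(nn.Linear(latent + n_classes, 128), nn.ReLU(),
                                 nn.Linear(128, in_dim))
        self.clf = nn.Linear(latent, n_classes)   # discriminative use of z

    def forward(self, x, y):
        y_onehot = F.one_hot(y, self.n_classes).float()
        h = torch.relu(self.enc(torch.cat([x, y_onehot], dim=1)))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        recon = self.dec(torch.cat([z, y_onehot], dim=1))
        return recon, mu, logvar, self.clf(z)

def cvae_loss(x, y, model, beta=1.0):
    recon, mu, logvar, logits = model(x, y)
    rec = F.mse_loss(recon, x)                                   # reconstruction
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kld + F.cross_entropy(logits, y)         # + classification

loss = cvae_loss(torch.randn(4, 310), torch.tensor([0, 1, 2, 0]), ConditionalVAE())
```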
arXiv Detail & Related papers (2022-07-16T19:25:29Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
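The late-fusion idea can be sketched as follows; the embedding sizes, the transfer-learned speech and BERT-style text backbones implied by the inputs, and the simple concatenation are assumptions made for illustration:

```python
import torch
import torch.nn as nn

class LateFusionEmotion(nn.Module):
    """Sketch: fuse utterance-level embeddings from a speech encoder and a
    text encoder with a small classifier over their concatenation."""

    def __init__(self, speech_dim=512, text_dim=768, n_emotions=4):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(speech_dim + text_dim, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, n_emotions))

    def forward(self, speech_emb, text_emb):
        # speech_emb: output of a transfer-learned speaker-recognition encoder
        # text_emb:   pooled output of a fine-tuned BERT-style encoder
        return self.fusion(torch.cat([speech_emb, text_emb], dim=1))

logits = LateFusionEmotion()(torch.randn(2, 512), torch.randn(2, 768))
```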
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
- Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
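Spectrogram augmentation is commonly done with SpecAugment-style time and frequency masking; the sketch below illustrates that generic idea, and the mask widths (and whether the paper uses exactly this scheme) are assumptions:

```python
import torch

def augment_spectrogram(spec, max_freq_mask=8, max_time_mask=20):
    """Sketch: randomly zero out one frequency band and one time span of a
    (freq_bins, time_steps) spectrogram, SpecAugment-style."""
    spec = spec.clone()
    n_freq, n_time = spec.shape
    f = torch.randint(0, max_freq_mask + 1, (1,)).item()
    f0 = torch.randint(0, max(1, n_freq - f), (1,)).item()
    spec[f0:f0 + f, :] = 0.0                      # frequency mask
    t = torch.randint(0, max_time_mask + 1, (1,)).item()
    t0 = torch.randint(0, max(1, n_time - t), (1,)).item()
    spec[:, t0:t0 + t] = 0.0                      # time mask
    return spec

augmented = augment_spectrogram(torch.randn(128, 300))  # 128 mel bins, 300 frames
```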
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- A Novel Transferability Attention Neural Network Model for EEG Emotion Recognition [51.203579838210885]
We propose a transferable attention neural network (TANN) for EEG emotion recognition.
TANN learns emotional discriminative information by adaptively highlighting transferable EEG brain-region data and samples.
This is implemented by measuring the outputs of multiple brain-region-level discriminators and a single sample-level discriminator.
arXiv Detail & Related papers (2020-09-21T02:42:30Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
SAKDN uses multiple wearable sensors as teacher modalities and RGB videos as the student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
- Self-supervised ECG Representation Learning for Emotion Recognition [25.305949034527202]
We exploit a self-supervised deep multi-task learning framework for electrocardiogram (ECG)-based emotion recognition.
We show that the proposed solution considerably improves the performance compared to a network trained using fully-supervised learning.
arXiv Detail & Related papers (2020-02-04T17:15:37Z)
- Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine be able to recognize the user's emotional state with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)