Leveraging Vision Transformers for Enhanced Classification of Emotions using ECG Signals
- URL: http://arxiv.org/abs/2510.05826v1
- Date: Tue, 07 Oct 2025 11:49:57 GMT
- Title: Leveraging Vision Transformers for Enhanced Classification of Emotions using ECG Signals
- Authors: Pubudu L. Indrasiri, Bipasha Kashyap, Pubudu N. Pathirana
- Abstract summary: Biomedical signals offer insights into various conditions affecting the human body. ECG data can reveal changes in heart rate variability linked to emotional arousal, stress levels, and autonomic nervous system activity. Recent advancements in the field diverge from conventional approaches by leveraging the power of advanced transformer architectures.
- Score: 1.6018045082682821
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Biomedical signals provide insights into various conditions affecting the human body. Beyond diagnostic capabilities, these signals offer a deeper understanding of how specific organs respond to an individual's emotions and feelings. For instance, ECG data can reveal changes in heart rate variability linked to emotional arousal, stress levels, and autonomic nervous system activity. This data offers a window into the physiological basis of our emotional states. Recent advancements in the field diverge from conventional approaches by leveraging the power of advanced transformer architectures, which surpass traditional machine learning and deep learning methods. We begin by assessing the effectiveness of the Vision Transformer (ViT), a forefront model in image classification, for identifying emotions in imaged ECGs. Following this, we present and evaluate an improved version of ViT, integrating both CNN and squeeze-and-excitation (SE) blocks, aiming to bolster performance on imaged ECGs associated with emotion detection. Our method unfolds in two critical phases: first, we apply advanced preprocessing techniques to purify the signals and convert them into interpretable images using continuous wavelet transform and power spectral density analysis; second, we introduce a performance-boosted vision transformer architecture, enhanced with convolutional neural network components, to tackle the challenges of emotion recognition. Our methodology's robustness was thoroughly tested using ECG data from the YAAD and DREAMER datasets, leading to strong outcomes. For the YAAD dataset, our approach outperformed existing state-of-the-art methods in classifying seven unique emotional states, as well as in valence and arousal classification. Similarly, on the DREAMER dataset, our method excelled in distinguishing between valence, arousal, and dominance, surpassing current leading techniques.
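The two-phase method described in the abstract can be made concrete with short sketches. First, a minimal sketch of the imaging phase: band-pass filtering an ECG segment, then converting it into a time-frequency image with the continuous wavelet transform, plus a Welch power spectral density as the second imaging route mentioned. This assumes the PyWavelets and SciPy packages; the sampling rate, filter band, wavelet choice, and scale count are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of phase one: ECG signal purification and conversion to images.
# Assumes PyWavelets (pywt) and SciPy; all parameters below are illustrative.
import numpy as np
import pywt
from scipy.signal import butter, filtfilt, welch

FS = 256  # assumed sampling rate in Hz

def purify(ecg: np.ndarray, fs: int = FS) -> np.ndarray:
    """Band-pass filter to suppress baseline wander and high-frequency noise."""
    b, a = butter(4, [0.5 / (fs / 2), 40.0 / (fs / 2)], btype="band")
    return filtfilt(b, a, ecg)

def to_scalogram(ecg: np.ndarray, fs: int = FS, n_scales: int = 64) -> np.ndarray:
    """Continuous wavelet transform -> 2D time-frequency image scaled to [0, 1]."""
    scales = np.arange(1, n_scales + 1)
    coeffs, _ = pywt.cwt(ecg, scales, "morl", sampling_period=1.0 / fs)
    img = np.abs(coeffs)
    return (img - img.min()) / (img.max() - img.min() + 1e-8)

def psd(ecg: np.ndarray, fs: int = FS):
    """Welch power spectral density, the second imaging route."""
    return welch(ecg, fs=fs, nperseg=2 * fs)
```

Second, a sketch of the kind of convolutional and squeeze-and-excitation components the improved ViT integrates, written here as a CNN+SE stem that tokenises a scalogram before a standard transformer encoder. Layer sizes and the placement of the SE block are placeholders, not the paper's exact architecture.

```python
# Sketch of phase two: CNN and SE components in front of a ViT-style encoder.
# PyTorch; dimensions are placeholders, not the authors' architecture.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: learn per-channel weights and rescale features."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.fc(x.mean(dim=(2, 3)))     # squeeze: global average pooling
        return x * w[:, :, None, None]      # excite: channel-wise rescaling

class ConvSEStem(nn.Module):
    """CNN + SE front end that produces patch tokens for a transformer."""
    def __init__(self, embed_dim: int = 192, patch: int = 16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            SEBlock(64),
        )
        self.proj = nn.Conv2d(64, embed_dim, kernel_size=patch, stride=patch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.proj(self.features(x))       # (B, D, H/patch, W/patch)
        return x.flatten(2).transpose(1, 2)   # (B, num_tokens, D)
```

In this sketch, the stem's tokens would replace the usual linear patch embedding and feed a standard transformer encoder (e.g. torch.nn.TransformerEncoder); this is one plausible reading of "integrating both CNN and SE blocks", not a reproduction of the authors' model.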
Related papers
- Smile on the Face, Sadness in the Eyes: Bridging the Emotion Gap with a Multimodal Dataset of Eye and Facial Behaviors [49.833812625518554]
We introduce eye behaviors as an important emotional cue and construct an Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset. In the experiments, we introduce seven multimodal benchmark protocols for a variety of comprehensive evaluations of the EMER dataset. The results show that the EMERT outperforms other state-of-the-art multimodal methods by a large margin, revealing the importance of modeling eye behaviors for robust emotion recognition.
arXiv Detail & Related papers (2025-12-18T12:52:55Z)
- Spatial-Functional Awareness Transformer-based Graph Archetype Contrastive Learning for Decoding Visual Neural Representations from EEG [3.661246946935037]
We propose a Spatial-Functional Awareness Transformer-based Graph Archetype Contrastive Learning (SFTG) framework to enhance EEG-based visual decoding. Specifically, we introduce the EEG Graph Transformer (EGT), a novel graph-based neural architecture that simultaneously encodes spatial brain connectivity and temporal neural dynamics. To mitigate high intra-subject variability, we propose Graph Archetype Contrastive Learning (GAC), which learns subject-specific EEG graph archetypes to improve feature consistency and class separability.
arXiv Detail & Related papers (2025-09-29T13:27:55Z)
- CAST-Phys: Contactless Affective States Through Physiological signals Database [74.28082880875368]
The lack of affective multi-modal datasets remains a major bottleneck in developing accurate emotion recognition systems. We present the Contactless Affective States Through Physiological Signals Database (CAST-Phys), a novel high-quality dataset enabling remote physiological emotion recognition. Our analysis highlights the crucial role of physiological signals in realistic scenarios where facial expressions alone may not provide sufficient emotional information.
arXiv Detail & Related papers (2025-07-08T15:20:24Z)
- BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals [46.121056431476156]
This paper proposes BrainOmni, the first brain foundation model that generalises across heterogeneous EEG and MEG recordings. Existing approaches typically rely on separate, modality- and dataset-specific models, which limits performance and cross-domain scalability. A total of 1,997 hours of EEG and 656 hours of MEG data are curated and standardised from publicly available sources for pretraining.
arXiv Detail & Related papers (2025-05-18T14:07:14Z)
- CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information [61.1904164368732]
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains Modality Experts for each modality to extract cross-modal information from the EEG modality. The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities.
arXiv Detail & Related papers (2024-12-13T16:27:54Z)
- Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors [63.194053817609024]
We introduce eye behaviors as an important emotional cue for the creation of a new Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset.
For the first time, we provide annotations for both Emotion Recognition (ER) and Facial Expression Recognition (FER) in the EMER dataset.
We specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER.
arXiv Detail & Related papers (2024-11-08T04:53:55Z)
- A Unified Transformer-based Network for multimodal Emotion Recognition [4.07926531936425]
We present a transformer-based method to classify emotions in an arousal-valence space by combining a 2D representation of an ECG signal with face information.
Our model produces results comparable to state-of-the-art techniques.
arXiv Detail & Related papers (2023-08-27T17:30:56Z)
- Transformer-Based Self-Supervised Learning for Emotion Recognition [0.0]
We propose to use a Transformer-based model to process electrocardiograms (ECG) for emotion recognition.
To overcome the relatively small size of datasets with emotional labels, we employ self-supervised learning.
We show that our approach achieves state-of-the-art performance for emotion recognition using ECG signals on the AMIGOS dataset.
arXiv Detail & Related papers (2022-04-08T07:14:55Z)
- Progressive Graph Convolution Network for EEG Emotion Recognition [35.08010382523394]
Studies in the area of neuroscience have revealed the relationship between emotional patterns and brain functional regions.
In EEG emotion recognition, we can observe that clearer boundaries exist between coarse-grained emotions than those between fine-grained emotions.
We propose a progressive graph convolution network (PGCN) for capturing this inherent characteristic in EEG emotional signals.
arXiv Detail & Related papers (2021-12-14T03:30:13Z)
- Emotional EEG Classification using Connectivity Features and Convolutional Neural Networks [81.74442855155843]
We introduce a new classification system that utilizes brain connectivity with a CNN and validate its effectiveness via the emotional video classification.
The level of concentration of the brain connectivity related to the emotional property of the target video is correlated with classification performance.
arXiv Detail & Related papers (2021-01-18T13:28:08Z)
- Self-supervised ECG Representation Learning for Emotion Recognition [25.305949034527202]
We exploit a self-supervised deep multi-task learning framework for electrocardiogram (ECG)-based emotion recognition.
We show that the proposed solution considerably improves the performance compared to a network trained using fully-supervised learning.
arXiv Detail & Related papers (2020-02-04T17:15:37Z)
- Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)