Continuous Adversarial Text Representation Learning for Affective Recognition
- URL: http://arxiv.org/abs/2502.20613v1
- Date: Fri, 28 Feb 2025 00:29:09 GMT
- Title: Continuous Adversarial Text Representation Learning for Affective Recognition
- Authors: Seungah Son, Andrez Saurez, Dongsoo Har
- Abstract summary: We propose a novel framework for enhancing emotion-aware embeddings in transformer-based models. Our approach introduces a continuous valence-arousal labeling system to guide contrastive learning. We employ a dynamic token perturbation mechanism, using gradient-based saliency to focus on sentiment-relevant tokens, improving model sensitivity to emotional cues.
- Score: 1.319058156672392
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While pre-trained language models excel at semantic understanding, they often struggle to capture nuanced affective information critical for affective recognition tasks. To address these limitations, we propose a novel framework for enhancing emotion-aware embeddings in transformer-based models. Our approach introduces a continuous valence-arousal labeling system to guide contrastive learning, which captures subtle and multi-dimensional emotional nuances more effectively. Furthermore, we employ a dynamic token perturbation mechanism, using gradient-based saliency to focus on sentiment-relevant tokens, improving model sensitivity to emotional cues. The experimental results demonstrate that the proposed framework outperforms existing methods, achieving up to 15.5% improvement in the emotion classification benchmark, highlighting the importance of employing continuous labels. This improvement demonstrates that the proposed framework is effective in affective representation learning and enables precise and contextually relevant emotional understanding.
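As a rough illustration of the two mechanisms the abstract names, the sketch below implements gradient-based token saliency, saliency-guided token masking as the perturbation step, and a contrastive loss whose pair weights decay with distance in valence-arousal space. This is a minimal PyTorch sketch of the general technique, not the authors' code: the HuggingFace-style encoder interface, the CLS pooling, the Gaussian pair weighting, and the top-k masking strategy are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def token_saliency(encoder, classifier, input_ids, attention_mask, labels):
    # Saliency = norm of the loss gradient w.r.t. each token's input
    # embedding. `encoder` is assumed to be a HuggingFace-style model that
    # accepts `inputs_embeds`; `classifier` is any head producing logits.
    embeds = encoder.get_input_embeddings()(input_ids).detach().requires_grad_(True)
    hidden = encoder(inputs_embeds=embeds,
                     attention_mask=attention_mask).last_hidden_state
    loss = F.cross_entropy(classifier(hidden[:, 0]), labels)  # CLS pooling
    (grad,) = torch.autograd.grad(loss, embeds)
    return grad.norm(dim=-1)                                  # (B, T)

def perturb_salient_tokens(input_ids, saliency, mask_token_id, k=2):
    # "Dynamic token perturbation" (sketch): mask the k most
    # sentiment-relevant tokens to create a harder contrastive view.
    ids = input_ids.clone()
    ids.scatter_(1, saliency.topk(k, dim=1).indices, mask_token_id)
    return ids

def va_contrastive_loss(z, va, tau=0.1, sigma=1.0):
    # Soft supervised-contrastive loss: pairs whose continuous
    # valence-arousal labels are close receive larger positive weight
    # (Gaussian kernel on VA distance) instead of hard class positives.
    z = F.normalize(z, dim=-1)                                # (B, D)
    sim = z @ z.t() / tau                                     # (B, B)
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    log_p = sim - torch.logsumexp(
        sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True)
    w = torch.exp(-torch.cdist(va, va).pow(2) / (2 * sigma ** 2))
    w = w.masked_fill(self_mask, 0.0)                         # drop self-pairs
    return -(w * log_p).sum(1).div(w.sum(1).clamp_min(1e-8)).mean()
```

In use, one would encode both the original and the perturbed view of each utterance and apply `va_contrastive_loss` to the pooled embeddings of all views in a batch; the paper's exact perturbation and weighting schemes may differ, and the 15.5% figure refers to the authors' full framework, not this sketch.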
Related papers
- Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation [63.94836524433559]
DICE-Talk is a framework that disentangles identity from emotion and cooperates emotions sharing similar characteristics.
First, we develop a disentangled emotion embedder that jointly models audio-visual emotional cues through cross-modal attention.
Second, we introduce a correlation-enhanced emotion conditioning module with learnable Emotion Banks.
Third, we design an emotion discrimination objective that enforces affective consistency during the diffusion process.
arXiv Detail & Related papers (2025-04-25T05:28:21Z)
- Emotion Recognition with CLIP and Sequential Learning [5.66758879852618]
We present our innovative methodology for tackling the Valence-Arousal (VA) Estimation Challenge, the Expression Recognition Challenge, and the Action Unit (AU) Detection Challenge.
Our approach introduces a novel framework aimed at enhancing continuous emotion recognition.
arXiv Detail & Related papers (2025-03-13T01:02:06Z)
- Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation [59.81482518924723]
We propose a method for capturing and generating subtle shifts in emotion intensity for talking-head generation.
We develop a talking-head framework that is capable of generating a variety of emotions with precise control over intensity levels.
Experiments and analyses validate the effectiveness of our proposed method.
arXiv Detail & Related papers (2024-09-29T01:02:01Z)
- Self-supervised Gait-based Emotion Representation Learning from Selective Strongly Augmented Skeleton Sequences [4.740624855896404]
We propose a contrastive learning framework utilizing selective strong augmentation for self-supervised gait-based emotion representation.
Our approach is validated on the Emotion-Gait (E-Gait) and Emilya datasets and outperforms the state-of-the-art methods under different evaluation protocols.
arXiv Detail & Related papers (2024-05-08T09:13:10Z)
- ASEM: Enhancing Empathy in Chatbot through Attention-based Sentiment and Emotion Modeling [0.0]
We present a novel solution that employs a mixture of experts, i.e., multiple encoders, to offer distinct perspectives on the emotional state of the user's utterance.
We propose an end-to-end model architecture called ASEM that performs emotion analysis on top of sentiment analysis for open-domain chatbots (a generic mixture-of-encoders sketch follows this entry).
arXiv Detail & Related papers (2024-02-25T20:36:51Z)
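As a loose sketch of the mixture-of-experts idea above (a gated combination of several encoders' views, not ASEM's actual architecture), assuming each expert maps the same input to a fixed-size embedding:

```python
import torch
import torch.nn as nn

class MixtureOfEncoders(nn.Module):
    """Illustrative mixture of experts: several encoders read the same
    utterance and an attention-style gate weights their emotional views."""
    def __init__(self, encoders, dim):
        super().__init__()
        self.encoders = nn.ModuleList(encoders)  # each maps input -> (B, dim)
        self.gate = nn.Linear(dim, 1)            # scores each expert's view

    def forward(self, x):
        views = torch.stack([enc(x) for enc in self.encoders], dim=1)  # (B, E, dim)
        weights = torch.softmax(self.gate(views).squeeze(-1), dim=1)   # (B, E)
        return (weights.unsqueeze(-1) * views).sum(dim=1)              # (B, dim)
```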
- Unifying the Discrete and Continuous Emotion Labels for Speech Emotion Recognition [28.881092401807894]
In paralinguistic analysis for emotion detection from speech, emotions have been identified with discrete or dimensional (continuous-valued) labels.
We propose a model to jointly predict continuous and discrete emotional attributes (a minimal two-head sketch follows this entry).
arXiv Detail & Related papers (2022-10-29T16:12:31Z)
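A minimal sketch of joint discrete-and-continuous prediction in the usual multi-task setup (shared features, two heads, weighted loss); the head sizes and the `alpha` weighting are illustrative assumptions, not the cited model:

```python
import torch.nn as nn
import torch.nn.functional as F

class JointEmotionHead(nn.Module):
    """Shared representation feeding a discrete emotion classifier and a
    continuous valence-arousal regressor (illustrative)."""
    def __init__(self, dim, n_classes):
        super().__init__()
        self.cls_head = nn.Linear(dim, n_classes)  # discrete emotion classes
        self.reg_head = nn.Linear(dim, 2)          # valence, arousal

    def forward(self, h):                          # h: (B, dim) pooled features
        return self.cls_head(h), self.reg_head(h)

def joint_loss(logits, va_pred, y_cls, y_va, alpha=0.5):
    # Weighted sum of the classification and regression objectives.
    return F.cross_entropy(logits, y_cls) + alpha * F.mse_loss(va_pred, y_va)
```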
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities (a generic late-fusion sketch follows this entry).
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
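A generic late-fusion sketch, assuming each modality's model is already fine-tuned and emits class logits; the learned linear fusion layer is an illustrative choice, not necessarily the paper's:

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Combine per-modality class scores after each model has been
    trained independently (illustrative late fusion)."""
    def __init__(self, n_classes):
        super().__init__()
        self.fuse = nn.Linear(2 * n_classes, n_classes)

    def forward(self, speech_logits, text_logits):
        # Concatenate the two models' logits and re-score jointly.
        return self.fuse(torch.cat([speech_logits, text_logits], dim=-1))
```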
- MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition [118.73025093045652]
We propose a pre-training model, MEmoBERT, for multimodal emotion recognition.
Unlike the conventional "pre-train, finetune" paradigm, we propose a prompt-based method that reformulates the downstream emotion classification task as a masked text prediction task.
Our proposed MEmoBERT significantly enhances emotion recognition performance (a text-only prompt-classification sketch follows this entry).
arXiv Detail & Related papers (2021-10-27T09:57:00Z)
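The prompt-based reformulation can be illustrated text-only as below, assuming HuggingFace-style masked-LM and tokenizer objects; the template and one-word verbalizer are hypothetical, and MEmoBERT itself is multimodal, so this shows only the flavor of the idea:

```python
# Hypothetical verbalizer: one label word per emotion class.
VERBALIZER = {"happy": "happy", "sad": "sad", "angry": "angry"}

def prompt_classify(mlm, tokenizer, utterance):
    # Turn classification into masked text prediction: append a template
    # with a [MASK] slot and compare masked-LM scores of the label words.
    prompt = f"{utterance} I am feeling {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    logits = mlm(**inputs).logits[0, mask_pos]   # vocabulary scores at [MASK]
    label_ids = [tokenizer.convert_tokens_to_ids(w) for w in VERBALIZER.values()]
    return list(VERBALIZER)[logits[label_ids].argmax().item()]
```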
- Acted vs. Improvised: Domain Adaptation for Elicitation Approaches in Audio-Visual Emotion Recognition [29.916609743097215]
Key challenges in developing generalized automatic emotion recognition systems include scarcity of labeled data and lack of gold-standard references.
In this work, we regard the emotion elicitation approach as domain knowledge, and explore domain transfer learning techniques on emotional utterances.
arXiv Detail & Related papers (2021-04-05T15:59:31Z)
- Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability [82.39099867188547]
Emotional text-to-speech synthesis (ETTS) has seen much progress in recent years.
We propose a new interactive training paradigm for ETTS, denoted as i-ETTS.
We formulate an iterative training strategy with reinforcement learning to ensure the quality of i-ETTS optimization.
arXiv Detail & Related papers (2021-04-03T13:52:47Z)
- Spectrum-Guided Adversarial Disparity Learning [52.293230153385124]
We propose a novel end-to-end knowledge directed adversarial learning framework.
It portrays the class-conditioned intra-class disparity using two competitive encoding distributions and learns purified latent codes by denoising the learned disparity.
Experiments on four human activity recognition (HAR) benchmark datasets demonstrate the robustness and generalization of our proposed methods over a set of state-of-the-art baselines.
arXiv Detail & Related papers (2020-07-14T05:46:27Z)
- Meta Transfer Learning for Emotion Recognition [42.61707533351803]
We propose a PathNet-based transfer learning method that is able to transfer emotional knowledge learned from one visual/audio emotion domain to another visual/audio emotion domain.
Our proposed system improves emotion recognition performance, substantially outperforming recently proposed transfer learning methods based on fine-tuning pre-trained models.
arXiv Detail & Related papers (2020-06-23T00:25:28Z)
- Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.