Acted vs. Improvised: Domain Adaptation for Elicitation Approaches in
Audio-Visual Emotion Recognition
- URL: http://arxiv.org/abs/2104.01978v1
- Date: Mon, 5 Apr 2021 15:59:31 GMT
- Title: Acted vs. Improvised: Domain Adaptation for Elicitation Approaches in
Audio-Visual Emotion Recognition
- Authors: Haoqi Li, Yelin Kim, Cheng-Hao Kuo, Shrikanth Narayanan
- Abstract summary: Key challenges in developing generalized automatic emotion recognition systems include scarcity of labeled data and lack of gold-standard references.
In this work, we regard the emotion elicitation approach as domain knowledge, and explore domain transfer learning techniques on emotional utterances.
- Score: 29.916609743097215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Key challenges in developing generalized automatic emotion recognition
systems include scarcity of labeled data and lack of gold-standard references.
Even for the cues that are labeled as the same emotion category, the
variability of associated expressions can be high depending on the elicitation
context e.g., emotion elicited during improvised conversations vs. acted
sessions with predefined scripts. In this work, we regard the emotion
elicitation approach as domain knowledge, and explore domain transfer learning
techniques on emotional utterances collected under different emotion
elicitation approaches, particularly with limited labeled target samples. Our
emotion recognition model combines the gradient reversal technique with an
entropy loss function as well as the softlabel loss, and the experiment results
show that domain transfer learning methods can be employed to alleviate the
domain mismatch between different elicitation approaches. Our work provides new
insights into emotion data collection, particularly the impact of its
elicitation strategies, and the importance of domain adaptation in emotion
recognition aiming for generalized systems.
Related papers
- Self-supervised Gait-based Emotion Representation Learning from Selective Strongly Augmented Skeleton Sequences [4.740624855896404]
We propose a contrastive learning framework utilizing selective strong augmentation for self-supervised gait-based emotion representation.
Our approach is validated on the Emotion-Gait (E-Gait) and Emilya datasets and outperforms the state-of-the-art methods under different evaluation protocols.
arXiv Detail & Related papers (2024-05-08T09:13:10Z) - Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z) - Multimodal Emotion Recognition using Transfer Learning from Speaker
Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z) - Affect-DML: Context-Aware One-Shot Recognition of Human Affect using
Deep Metric Learning [29.262204241732565]
Existing methods assume that all emotions-of-interest are given a priori as annotated training examples.
We conceptualize one-shot recognition of emotions in context -- a new problem aimed at recognizing human affect states in finer particle level from a single support sample.
All variants of our model clearly outperform the random baseline, while leveraging the semantic scene context consistently improves the learnt representations.
arXiv Detail & Related papers (2021-11-30T10:35:20Z) - Cross Domain Emotion Recognition using Few Shot Knowledge Transfer [21.750633928464026]
Few-shot and zero-shot techniques can generalize across unseen emotions by projecting the documents and emotion labels onto a shared embedding space.
In this work, we explore the task of few-shot emotion recognition by transferring the knowledge gained from supervision on the GoEmotions Reddit dataset to the SemEval tweets corpus.
arXiv Detail & Related papers (2021-10-11T06:22:18Z) - Emotion Recognition from Multiple Modalities: Fundamentals and
Methodologies [106.62835060095532]
We discuss several key aspects of multi-modal emotion recognition (MER)
We begin with a brief introduction on widely used emotion representation models and affective modalities.
We then summarize existing emotion annotation strategies and corresponding computational tasks.
Finally, we outline several real-world applications and discuss some future directions.
arXiv Detail & Related papers (2021-08-18T21:55:20Z) - Emotion-aware Chat Machine: Automatic Emotional Response Generation for
Human-like Emotional Interaction [55.47134146639492]
This article proposes a unifed end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post.
Experiments on real-world data demonstrate that the proposed method outperforms the state-of-the-art methods in terms of both content coherence and emotion appropriateness.
arXiv Detail & Related papers (2021-06-06T06:26:15Z) - Target Guided Emotion Aware Chat Machine [58.8346820846765]
The consistency of a response to a given post at semantic-level and emotional-level is essential for a dialogue system to deliver human-like interactions.
This article proposes a unifed end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post.
arXiv Detail & Related papers (2020-11-15T01:55:37Z) - Temporal aggregation of audio-visual modalities for emotion recognition [0.5352699766206808]
We propose a multimodal fusion technique for emotion recognition based on combining audio-visual modalities from a temporal window with different temporal offsets for each modality.
Our proposed method outperforms other methods from the literature and human accuracy rating.
arXiv Detail & Related papers (2020-07-08T18:44:15Z) - Meta Transfer Learning for Emotion Recognition [42.61707533351803]
We propose a PathNet-based transfer learning method that is able to transfer emotional knowledge learned from one visual/audio emotion domain to another visual/audio emotion domain.
Our proposed system is capable of improving the performance of emotion recognition, making its performance substantially superior to the recent proposed fine-tuning/pre-trained models based transfer learning methods.
arXiv Detail & Related papers (2020-06-23T00:25:28Z) - Emotion Recognition From Gait Analyses: Current Research and Future
Directions [48.93172413752614]
gait conveys information about the walker's emotion.
The mapping between various emotions and gait patterns provides a new source for automated emotion recognition.
gait is remotely observable, more difficult to imitate, and requires less cooperation from the subject.
arXiv Detail & Related papers (2020-03-13T08:22:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.