Two-Stream Aural-Visual Affect Analysis in the Wild
- URL: http://arxiv.org/abs/2002.03399v2
- Date: Tue, 3 Mar 2020 13:59:01 GMT
- Title: Two-Stream Aural-Visual Affect Analysis in the Wild
- Authors: Felix Kuhnke, Lars Rumberg, Jörn Ostermann
- Abstract summary: We introduce our submission to the Affective Behavior Analysis in-the-wild (ABAW) 2020 competition.
We propose a two-stream aural-visual analysis model to recognize affective behavior from videos.
Our model achieves promising results on the challenging Aff-Wild2 database.
- Score: 2.578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human affect recognition is an essential part of natural human-computer
interaction. However, current methods are still in their infancy, especially
for in-the-wild data. In this work, we introduce our submission to the
Affective Behavior Analysis in-the-wild (ABAW) 2020 competition. We propose a
two-stream aural-visual analysis model to recognize affective behavior from
videos. Audio and image streams are first processed separately and fed into a
convolutional neural network. Instead of applying recurrent architectures for
temporal analysis we only use temporal convolutions. Furthermore, the model is
given access to additional features extracted during face-alignment. At
training time, we exploit correlations between different emotion
representations to improve performance. Our model achieves promising results on
the challenging Aff-Wild2 database.
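The pipeline described in the abstract (per-modality CNN encoders, temporal convolutions instead of recurrence, and joint heads for the different emotion representations) can be summarized in code. The following PyTorch sketch is illustrative only, not the authors' implementation; all module sizes, layer choices, and head dimensions (valence-arousal, expression, action units) are assumptions:
```python
import torch
import torch.nn as nn

class TwoStreamAffectModel(nn.Module):
    """Illustrative two-stream aural-visual model (not the authors' code).

    Each modality is encoded frame-by-frame by its own CNN; temporal
    context is modeled with 1D temporal convolutions rather than RNNs.
    Feature sizes and head dimensions are assumptions for illustration.
    """

    def __init__(self, feat_dim=512, num_expressions=7, num_aus=12):
        super().__init__()
        # Per-frame encoders (stand-ins for the real image/audio CNNs).
        self.visual_cnn = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim))
        self.audio_cnn = nn.Sequential(  # input: mel-spectrogram slices
            nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim))
        # Temporal convolutions over the fused per-frame features.
        self.temporal = nn.Sequential(
            nn.Conv1d(2 * feat_dim, feat_dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, kernel_size=5, padding=2),
            nn.ReLU())
        # One head per emotion representation (multi-task training).
        self.valence_arousal = nn.Linear(feat_dim, 2)
        self.expression = nn.Linear(feat_dim, num_expressions)
        self.action_units = nn.Linear(feat_dim, num_aus)

    def forward(self, frames, spectrograms):
        # frames: (B, T, 3, H, W); spectrograms: (B, T, 1, F, W')
        b, t = frames.shape[:2]
        v = self.visual_cnn(frames.flatten(0, 1)).view(b, t, -1)
        a = self.audio_cnn(spectrograms.flatten(0, 1)).view(b, t, -1)
        x = torch.cat([v, a], dim=-1).transpose(1, 2)  # (B, 2*feat, T)
        x = self.temporal(x).transpose(1, 2)           # (B, T, feat)
        return (self.valence_arousal(x), self.expression(x),
                self.action_units(x))
```
Replacing recurrent units with stacked temporal convolutions keeps the temporal receptive field explicit and lets the model run fully in parallel across time.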
Related papers
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- SUN Team's Contribution to ABAW 2024 Competition: Audio-visual Valence-Arousal Estimation and Expression Recognition [8.625751046347139]
This work investigates audiovisual deep learning approaches to the in-the-wild emotion recognition problem.
We particularly explore the effectiveness of architectures based on fine-tuned Convolutional Neural Networks (CNNs) and the Public Dimensional Emotion Model (PDEM).
We compare alternative temporal modeling and fusion strategies using the embeddings from these multi-stage trained modality-specific Deep Neural Networks (DNNs).
arXiv Detail & Related papers (2024-03-19T10:24:15Z)
- SS-VAERR: Self-Supervised Apparent Emotional Reaction Recognition from Video [61.21388780334379]
This work focuses on apparent emotional reaction recognition from video-only input, conducted in a self-supervised fashion.
The network is first pre-trained on different self-supervised pretext tasks and later fine-tuned on the downstream target task.
arXiv Detail & Related papers (2022-10-20T15:21:51Z)
- Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- An Ensemble Approach for Facial Expression Analysis in Video [5.363490780925308]
This paper introduces the Affective Behavior Analysis in-the-wild (ABAW3) 2022 challenge.
The paper focuses on solving the problems of valence-arousal estimation and action unit detection.
arXiv Detail & Related papers (2022-03-24T07:25:23Z)
- Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training we swap neural and behavioral data across animals that appear to be performing similar actions (see the sketch after this entry).
arXiv Detail & Related papers (2021-12-02T12:45:46Z)
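As a rough illustration of this swapping augmentation (not code from the paper), the sketch below exchanges the neural recordings of two animals wherever their behavioral segments score as similar; the similarity function, data layout, and index-aligned pairing are all simplifying assumptions:
```python
import numpy as np

def swap_similar_segments(neural, behavior, similarity, threshold=0.9,
                          rng=None):
    """Hypothetical sketch of the cross-animal swap augmentation.

    neural, behavior: lists (one entry per animal) of 2D arrays with one
    row per time segment. similarity(b1, b2) -> float scores how alike
    two behavioral segments are. Pairing segments by index is a
    simplification; the real method would match similar segments.
    """
    rng = rng or np.random.default_rng()
    n_animals = len(neural)
    for i in range(n_animals):
        for j in range(i + 1, n_animals):
            for s in range(min(len(behavior[i]), len(behavior[j]))):
                if (similarity(behavior[i][s], behavior[j][s]) > threshold
                        and rng.random() < 0.5):
                    # Re-pair the recordings so the encoder sees the same
                    # (apparent) action with another animal's signals.
                    neural[i][s], neural[j][s] = (neural[j][s].copy(),
                                                  neural[i][s].copy())
    return neural, behavior
```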
- Efficient Modelling Across Time of Human Actions and Interactions [92.39082696657874]
We argue that the fixed-size temporal kernels currently used in 3D convolutional neural networks (CNNs) can be improved to better deal with temporal variations in the input.
We study how to better handle variations between classes of actions by enhancing their feature differences over different layers of the architecture (a sketch follows this entry).
The proposed approaches are evaluated on several benchmark action recognition datasets and show competitive results.
arXiv Detail & Related papers (2021-10-05T15:39:11Z)
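One simple way to go beyond a single fixed temporal kernel size, in the spirit of this entry, is to run several temporal extents in parallel and concatenate the results. The block below is an illustrative sketch, not the paper's method:
```python
import torch
import torch.nn as nn

class MultiScaleTemporalConv(nn.Module):
    """Illustrative multi-scale temporal convolution (not the paper's
    method): parallel 3D convolutions with different temporal extents
    let one block respond to both fast and slow motion patterns.
    out_ch must be divisible by the number of temporal sizes."""

    def __init__(self, in_ch, out_ch, temporal_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv3d(in_ch, out_ch // len(temporal_sizes),
                      kernel_size=(t, 3, 3),
                      padding=(t // 2, 1, 1))  # "same" padding per branch
            for t in temporal_sizes)

    def forward(self, x):  # x: (B, C, T, H, W)
        return torch.cat([branch(x) for branch in self.branches], dim=1)
```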
- Prior Aided Streaming Network for Multi-task Affective Recognition at the 2nd ABAW2 Competition [9.188777864190204]
We introduce our submission to the 2nd Affective Behavior Analysis in-the-wild (ABAW2) Competition.
In dealing with different emotion representations, we propose a multi-task streaming network.
We leverage an advanced facial expression embedding as prior knowledge.
arXiv Detail & Related papers (2021-07-08T09:35:08Z)
- Multi-modal Affect Analysis using standardized data within subjects in the Wild [8.05417723395965]
We introduce an affective recognition method focusing on facial expression (EXP) and valence-arousal calculation.
Our proposed framework effectively improves estimation accuracy and robustness.
arXiv Detail & Related papers (2021-07-07T04:18:28Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory (LSTM) units, as well as inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning (see the sketch after this entry).
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
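The weight-inflation trick mentioned in this entry is concrete enough to sketch: a pre-trained 2D filter is repeated along a new temporal axis and rescaled so that a static input produces the same response. A minimal sketch following the common I3D-style recipe (details assumed):
```python
import torch

def inflate_conv2d_weight(w2d: torch.Tensor, time_dim: int) -> torch.Tensor:
    """Inflate a pre-trained 2D conv weight (out, in, kH, kW) into a 3D
    conv weight (out, in, kT, kH, kW) by repeating it along the temporal
    axis and dividing by kT, so the filter's response to a temporally
    constant input is unchanged."""
    w3d = w2d.unsqueeze(2).repeat(1, 1, time_dim, 1, 1)
    return w3d / time_dim

# Example: turn a ResNet-style 7x7 stem into a 3x7x7 spatiotemporal stem.
w2d = torch.randn(64, 3, 7, 7)        # stands in for pre-trained weights
w3d = inflate_conv2d_weight(w2d, 3)   # shape: (64, 3, 3, 7, 7)
```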
- Adversarial-based neural networks for affect estimations in the wild [3.3335236123901995]
In this work, we explore the use of latent features through our proposed adversarial-based networks for affect recognition in the wild.
Specifically, our models operate by aggregating several modalities at the discriminator input, where the discriminator is further conditioned on the latent features extracted by the generator (a sketch follows this entry).
Our experiments on the recently released SEWA dataset suggest progressive improvements in our results.
arXiv Detail & Related papers (2020-02-03T16:52:49Z)
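As a rough sketch of the aggregation described in this entry (not the authors' code; all names and dimensions are assumptions), a discriminator can concatenate the modality embeddings with the generator's latent features before scoring:
```python
import torch
import torch.nn as nn

class ConditionedDiscriminator(nn.Module):
    """Illustrative discriminator (not the authors' code) that
    aggregates multiple modality embeddings and is conditioned on
    latent features produced by the generator."""

    def __init__(self, audio_dim=128, visual_dim=256, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + visual_dim + latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1))  # real/fake score

    def forward(self, audio_emb, visual_emb, latent):
        # Aggregate modalities and condition on the generator's latents.
        x = torch.cat([audio_emb, visual_emb, latent], dim=-1)
        return self.net(x)
```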