Affective Video Content Analysis: Decade Review and New Perspectives
- URL: http://arxiv.org/abs/2310.17212v2
- Date: Thu, 18 Jan 2024 07:37:32 GMT
- Title: Affective Video Content Analysis: Decade Review and New Perspectives
- Authors: Junxiao Xue, Jie Wang, Xuecheng Wu and Qian Zhang
- Abstract summary: Affective video content analysis (AVCA), as an essential branch of affective computing, has become a widely researched topic.
We introduce the widely used emotion representation models in AVCA and describe commonly used datasets.
We discuss future challenges and promising research directions, such as emotion recognition and public opinion analysis.
- Score: 4.3569033781023165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video content is rich in semantics and has the ability to evoke various
emotions in viewers. In recent years, with the rapid development of affective
computing and the explosive growth of visual data, affective video content
analysis (AVCA) as an essential branch of affective computing has become a
widely researched topic. In this study, we comprehensively review the
development of AVCA over the past decade, particularly focusing on the most
advanced methods adopted to address the three major challenges of video feature
extraction, expression subjectivity, and multimodal feature fusion. We first
introduce the widely used emotion representation models in AVCA and describe
commonly used datasets. We summarize and compare representative methods in the
following aspects: (1) unimodal AVCA models, including facial expression
recognition and posture emotion recognition; (2) multimodal AVCA models,
including feature fusion, decision fusion, and attention-based multimodal
models; (3) model performance evaluation standards. Finally, we discuss future
challenges and promising research directions, such as emotion recognition and
public opinion analysis, human-computer interaction, and emotional
intelligence.
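To make the fusion taxonomy in point (2) concrete, below is a minimal PyTorch sketch of the three strategies; the module names, dimensions, and the two-modality (visual + audio) setup are illustrative assumptions, not code from the survey or any surveyed method.
```python
import torch
import torch.nn as nn

# Illustrative sketch of the three multimodal fusion strategies the survey
# contrasts; all names and dimensions are assumptions, not the paper's code.

class FeatureFusion(nn.Module):
    """Feature (early) fusion: concatenate unimodal features, then classify."""
    def __init__(self, d_vis=512, d_aud=128, n_emotions=7):
        super().__init__()
        self.head = nn.Linear(d_vis + d_aud, n_emotions)

    def forward(self, vis, aud):
        return self.head(torch.cat([vis, aud], dim=-1))

class DecisionFusion(nn.Module):
    """Decision (late) fusion: classify each modality, then average logits."""
    def __init__(self, d_vis=512, d_aud=128, n_emotions=7):
        super().__init__()
        self.vis_head = nn.Linear(d_vis, n_emotions)
        self.aud_head = nn.Linear(d_aud, n_emotions)

    def forward(self, vis, aud):
        return 0.5 * (self.vis_head(vis) + self.aud_head(aud))

class AttentionFusion(nn.Module):
    """Attention-based fusion: one modality attends to the other."""
    def __init__(self, d_model=256, d_vis=512, d_aud=128, n_emotions=7):
        super().__init__()
        self.proj_vis = nn.Linear(d_vis, d_model)
        self.proj_aud = nn.Linear(d_aud, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, n_emotions)

    def forward(self, vis_seq, aud_seq):
        # Visual tokens query the audio tokens; pool the attended output.
        q, kv = self.proj_vis(vis_seq), self.proj_aud(aud_seq)
        fused, _ = self.attn(q, kv, kv)
        return self.head(fused.mean(dim=1))

vis, aud = torch.randn(4, 512), torch.randn(4, 128)                  # pooled features
vis_seq, aud_seq = torch.randn(4, 16, 512), torch.randn(4, 20, 128)  # token sequences
print(FeatureFusion()(vis, aud).shape)            # torch.Size([4, 7])
print(DecisionFusion()(vis, aud).shape)           # torch.Size([4, 7])
print(AttentionFusion()(vis_seq, aud_seq).shape)  # torch.Size([4, 7])
```
Real systems are far more elaborate, but these three skeletons capture where the fusion happens: at the feature level, at the decision level, or inside attention.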
Related papers
- MEMO-Bench: A Multiple Benchmark for Text-to-Image and Multimodal Large Language Models on Human Emotion Analysis [53.012111671763776]
This study introduces MEMO-Bench, a comprehensive benchmark consisting of 7,145 portraits, each depicting one of six different emotions.
Results demonstrate that existing T2I models are more effective at generating positive emotions than negative ones.
Although MLLMs show a certain degree of effectiveness in distinguishing and recognizing human emotions, they fall short of human-level accuracy.
arXiv Detail & Related papers (2024-11-18T02:09:48Z)
- A Novel Energy based Model Mechanism for Multi-modal Aspect-Based Sentiment Analysis [85.77557381023617]
We propose a novel framework called DQPSA for multi-modal sentiment analysis.
The PDQ module uses the prompt as both a visual query and a language query to extract prompt-aware visual information.
The EPE module models the boundary pairing of the analysis target from the perspective of an energy-based model.
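As a rough, hypothetical illustration of the "prompt as visual query" idea (DQPSA's actual architecture is not reproduced here), a prompt embedding can serve as the query in cross-attention over visual patch features:
```python
import torch
import torch.nn as nn

# Rough illustration only: a prompt embedding used as the query over visual
# features, in the spirit of "prompt as visual query". The dimensions and
# single-layer design are assumptions; this is not DQPSA's published code.

class PromptVisualQuery(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, prompt_emb, visual_feats):
        # prompt_emb: (B, Lp, D) prompt tokens; visual_feats: (B, Lv, D) patches.
        # The prompt selects prompt-relevant visual information via attention.
        out, weights = self.attn(prompt_emb, visual_feats, visual_feats)
        return out, weights

prompt = torch.randn(2, 8, 256)    # e.g. an encoded analysis prompt
patches = torch.randn(2, 49, 256)  # e.g. a 7x7 grid of image patch features
out, w = PromptVisualQuery()(prompt, patches)
print(out.shape, w.shape)  # torch.Size([2, 8, 256]) torch.Size([2, 8, 49])
```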
arXiv Detail & Related papers (2023-12-13T12:00:46Z)
- eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos [7.011656298079659]
The prevalence of short videos (SVs) makes emotion recognition in SVs a necessity.
Given the lack of SV emotion data, we introduce a large-scale dataset named eMotions, comprising 27,996 videos.
We present an end-to-end baseline method AV-CPNet that employs the video transformer to better learn semantically relevant representations.
arXiv Detail & Related papers (2023-11-29T03:24:30Z)
- A Survey on Video Diffusion Models [103.03565844371711]
The recent wave of AI-generated content (AIGC) has witnessed substantial success in computer vision.
Due to their impressive generative capabilities, diffusion models are gradually superseding methods based on GANs and auto-regressive Transformers.
This paper presents a comprehensive review of video diffusion models in the AIGC era.
arXiv Detail & Related papers (2023-10-16T17:59:28Z)
- How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios [73.24092762346095]
We introduce two large-scale datasets with over 60,000 videos annotated for emotional response and subjective wellbeing.
The Video Cognitive Empathy dataset contains annotations for distributions of fine-grained emotional responses, allowing models to gain a detailed understanding of affective states.
The Video to Valence dataset contains annotations of relative pleasantness between videos, which enables predicting a continuous spectrum of wellbeing.
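Relative-pleasantness labels lend themselves to pairwise ranking; the following is a minimal sketch of such a training objective, where the scorer, feature dimensions, and loss choice are assumptions rather than the paper's setup:
```python
import torch
import torch.nn as nn

# Minimal sketch of learning from "video A is more pleasant than video B"
# pairs with a margin ranking loss; the scorer and features are assumptions.

scorer = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 1))
rank_loss = nn.MarginRankingLoss(margin=0.2)

feats_a = torch.randn(8, 512)  # features of the videos judged more pleasant
feats_b = torch.randn(8, 512)  # features of the videos judged less pleasant
target = torch.ones(8)         # +1 means: score(a) should exceed score(b)

loss = rank_loss(scorer(feats_a).squeeze(-1),
                 scorer(feats_b).squeeze(-1),
                 target)
loss.backward()
print(float(loss))
```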
arXiv Detail & Related papers (2022-10-18T17:58:25Z)
- Use of Affective Visual Information for Summarization of Human-Centric Videos [13.273989782771556]
We investigate the affective-information-enriched supervised video summarization task for human-centric videos.
First, we train a visual input-driven state-of-the-art continuous emotion recognition model (CER-NET) on the RECOLA dataset to estimate emotional attributes.
Then, we integrate the estimated emotional attributes and the high-level representations from the CER-NET with the visual information to define the proposed affective video summarization architectures (AVSUM).
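As a rough sketch of that integration step (dimensions, the scoring head, and the top-k selection are illustrative assumptions, not the AVSUM code), per-frame emotional attributes can be concatenated with visual features before scoring frame importance:
```python
import torch
import torch.nn as nn

# Rough sketch of fusing estimated emotional attributes (e.g. arousal/valence
# per frame) with visual features to score frame importance for summarization.
# Dimensions and the scorer are illustrative assumptions, not the AVSUM code.

d_visual, d_emotion = 1024, 2             # e.g. CNN features + (arousal, valence)
scorer = nn.Sequential(
    nn.Linear(d_visual + d_emotion, 256),
    nn.ReLU(),
    nn.Linear(256, 1),
)

frames = torch.randn(1, 300, d_visual)    # visual features for 300 frames
emotion = torch.randn(1, 300, d_emotion)  # per-frame emotional attributes
importance = scorer(torch.cat([frames, emotion], dim=-1)).squeeze(-1)

# Pick the top 15% most important frames as the summary.
k = int(0.15 * frames.shape[1])
summary_idx = importance.topk(k, dim=1).indices.sort(dim=1).values
print(summary_idx.shape)  # torch.Size([1, 45])
```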
arXiv Detail & Related papers (2021-07-08T11:46:04Z)
- Prior Aided Streaming Network for Multi-task Affective Recognition at the 2nd ABAW2 Competition [9.188777864190204]
We introduce our submission to the 2nd Affective Behavior Analysis in-the-wild (ABAW2) Competition.
To handle different emotion representations, we propose a multi-task streaming network.
We leverage an advanced facial expression embedding as prior knowledge.
arXiv Detail & Related papers (2021-07-08T09:35:08Z)
- Affective Image Content Analysis: Two Decades Review and New Perspectives [132.889649256384]
We will comprehensively review the development of affective image content analysis (AICA) in the recent two decades.
We will focus on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence.
We discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
arXiv Detail & Related papers (2021-06-30T15:20:56Z)
- Computational Emotion Analysis From Images: Recent Advances and Future Directions [79.05003998727103]
In this chapter, we aim to introduce image emotion analysis (IEA) from a computational perspective.
We begin with commonly used emotion representation models from psychology.
We then define the key computational problems that the researchers have been trying to solve.
arXiv Detail & Related papers (2021-03-19T13:33:34Z)