Towards Deeper Emotional Reflection: Crafting Affective Image Filters with Generative Priors
- URL: http://arxiv.org/abs/2512.17376v1
- Date: Fri, 19 Dec 2025 09:24:22 GMT
- Title: Towards Deeper Emotional Reflection: Crafting Affective Image Filters with Generative Priors
- Authors: Peixuan Zhang, Shuchen Weng, Jiajun Tang, Si Li, Boxin Shi
- Abstract summary: Social media platforms enable users to express emotions by posting text with accompanying images. We propose the Affective Image Filter (AIF) task, which aims to reflect visually-abstract emotions from text into visually-concrete images.
- Score: 60.113589550498176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social media platforms enable users to express emotions by posting text with accompanying images. In this paper, we propose the Affective Image Filter (AIF) task, which aims to reflect visually-abstract emotions from text into visually-concrete images, thereby creating emotionally compelling results. We first introduce the AIF dataset and the formulation of the AIF models. Then, we present AIF-B as an initial attempt based on a multi-modal transformer architecture. After that, we propose AIF-D as an extension of AIF-B towards deeper emotional reflection, effectively leveraging generative priors from pre-trained large-scale diffusion models. Quantitative and qualitative experiments demonstrate that AIF models achieve superior performance for both content consistency and emotional fidelity compared to state-of-the-art methods. Extensive user study experiments demonstrate that AIF models are significantly more effective at evoking specific emotions. Based on the presented results, we comprehensively discuss the value and potential of AIF models.
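For intuition, the sketch below shows how a pre-trained text-to-image diffusion prior could be repurposed as an emotion-driven image filter in an image-to-image setting. It is a minimal illustration under assumed tooling (the diffusers library and a Stable Diffusion checkpoint), not the authors' AIF-B or AIF-D architecture; the prompt, strength, and guidance values are placeholders.

```python
# Minimal sketch (not the AIF-D model): restyle an image so it reflects the emotion
# expressed in an accompanying text, by leveraging the generative prior of a
# pre-trained large-scale diffusion model in image-to-image mode.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

content = Image.open("post_photo.jpg").convert("RGB").resize((512, 512))
emotion_prompt = "a quiet, melancholic evening; muted colors, soft fading light"  # placeholder text

filtered = pipe(
    prompt=emotion_prompt,
    image=content,
    strength=0.45,        # low strength preserves content consistency
    guidance_scale=7.5,   # steers the result toward the emotional prompt
).images[0]
filtered.save("affective_filter_result.jpg")
```

The trade-off between `strength` and `guidance_scale` mirrors the paper's stated goals of content consistency versus emotional fidelity.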
Related papers
- Bridging Human Evaluation to Infrared and Visible Image Fusion [54.71406895277533]
Infrared and visible image fusion (IVIF) integrates complementary modalities to enhance scene perception. Current methods predominantly focus on optimizing handcrafted losses and objective metrics, often resulting in fusion outcomes that do not align with human visual preferences. We propose a feedback reinforcement framework that bridges human evaluation to infrared and visible image fusion.
arXiv Detail & Related papers (2026-03-04T09:23:57Z)
- Smile on the Face, Sadness in the Eyes: Bridging the Emotion Gap with a Multimodal Dataset of Eye and Facial Behaviors [49.833812625518554]
We introduce eye behaviors as an important emotional cue and construct an Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset. In the experiments, we introduce seven multimodal benchmark protocols for a comprehensive evaluation of the EMER dataset. The results show that the EMERT outperforms other state-of-the-art multimodal methods by a large margin, revealing the importance of modeling eye behaviors for robust emotion recognition.
arXiv Detail & Related papers (2025-12-18T12:52:55Z)
- Analyzing Image Beyond Visual Aspect: Image Emotion Classification via Multiple-Affective Captioning [9.701754879957853]
We propose a novel Affective Captioning for Image Emotion Classification (ACIEC) method that classifies image emotion based purely on text. In our method, a hierarchical multi-level contrastive loss is designed for detecting emotional concepts from images, while an emotional chain-of-thought reasoning is proposed to generate affective sentences. Our method can effectively bridge the affective gap and achieves superior results on multiple benchmarks.
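As a rough illustration of the contrastive idea mentioned above, the snippet below sketches a single-level InfoNCE-style loss that pulls image features toward the text embeddings of their emotional concepts; the paper's hierarchical multi-level loss is more elaborate, and the function name and shapes here are illustrative assumptions.

```python
# Hypothetical single-level contrastive loss between image features and
# emotion-concept text embeddings (InfoNCE style); not the paper's exact
# hierarchical multi-level formulation.
import torch
import torch.nn.functional as F

def concept_contrastive_loss(image_feats, concept_feats, temperature=0.07):
    """image_feats, concept_feats: (B, D) embeddings of matched image/concept pairs."""
    image_feats = F.normalize(image_feats, dim=-1)
    concept_feats = F.normalize(concept_feats, dim=-1)
    logits = image_feats @ concept_feats.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(image_feats.size(0), device=logits.device)
    # Symmetric cross-entropy: each image should match its own emotional concept.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```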
arXiv Detail & Related papers (2025-11-28T11:57:39Z)
- Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors [63.194053817609024]
We introduce eye behaviors as an important emotional cue for the creation of a new Eye-behavior-aided Multimodal Emotion Recognition dataset.
For the first time, we provide annotations for both Emotion Recognition (ER) and Facial Expression Recognition (FER) in the EMER dataset.
We specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER.
arXiv Detail & Related papers (2024-11-08T04:53:55Z)
- Adaptive Mixed-Scale Feature Fusion Network for Blind AI-Generated Image Quality Assessment [13.998206803073481]
We propose a novel blind image quality assessment (IQA) network, named AMFF-Net, for AGIs.
AMFF-Net scales the image up and down and takes the scaled images and original-sized image as the inputs to obtain multi-scale features.
We carry out extensive experiments on three AGI quality assessment databases, and the experimental results show that our AMFF-Net obtains better performance than nine state-of-the-art blind IQA methods.
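The multi-scale input scheme described above can be pictured with the following sketch, which rescales the image, encodes each scale with a shared backbone, and fuses the per-scale features with learned weights; the backbone, scale factors, and module names are assumptions, not the actual AMFF-Net.

```python
# Illustrative multi-scale feature fusion for blind IQA (not the real AMFF-Net).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

class MultiScaleIQANet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # pooled (B, 512, 1, 1) features
        self.scale_weights = nn.Parameter(torch.ones(3))               # adaptive fusion weights
        self.head = nn.Linear(512, 1)                                  # quality-score regressor

    def forward(self, x):
        # Down-scaled, original-sized, and up-scaled versions of the input image.
        scales = [F.interpolate(x, scale_factor=s, mode="bilinear", align_corners=False)
                  for s in (0.5, 1.0, 2.0)]
        feats = [self.encoder(s).flatten(1) for s in scales]           # one (B, 512) vector per scale
        w = torch.softmax(self.scale_weights, dim=0)
        fused = sum(wi * fi for wi, fi in zip(w, feats))
        return self.head(fused)                                        # predicted quality score
```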
arXiv Detail & Related papers (2024-04-23T16:02:33Z)
- Music Recommendation Based on Facial Emotion Recognition [0.0]
This paper presents a comprehensive approach to enhancing the user experience through the integration of emotion recognition, music recommendation, and explainable AI using Grad-CAM.
The proposed methodology utilizes a ResNet50 model trained on the Facial Expression Recognition dataset, consisting of real images of individuals expressing various emotions.
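A minimal version of that pipeline might look like the sketch below: a ResNet50 emotion classifier feeds a hand-made emotion-to-playlist mapping (the Grad-CAM explanation step is omitted). The checkpoint path, label set, and playlists are placeholders rather than the paper's assets.

```python
# Hypothetical emotion-to-music pipeline: classify the facial emotion with a
# fine-tuned ResNet50, then look up a playlist for that emotion.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
PLAYLISTS = {"happy": "upbeat_pop", "sad": "soft_acoustic", "angry": "calm_classical",
             "fear": "ambient", "neutral": "lofi", "surprise": "dance", "disgust": "jazz"}

model = models.resnet50(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, len(EMOTIONS))
model.load_state_dict(torch.load("fer_resnet50.pt", map_location="cpu"))  # placeholder checkpoint
model.eval()

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor(),
                        T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

def recommend(face_path):
    """Return the predicted emotion and a matching playlist for a face image."""
    x = preprocess(Image.open(face_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        emotion = EMOTIONS[model(x).argmax(1).item()]
    return emotion, PLAYLISTS[emotion]
```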
arXiv Detail & Related papers (2024-04-06T15:14:25Z)
- Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers [120.49126407479717]
This paper explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR).
We highlight a pivotal discovery: the capacity of text-to-image diffusion models to seamlessly bridge the gap between sketches and photos.
arXiv Detail & Related papers (2024-03-12T00:02:03Z)
- High-Level Context Representation for Emotion Recognition in Images [4.987022981158291]
We propose an approach for high-level context representation extraction from images.
The model relies on a single cue and a single encoding stream to correlate this representation with emotions.
Our approach is more efficient than previous models and can be easily deployed to address real-world problems related to emotion recognition.
arXiv Detail & Related papers (2023-05-05T13:20:41Z)
- Affective Image Content Analysis: Two Decades Review and New Perspectives [132.889649256384]
We will comprehensively review the development of affective image content analysis (AICA) over the past two decades.
We will focus on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence.
We discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
arXiv Detail & Related papers (2021-06-30T15:20:56Z)
- Interpretable Image Emotion Recognition: A Domain Adaptation Approach Using Facial Expressions [11.808447247077902]
This paper proposes a feature-based domain adaptation technique for identifying emotions in generic images. It addresses the challenge of the limited availability of pre-trained models and well-annotated datasets for Image Emotion Recognition (IER). The proposed IER system demonstrated emotion classification accuracies of 61.86% for the IAPSa dataset, 62.47% for the ArtPhoto dataset, 70.78% for the FI dataset, and 59.72% for the EMOTIC dataset.
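As a hedged illustration of the feature-based adaptation idea, the snippet below shows an MMD penalty that aligns source (facial-expression) and target (generic-image) feature distributions; the paper's actual adaptation objective may differ, and the kernel choice here is an assumption.

```python
# Illustrative MMD-based domain adaptation term (RBF kernel); not necessarily the
# exact objective used in the paper.
import torch

def rbf_mmd(source_feats, target_feats, sigma=1.0):
    """Maximum Mean Discrepancy between source and target feature batches of shape (B, D)."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return (k(source_feats, source_feats).mean()
            + k(target_feats, target_feats).mean()
            - 2 * k(source_feats, target_feats).mean())

# A training objective would typically combine a source classification loss with
# lambda * rbf_mmd(source_features, target_features).
```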
arXiv Detail & Related papers (2020-11-17T02:55:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.