EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model
- URL: http://arxiv.org/abs/2501.05710v1
- Date: Fri, 10 Jan 2025 04:41:37 GMT
- Title: EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model
- Authors: Yi He, Shengqi Dang, Long Ling, Ziqing Qian, Nanxuan Zhao, Nan Cao
- Abstract summary: We introduce the new task of continuous emotional image content generation (C-EICG). We present EmotiCrafter, an emotional image generation model that generates images based on text prompts and Valence-Arousal values.
- Score: 23.26111054485357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research shows that emotions can enhance users' cognition and influence information communication. While research on visual emotion analysis is extensive, limited work has been done on helping users generate emotionally rich image content. Existing work on emotional image generation relies on discrete emotion categories, making it challenging to capture complex and subtle emotional nuances accurately. Additionally, these methods struggle to control the specific content of generated images based on text prompts. In this work, we introduce the new task of continuous emotional image content generation (C-EICG) and present EmotiCrafter, an emotional image generation model that generates images based on text prompts and Valence-Arousal values. Specifically, we propose a novel emotion-embedding mapping network that embeds Valence-Arousal values into textual features, enabling the capture of specific emotions in alignment with intended input prompts. Additionally, we introduce a loss function to enhance emotion expression. The experimental results show that our method effectively generates images representing specific emotions with the desired content and outperforms existing techniques.
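The abstract does not spell out how the emotion-embedding mapping network is implemented. Purely as an illustrative sketch, it could be realized as a small MLP that shifts prompt embeddings by a Valence-Arousal-conditioned offset; all module names, dimensions, and the additive injection below are assumptions, not the authors' code.
```python
# Hypothetical sketch of a Valence-Arousal conditioning network in the
# spirit of EmotiCrafter; shapes and the additive injection are assumptions.
import torch
import torch.nn as nn

class VAEmbeddingMapper(nn.Module):
    """Maps a (valence, arousal) pair into offsets applied to text features."""
    def __init__(self, text_dim: int = 768, hidden_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, text_dim),
        )

    def forward(self, text_feats: torch.Tensor, va: torch.Tensor) -> torch.Tensor:
        # text_feats: (batch, seq_len, text_dim) prompt embeddings
        # va: (batch, 2) valence-arousal values, e.g. in [-1, 1]
        offset = self.mlp(va).unsqueeze(1)   # (batch, 1, text_dim)
        return text_feats + offset           # emotion-shifted text features

# Usage: shift text-encoder output before feeding it to a diffusion model.
mapper = VAEmbeddingMapper()
text_feats = torch.randn(4, 77, 768)         # stand-in for encoder output
va = torch.tensor([[0.8, 0.3]] * 4)          # high valence, mild arousal
conditioned = mapper(text_feats, va)
```
The emotion-expression loss the abstract mentions would then penalize mismatch between the generated image's emotional content and the target Valence-Arousal values; its exact form is not given here.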
Related papers
- Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation [63.94836524433559]
DICE-Talk is a framework that disentangles identity from emotion and lets emotions with similar characteristics cooperate.
First, we develop a disentangled emotion embedder that jointly models audio-visual emotional cues through cross-modal attention.
Second, we introduce a correlation-enhanced emotion conditioning module with learnable Emotion Banks.
Third, we design an emotion discrimination objective that enforces affective consistency during the diffusion process.
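The summary gives only the module names; a minimal, purely hypothetical sketch of a learnable Emotion Bank queried by a cross-modal emotion feature might look like this (prototype count, dimensions, and the softmax retrieval are assumptions):
```python
# Hypothetical "Emotion Bank": a learnable table of emotion prototypes
# queried by an audio-visual emotion feature; sizes are illustrative only.
import torch
import torch.nn as nn

class EmotionBank(nn.Module):
    def __init__(self, num_prototypes: int = 8, dim: int = 256):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, dim))

    def forward(self, emotion_feat: torch.Tensor) -> torch.Tensor:
        # emotion_feat: (batch, dim) from a cross-modal emotion embedder
        attn = torch.softmax(emotion_feat @ self.prototypes.T, dim=-1)  # (batch, K)
        # Mix correlated prototypes so similar emotions share structure.
        return attn @ self.prototypes                                   # (batch, dim)
```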
arXiv Detail & Related papers (2025-04-25T05:28:21Z)
- EmoSEM: Segment and Explain Emotion Stimuli in Visual Art [25.539022846134543]
This paper focuses on a key challenge in visual art understanding: given an art image, the model pinpoints pixel regions that trigger a specific human emotion.
Despite recent advances in art understanding, pixel-level emotion understanding still faces a dual challenge.
This paper proposes the Emotion stimuli and Explanation Model (EmoSEM) to endow the segmentation model SAM with emotion comprehension capability.
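How EmoSEM attaches emotion comprehension to SAM is not detailed in this summary; as a toy sketch under that uncertainty, one could pool per-mask features from a frozen segmentation backbone and classify them with a light emotion head (the feature interface below is a stand-in, not the actual SAM API):
```python
# Toy sketch: a lightweight emotion head over pooled per-mask features
# from a frozen segmentation backbone. Dimensions are assumptions.
import torch
import torch.nn as nn

class EmotionSegHead(nn.Module):
    def __init__(self, feat_dim: int = 256, num_emotions: int = 8):
        super().__init__()
        self.classifier = nn.Linear(feat_dim, num_emotions)

    def forward(self, mask_feats: torch.Tensor) -> torch.Tensor:
        # mask_feats: (batch, num_masks, feat_dim) pooled per-mask features
        return self.classifier(mask_feats)   # per-mask emotion logits
```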
arXiv Detail & Related papers (2025-04-20T15:40:00Z)
- Emotional Images: Assessing Emotions in Images and Potential Biases in Generative Models [0.0]
This paper examines potential biases and inconsistencies in emotional evocation of images produced by generative artificial intelligence (AI) models. We compare the emotions evoked by an AI-produced image to the emotions evoked by prompts used to create those images. Findings indicate that AI-generated images frequently lean toward negative emotional content, regardless of the original prompt.
arXiv Detail & Related papers (2024-11-08T21:42:50Z)
- EmoEdit: Evoking Emotions through Image Manipulation [62.416345095776656]
Affective Image Manipulation (AIM) seeks to modify user-provided images to evoke specific emotional responses.
We introduce EmoEdit, which extends AIM by incorporating content modifications to enhance emotional impact.
Our method is evaluated both qualitatively and quantitatively, demonstrating superior performance compared to existing state-of-the-art techniques.
arXiv Detail & Related papers (2024-05-21T10:18:45Z)
- Make Me Happier: Evoking Emotions Through Image Diffusion Models [36.40067582639123]
We present a novel challenge of emotion-evoked image generation, aiming to synthesize images that evoke target emotions while retaining the semantics and structures of the original scenes.
Due to the lack of emotion editing datasets, we provide a unique dataset consisting of 340,000 pairs of images and their emotion annotations.
arXiv Detail & Related papers (2024-03-13T05:13:17Z)
- EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models [11.901294654242376]
We introduce Emotional Image Content Generation (EICG), a new task to generate semantically clear and emotion-faithful images given emotion categories.
Specifically, we propose an emotion space and construct a mapping network to align it with the powerful Contrastive Language-Image Pre-training (CLIP) space.
Our method outperforms the state-of-the-art text-to-image approaches both quantitatively and qualitatively.
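As a hedged illustration of the described mapping network (the emotion-space dimensionality and loss form are assumptions, not the EmoGen implementation), aligning learned emotion embeddings with CLIP features could be sketched as:
```python
# Hypothetical sketch: map a learned emotion space into the CLIP embedding
# space and align them with a cosine loss. Sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmotionToCLIP(nn.Module):
    def __init__(self, num_emotions: int = 8, clip_dim: int = 512):
        super().__init__()
        self.emotion_embed = nn.Embedding(num_emotions, 256)
        self.mapper = nn.Sequential(nn.Linear(256, 512), nn.GELU(),
                                    nn.Linear(512, clip_dim))

    def forward(self, emotion_ids: torch.Tensor) -> torch.Tensor:
        return self.mapper(self.emotion_embed(emotion_ids))

def alignment_loss(mapped: torch.Tensor, clip_feats: torch.Tensor) -> torch.Tensor:
    # Pull mapped emotion vectors toward matching CLIP image/text features.
    return 1.0 - F.cosine_similarity(mapped, clip_feats, dim=-1).mean()
```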
arXiv Detail & Related papers (2024-01-09T15:23:21Z)
- EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes [53.95428298229396]
We introduce EmoSet, the first large-scale visual emotion dataset annotated with rich attributes.
EmoSet comprises 3.3 million images in total, with 118,102 of these images carefully labeled by human annotators.
Motivated by psychological studies, in addition to emotion category, each image is also annotated with a set of describable emotion attributes.
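The exact annotation schema is not reproduced here; an illustrative record layout for such attribute-rich labels, with hypothetical field names, might be:
```python
# Illustrative record layout for an EmoSet-style annotation; the field
# names below are assumptions, not the dataset's published schema.
from dataclasses import dataclass, field

@dataclass
class EmoSetRecord:
    image_path: str
    emotion: str                                     # e.g. one emotion category
    attributes: dict = field(default_factory=dict)   # describable emotion attributes

record = EmoSetRecord(
    image_path="images/000001.jpg",
    emotion="awe",
    attributes={"brightness": "high", "scene": "mountain"},
)
```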
arXiv Detail & Related papers (2023-07-16T06:42:46Z)
- ViNTER: Image Narrative Generation with Emotion-Arc-Aware Transformer [59.05857591535986]
We propose a model called ViNTER to generate image narratives that focus on time series of varying emotions, represented as "emotion arcs".
We present experimental results of both manual and automatic evaluations.
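The summary leaves the arc encoding unspecified; one toy way to represent an emotion arc as conditioning input, with an assumed emotion vocabulary, is:
```python
# Toy illustration of an "emotion arc" as an ordered sequence of emotion
# tokens conditioning a generator; vocabulary and sizes are assumptions.
import torch
import torch.nn as nn

EMOTIONS = ["joy", "neutral", "sadness", "anger", "fear"]
emotion_to_id = {e: i for i, e in enumerate(EMOTIONS)}

arc = ["joy", "neutral", "sadness"]                        # beginning -> end
arc_ids = torch.tensor([[emotion_to_id[e] for e in arc]])  # (1, 3)

arc_embed = nn.Embedding(len(EMOTIONS), 128)
arc_feats = arc_embed(arc_ids)   # (1, 3, 128): prepend to the narrative tokens
```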
arXiv Detail & Related papers (2022-02-15T10:53:08Z)
- SOLVER: Scene-Object Interrelated Visual Emotion Reasoning Network [83.27291945217424]
We propose a novel Scene-Object interreLated Visual Emotion Reasoning network (SOLVER) to predict emotions from images.
To mine the emotional relationships between distinct objects, we first build an Emotion Graph based on semantic concepts and visual features.
We also design a Scene-Object Fusion Module that integrates scenes and objects, exploiting scene features to guide the fusion of object features through a scene-based attention mechanism.
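Purely as an illustrative sketch of the described scene-guided fusion (dimensions and the attention form are assumptions, not the SOLVER implementation):
```python
# Hypothetical sketch of scene-guided attention over object features, in
# the spirit of a Scene-Object Fusion Module; shapes are illustrative.
import torch
import torch.nn as nn

class SceneObjectFusion(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.query = nn.Linear(dim, dim)   # scene feature -> attention query
        self.key = nn.Linear(dim, dim)     # object features -> attention keys

    def forward(self, scene_feat: torch.Tensor, obj_feats: torch.Tensor) -> torch.Tensor:
        # scene_feat: (batch, dim); obj_feats: (batch, num_objects, dim)
        q = self.query(scene_feat).unsqueeze(1)                  # (batch, 1, dim)
        scores = torch.softmax((q * self.key(obj_feats)).sum(-1), dim=-1)
        fused_objs = (scores.unsqueeze(-1) * obj_feats).sum(1)   # (batch, dim)
        return fused_objs + scene_feat                           # scene-object fusion
```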
arXiv Detail & Related papers (2021-10-24T02:41:41Z)
- Enhancing Cognitive Models of Emotions with Representation Learning [58.2386408470585]
We present a novel deep learning-based framework to generate embedding representations of fine-grained emotions.
Our framework integrates a contextualized embedding encoder with a multi-head probing model.
Our model is evaluated on the Empathetic Dialogue dataset and shows the state-of-the-art result for classifying 32 emotions.
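A rough sketch of a multi-head probing model over a contextualized encoder, sized for the 32-emotion setting mentioned above (the pooling scheme and head count are assumptions):
```python
# Rough sketch: multiple linear probes over a pooled contextualized
# embedding, averaged into one prediction. The encoder is a placeholder.
import torch
import torch.nn as nn

class MultiHeadProbe(nn.Module):
    def __init__(self, enc_dim: int = 768, num_heads: int = 4, num_emotions: int = 32):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Linear(enc_dim, num_emotions) for _ in range(num_heads)]
        )

    def forward(self, utterance_feat: torch.Tensor) -> torch.Tensor:
        # utterance_feat: (batch, enc_dim), e.g. a pooled BERT embedding
        logits = torch.stack([h(utterance_feat) for h in self.heads], dim=1)
        return logits.mean(dim=1)   # (batch, num_emotions) averaged prediction
```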
arXiv Detail & Related papers (2021-04-20T16:55:15Z)
- Facial Expression Editing with Continuous Emotion Labels [76.36392210528105]
Deep generative models have achieved impressive results in the field of automated facial expression editing.
We propose a model that can be used to manipulate facial expressions in facial images according to continuous two-dimensional emotion labels.
arXiv Detail & Related papers (2020-06-22T13:03:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.