EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model
- URL: http://arxiv.org/abs/2501.05710v1
- Date: Fri, 10 Jan 2025 04:41:37 GMT
- Title: EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model
- Authors: Yi He, Shengqi Dang, Long Ling, Ziqing Qian, Nanxuan Zhao, Nan Cao,
- Abstract summary: We introduce the new task of continuous emotional image content generation (C-EICG)
We present EmotiCrafter, an emotional image generation model that generates images based on text prompts and Valence-Arousal values.
- Score: 23.26111054485357
- License:
- Abstract: Recent research shows that emotions can enhance users' cognition and influence information communication. While research on visual emotion analysis is extensive, limited work has been done on helping users generate emotionally rich image content. Existing work on emotional image generation relies on discrete emotion categories, making it challenging to capture complex and subtle emotional nuances accurately. Additionally, these methods struggle to control the specific content of generated images based on text prompts. In this work, we introduce the new task of continuous emotional image content generation (C-EICG) and present EmotiCrafter, an emotional image generation model that generates images based on text prompts and Valence-Arousal values. Specifically, we propose a novel emotion-embedding mapping network that embeds Valence-Arousal values into textual features, enabling the capture of specific emotions in alignment with intended input prompts. Additionally, we introduce a loss function to enhance emotion expression. The experimental results show that our method effectively generates images representing specific emotions with the desired content and outperforms existing techniques.
Related papers
- Emotional Images: Assessing Emotions in Images and Potential Biases in Generative Models [0.0]
This paper examines potential biases and inconsistencies in emotional evocation of images produced by generative artificial intelligence (AI) models.
We compare the emotions evoked by an AI-produced image to the emotions evoked by prompts used to create those images.
Findings indicate that AI-generated images frequently lean toward negative emotional content, regardless of the original prompt.
arXiv Detail & Related papers (2024-11-08T21:42:50Z) - Make Me Happier: Evoking Emotions Through Image Diffusion Models [36.40067582639123]
We present a novel challenge of emotion-evoked image generation, aiming to synthesize images that evoke target emotions while retaining the semantics and structures of the original scenes.
Due to the lack of emotion editing datasets, we provide a unique dataset consisting of 340,000 pairs of images and their emotion annotations.
arXiv Detail & Related papers (2024-03-13T05:13:17Z) - EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion
Models [11.901294654242376]
We introduce Emotional Image Content Generation (EICG), a new task to generate semantic-clear and emotion-faithful images given emotion categories.
Specifically, we propose an emotion space and construct a mapping network to align it with the powerful Contrastive Language-Image Pre-training (CLIP) space.
Our method outperforms the state-of-the-art text-to-image approaches both quantitatively and qualitatively.
arXiv Detail & Related papers (2024-01-09T15:23:21Z) - StyleEDL: Style-Guided High-order Attention Network for Image Emotion
Distribution Learning [69.06749934902464]
We propose a style-guided high-order attention network for image emotion distribution learning termed StyleEDL.
StyleEDL interactively learns stylistic-aware representations of images by exploring the hierarchical stylistic information of visual contents.
In addition, we introduce a stylistic graph convolutional network to dynamically generate the content-dependent emotion representations.
arXiv Detail & Related papers (2023-08-06T03:22:46Z) - EmoSet: A Large-scale Visual Emotion Dataset with Rich Attributes [53.95428298229396]
We introduce EmoSet, the first large-scale visual emotion dataset annotated with rich attributes.
EmoSet comprises 3.3 million images in total, with 118,102 of these images carefully labeled by human annotators.
Motivated by psychological studies, in addition to emotion category, each image is also annotated with a set of describable emotion attributes.
arXiv Detail & Related papers (2023-07-16T06:42:46Z) - High-Level Context Representation for Emotion Recognition in Images [4.987022981158291]
We propose an approach for high-level context representation extraction from images.
The model relies on a single cue and a single encoding stream to correlate this representation with emotions.
Our approach is more efficient than previous models and can be easily deployed to address real-world problems related to emotion recognition.
arXiv Detail & Related papers (2023-05-05T13:20:41Z) - ViNTER: Image Narrative Generation with Emotion-Arc-Aware Transformer [59.05857591535986]
We propose a model called ViNTER to generate image narratives that focus on time series representing varying emotions as "emotion arcs"
We present experimental results of both manual and automatic evaluations.
arXiv Detail & Related papers (2022-02-15T10:53:08Z) - SOLVER: Scene-Object Interrelated Visual Emotion Reasoning Network [83.27291945217424]
We propose a novel Scene-Object interreLated Visual Emotion Reasoning network (SOLVER) to predict emotions from images.
To mine the emotional relationships between distinct objects, we first build up an Emotion Graph based on semantic concepts and visual features.
We also design a Scene-Object Fusion Module to integrate scenes and objects, which exploits scene features to guide the fusion process of object features with the proposed scene-based attention mechanism.
arXiv Detail & Related papers (2021-10-24T02:41:41Z) - Enhancing Cognitive Models of Emotions with Representation Learning [58.2386408470585]
We present a novel deep learning-based framework to generate embedding representations of fine-grained emotions.
Our framework integrates a contextualized embedding encoder with a multi-head probing model.
Our model is evaluated on the Empathetic Dialogue dataset and shows the state-of-the-art result for classifying 32 emotions.
arXiv Detail & Related papers (2021-04-20T16:55:15Z) - Facial Expression Editing with Continuous Emotion Labels [76.36392210528105]
Deep generative models have achieved impressive results in the field of automated facial expression editing.
We propose a model that can be used to manipulate facial expressions in facial images according to continuous two-dimensional emotion labels.
arXiv Detail & Related papers (2020-06-22T13:03:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.