Human Aesthetic Preference-Based Large Text-to-Image Model
Personalization: Kandinsky Generation as an Example
- URL: http://arxiv.org/abs/2402.06389v1
- Date: Fri, 9 Feb 2024 13:11:19 GMT
- Title: Human Aesthetic Preference-Based Large Text-to-Image Model
Personalization: Kandinsky Generation as an Example
- Authors: Aven-Le Zhou, Yu-Ao Wang, Wei Wu and Kang Zhang
- Abstract summary: This paper introduces a prompting-free generative approach that empowers users to automatically generate personalized painterly content.
By relying on the user's aesthetic evaluation of and preference for the artist model-generated images, this approach creates a personalized model for the user.
- Score: 4.744780823386797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the advancement of neural generative capabilities, the art community has
actively embraced GenAI (generative artificial intelligence) for creating
painterly content. Large text-to-image models can quickly generate
aesthetically pleasing outcomes. However, the process can be non-deterministic
and often involves tedious trial-and-error, as users struggle with formulating
effective prompts to achieve their desired results. This paper introduces a
prompting-free generative approach that empowers users to automatically
generate personalized painterly content that incorporates their aesthetic
preferences in a customized artistic style. This approach involves utilizing
"semantic injection" to customize an artist model in a specific artistic
style, and further leveraging a genetic algorithm to optimize the prompt
generation process through real-time iterative human feedback. By relying
solely on the user's aesthetic evaluations of and preferences for the images
generated by the artist model, this approach creates a personalized model for
the user that encompasses their aesthetic preferences and the customized
artistic style.
Related papers
- Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features [8.205321096201095]
Artistic inspiration plays a crucial role in producing works that resonate deeply with audiences.
This work proposes a novel framework for computationally modeling artistic preferences in different individuals.
Our framework outperforms an out-of-the-box LLaMA-3-70b, a state-of-the-art open-source language model, by nearly 18 points.
arXiv Detail & Related papers (2024-10-03T18:10:16Z)
- JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation [49.997839600988875]
Existing personalization methods rely on finetuning a text-to-image foundation model on a user's custom dataset.
We propose Joint-Image Diffusion (JeDi), an effective technique for learning a finetuning-free personalization model.
Our model achieves state-of-the-art generation quality, both quantitatively and qualitatively, significantly outperforming both the prior finetuning-based and finetuning-free personalization baselines.
arXiv Detail & Related papers (2024-07-08T17:59:02Z)
- CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build the innovative unified framework CreativeSynth, which is based on a diffusion model with the ability to coordinate multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- Impressions: Understanding Visual Semiotics and Aesthetic Impact [66.40617566253404]
We present Impressions, a novel dataset through which to investigate the semiotics of images.
We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images.
This dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
arXiv Detail & Related papers (2023-10-27T04:30:18Z)
- Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack [75.00066365801993]
Training text-to-image models with web scale image-text pairs enables the generation of a wide range of visual concepts from text.
These pre-trained models often face challenges when it comes to generating highly aesthetic images.
We propose quality-tuning to guide a pre-trained model to exclusively generate highly visually appealing images.
arXiv Detail & Related papers (2023-09-27T17:30:19Z)
- Inventing art styles with no artistic training data [0.65268245109828]
We propose two procedures to create painting styles using models trained only on natural images.
In the first procedure we use the inductive bias from the artistic medium to achieve creative expression.
The second procedure uses an additional natural image as inspiration to create a new style.
arXiv Detail & Related papers (2023-05-19T21:59:23Z)
- Learning to Evaluate the Artness of AI-generated Images [64.48229009396186]
ArtScore is a metric designed to evaluate the degree to which an image resembles authentic artworks by artists.
We employ pre-trained models for photo and artwork generation, resulting in a series of mixed models.
This dataset is then employed to train a neural network that learns to estimate quantized artness levels of arbitrary images.
arXiv Detail & Related papers (2023-05-08T17:58:27Z)
- Few-shots Portrait Generation with Style Enhancement and Identity Preservation [3.6937810031393123]
The StyleIdentityGAN model can ensure the identity and artistry of the generated portrait at the same time.
Style-enhanced module focuses on artistic style features decoupling and transferring to improve the artistry of generated virtual face images.
Experiments demonstrate the superiority of StyleIdentityGAN over state-of-the-art methods in artistry and identity effects.
arXiv Detail & Related papers (2023-03-01T10:02:12Z) - AesUST: Towards Aesthetic-Enhanced Universal Style Transfer [15.078430702469886]
AesUST is a novel Aesthetic-enhanced Universal Style Transfer approach.
We introduce an aesthetic discriminator to learn the universal human-delightful aesthetic features from a large corpus of artist-created paintings.
We also develop a new two-stage transfer training strategy with two aesthetic regularizations to train our model more effectively.
arXiv Detail & Related papers (2022-08-27T13:51:11Z) - User-Guided Personalized Image Aesthetic Assessment based on Deep
Reinforcement Learning [64.07820203919283]
We propose a novel user-guided personalized image aesthetic assessment framework.
It leverages user interactions to retouch and rank images for aesthetic assessment based on deep reinforcement learning (DRL).
It generates personalized aesthetic distribution that is more in line with the aesthetic preferences of different users.
arXiv Detail & Related papers (2021-06-14T15:19:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.