Human Aesthetic Preference-Based Large Text-to-Image Model
Personalization: Kandinsky Generation as an Example
- URL: http://arxiv.org/abs/2402.06389v1
- Date: Fri, 9 Feb 2024 13:11:19 GMT
- Title: Human Aesthetic Preference-Based Large Text-to-Image Model
Personalization: Kandinsky Generation as an Example
- Authors: Aven-Le Zhou, Yu-Ao Wang, Wei Wu and Kang Zhang
- Abstract summary: This paper introduces a prompting-free generative approach that empowers users to automatically generate personalized painterly content.
By relying on the user's aesthetic evaluation of and preference for the artist model-generated images, this approach creates a personalized model for the user.
- Score: 4.744780823386797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the advancement of neural generative capabilities, the art community has
actively embraced GenAI (generative artificial intelligence) for creating
painterly content. Large text-to-image models can quickly generate
aesthetically pleasing outcomes. However, the process can be non-deterministic
and often involves tedious trial-and-error, as users struggle with formulating
effective prompts to achieve their desired results. This paper introduces a
prompting-free generative approach that empowers users to automatically
generate personalized painterly content that incorporates their aesthetic
preferences in a customized artistic style. This approach involves utilizing
"semantic injection" to customize an artist model in a specific artistic
style, and further leveraging a genetic algorithm to optimize the prompt
generation process through real-time iterative human feedback. By relying
solely on the user's aesthetic evaluations of and preferences for the images
generated by the artist model, this approach creates a personalized model for
the user that encompasses their aesthetic preferences and the customized
artistic style.
Related papers
- Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features [8.205321096201095]
Artistic inspiration plays a crucial role in producing works that resonate deeply with audiences.
This work proposes a novel framework for computationally modeling artistic preferences in different individuals.
Our framework outperforms an out-of-the-box LLaMA-3-70b, a state-of-the-art open-source language model, by nearly 18 points.
arXiv Detail & Related papers (2024-10-03T18:10:16Z)
- JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation [49.997839600988875]
Existing personalization methods rely on finetuning a text-to-image foundation model on a user's custom dataset.
We propose Joint-Image Diffusion (JeDi), an effective technique for learning a finetuning-free personalization model.
Our model achieves state-of-the-art generation quality, both quantitatively and qualitatively, significantly outperforming both the prior finetuning-based and finetuning-free personalization baselines.
arXiv Detail & Related papers (2024-07-08T17:59:02Z)
- CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build the innovative unified framework CreativeSynth, which is based on a diffusion model with the ability to coordinate multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- Impressions: Understanding Visual Semiotics and Aesthetic Impact [66.40617566253404]
We present Impressions, a novel dataset through which to investigate the semiotics of images.
We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images.
This dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
arXiv Detail & Related papers (2023-10-27T04:30:18Z)
- Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack [75.00066365801993]
Training text-to-image models with web scale image-text pairs enables the generation of a wide range of visual concepts from text.
These pre-trained models often face challenges when it comes to generating highly aesthetic images.
We propose quality-tuning to guide a pre-trained model to exclusively generate highly visually appealing images.
arXiv Detail & Related papers (2023-09-27T17:30:19Z)
- Inventing art styles with no artistic training data [0.65268245109828]
We propose two procedures to create painting styles using models trained only on natural images.
In the first procedure we use the inductive bias from the artistic medium to achieve creative expression.
The second procedure uses an additional natural image as inspiration to create a new style.
arXiv Detail & Related papers (2023-05-19T21:59:23Z)
- Learning to Evaluate the Artness of AI-generated Images [64.48229009396186]
ArtScore is a metric designed to evaluate the degree to which an image resembles authentic artworks by artists.
We employ pre-trained models for photo and artwork generation, resulting in a series of mixed models.
This dataset is then employed to train a neural network that learns to estimate quantized artness levels of arbitrary images.
arXiv Detail & Related papers (2023-05-08T17:58:27Z)
- Few-shots Portrait Generation with Style Enhancement and Identity Preservation [3.6937810031393123]
The StyleIdentityGAN model can ensure the identity and artistry of the generated portrait at the same time.
Style-enhanced module focuses on artistic style features decoupling and transferring to improve the artistry of generated virtual face images.
Experiments demonstrate the superiority of StyleIdentityGAN over state-of-the-art methods in artistry and identity effects.
arXiv Detail & Related papers (2023-03-01T10:02:12Z) - AesUST: Towards Aesthetic-Enhanced Universal Style Transfer [15.078430702469886]
AesUST is a novel Aesthetic-enhanced Universal Style Transfer approach.
We introduce an aesthetic discriminator to learn the universal human-delightful aesthetic features from a large corpus of artist-created paintings.
We also develop a new two-stage transfer training strategy with two aesthetic regularizations to train our model more effectively.
arXiv Detail & Related papers (2022-08-27T13:51:11Z) - User-Guided Personalized Image Aesthetic Assessment based on Deep
Reinforcement Learning [64.07820203919283]
We propose a novel user-guided personalized image aesthetic assessment framework.
It leverages user interactions to retouch and rank images for aesthetic assessment based on deep reinforcement learning (DRL).
It generates personalized aesthetic distribution that is more in line with the aesthetic preferences of different users.
arXiv Detail & Related papers (2021-06-14T15:19:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.