A Method for Training-free Person Image Picture Generation
- URL: http://arxiv.org/abs/2305.09817v1
- Date: Tue, 16 May 2023 21:46:28 GMT
- Title: A Method for Training-free Person Image Picture Generation
- Authors: Tianyu Chen
- Abstract summary: A Character Image Feature Encoder model is proposed in this paper.
It enables the user to guide generation simply by providing a picture of the character, so that the character in the generated image matches the expectation.
The proposed model can be conveniently incorporated into the Stable Diffusion generation process without modifying the model's ontology, or used in combination with Stable Diffusion as a joint model.
- Score: 4.043367784553845
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The current state-of-the-art Diffusion model has demonstrated excellent
results in generating images. However, the generated images are monotonous and largely
reflect the distribution of person images in the training set, making it
challenging to generate multiple varied images of a given individual.
This problem can often only be solved by fine-tuning the model. This means
that a separate training run is required for each individual or animated
character to be drawn, and the hardware and cost of such training are often
beyond the reach of average users, who make up the largest group. To
solve this problem, the Character Image Feature Encoder model proposed in this
paper enables the user to drive the process simply by providing a picture of the
character, so that the character in the generated image matches the
expectation. In addition, various details can be adjusted during the process
using prompts. Unlike traditional Image-to-Image models, the Character Image
Feature Encoder extracts only the image features relevant to the character, rather than
information about the subject's composition or movements. In addition, the
Character Image Feature Encoder can be adapted to different models after
training. The proposed model can be conveniently incorporated into the Stable
Diffusion generation process without modifying the model's ontology, or used in
combination with Stable Diffusion as a joint model.
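The paper's abstract does not include an implementation, but the following Python sketch illustrates one plausible way an image feature encoder of this kind could be attached to a Stable Diffusion pipeline without touching the base model's weights, assuming a recent version of the diffusers and transformers libraries. Here a frozen CLIP vision model extracts appearance features from the reference picture, a small projection layer stands in for the trained Character Image Feature Encoder, and the resulting feature tokens are appended to the prompt embeddings fed to the UNet's cross-attention. All module names, dimensions, and the projection layer are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch: condition Stable Diffusion on a reference character image
# by appending projected image features to the text conditioning sequence.
import torch
from PIL import Image
from transformers import CLIPVisionModel, CLIPImageProcessor
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen CLIP vision backbone extracts appearance features from the character picture.
image_encoder = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14").to(device)
image_processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")

# Placeholder projection into the SD 1.x text-embedding space (768 dims).
# In the paper this role is played by the trained Character Image Feature Encoder;
# here it is an untrained layer shown only to make the data flow concrete.
feature_proj = torch.nn.Linear(1024, 768).to(device)

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)

@torch.no_grad()
def encode_character(image_path: str) -> torch.Tensor:
    """Return a (1, N, 768) sequence of character feature tokens."""
    image = Image.open(image_path).convert("RGB")
    pixels = image_processor(images=image, return_tensors="pt").pixel_values.to(device)
    tokens = image_encoder(pixels).last_hidden_state   # (1, 257, 1024) for ViT-L/14
    return feature_proj(tokens)                        # (1, 257, 768)

@torch.no_grad()
def generate(prompt: str, character_image: str) -> Image.Image:
    # Encode the text prompt with the pipeline's own text encoder.
    text_emb, negative_emb = pipe.encode_prompt(
        prompt, device=device, num_images_per_prompt=1, do_classifier_free_guidance=True
    )
    char_emb = encode_character(character_image)
    # Append character tokens to the conditioning; the UNet itself is unchanged,
    # which mirrors the claim of not modifying the base model.
    cond = torch.cat([text_emb, char_emb], dim=1)
    uncond = torch.cat([negative_emb, torch.zeros_like(char_emb)], dim=1)
    return pipe(prompt_embeds=cond, negative_prompt_embeds=uncond).images[0]

# Example: generate("a knight standing in a forest", "my_character.png")
```

The key design point this sketch tries to convey is that the encoder only supplies extra conditioning tokens, so it can in principle be reused with different base models, as the abstract claims; producing faithful characters would of course require training the projection as described in the paper.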
Related papers
- JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation [49.997839600988875]
Existing personalization methods rely on finetuning a text-to-image foundation model on a user's custom dataset.
We propose Joint-Image Diffusion (JeDi), an effective technique for learning a finetuning-free personalization model.
Our model achieves state-of-the-art generation quality, both quantitatively and qualitatively, significantly outperforming both the prior finetuning-based and finetuning-free personalization baselines.
arXiv Detail & Related papers (2024-07-08T17:59:02Z) - Conditional Diffusion on Web-Scale Image Pairs leads to Diverse Image Variations [32.892042877725125]
Current image variation techniques involve adapting a text-to-image model to reconstruct an input image conditioned on the same image.
We show that a diffusion model trained to reconstruct an input image from frozen embeddings can reconstruct the image with minor variations.
We propose a new pretraining strategy to generate image variations using a large collection of image pairs.
arXiv Detail & Related papers (2024-05-23T17:58:03Z) - Evaluating Data Attribution for Text-to-Image Models [62.844382063780365]
We evaluate attribution through "customization" methods, which tune an existing large-scale model toward a given exemplar object or style.
Our key insight is that this allows us to efficiently create synthetic images that are computationally influenced by the exemplar by construction.
By taking into account the inherent uncertainty of the problem, we can assign soft attribution scores over a set of training images.
arXiv Detail & Related papers (2023-06-15T17:59:51Z) - Identity Encoder for Personalized Diffusion [57.1198884486401]
We propose an encoder-based approach for personalization.
We learn an identity encoder which can extract an identity representation from a set of reference images of a subject.
We show that our approach consistently outperforms existing fine-tuning based approaches in both image generation and reconstruction.
arXiv Detail & Related papers (2023-04-14T23:32:24Z) - BlendGAN: Learning and Blending the Internal Distributions of Single Images by Spatial Image-Identity Conditioning [37.21764919074815]
Single image generative methods are designed to learn the internal patch distribution of a single natural image at multiple scales.
We introduce an extended framework, which makes it possible to simultaneously learn the internal distributions of several images.
Our BlendGAN opens the door to applications that are not supported by single-image models.
arXiv Detail & Related papers (2022-12-03T10:38:27Z) - Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z) - EdiBERT, a generative model for image editing [12.605607949417033]
EdiBERT is a bi-directional transformer trained in the discrete latent space built by a vector-quantized auto-encoder.
We show that the resulting model matches state-of-the-art performances on a wide variety of tasks.
arXiv Detail & Related papers (2021-11-30T10:23:06Z) - Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework in which a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z) - Generating Person Images with Appearance-aware Pose Stylizer [66.44220388377596]
We present a novel end-to-end framework to generate realistic person images based on given person poses and appearances.
The core of our framework is a novel generator called Appearance-aware Pose Stylizer (APS) which generates human images by coupling the target pose with the conditioned person appearance progressively.
arXiv Detail & Related papers (2020-07-17T15:58:05Z)