LIPE: Learning Personalized Identity Prior for Non-rigid Image Editing
- URL: http://arxiv.org/abs/2406.17236v1
- Date: Tue, 25 Jun 2024 02:56:16 GMT
- Title: LIPE: Learning Personalized Identity Prior for Non-rigid Image Editing
- Authors: Aoyang Liu, Qingnan Fan, Shuai Qin, Hong Gu, Yansong Tang
- Abstract summary: LIPE is a two-stage framework that customizes a generative model using a limited set of images of the same subject and then employs the model with the learned prior for non-rigid image editing.
- Score: 20.861672583434718
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although recent years have witnessed significant advancements in image editing thanks to the remarkable progress of text-to-image diffusion models, non-rigid image editing still presents complexities and challenges. Existing methods often fail to achieve consistent results due to the absence of unique identity characteristics. Thus, learning a personalized identity prior might help with consistency in the edited results. In this paper, we explore a novel task: learning the personalized identity prior for text-based non-rigid image editing. To address the problems in jointly learning the prior and editing the image, we present LIPE, a two-stage framework designed to customize the generative model utilizing a limited set of images of the same subject, and subsequently employ the model with the learned prior for non-rigid image editing. Experimental results demonstrate the advantages of our approach in various editing scenarios over prior leading methods, both qualitatively and quantitatively.
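As a rough, illustrative sketch of the two-stage recipe described in the abstract (not the authors' actual objective or editing procedure), the snippet below personalizes a Stable Diffusion UNet on a few subject images with a plain denoising loss and an assumed placeholder token `<sks>`, then reuses the personalized pipeline for a non-rigid edit prompt. The model name, token, and prompts are assumptions for illustration only.

```python
# Minimal sketch of the two-stage idea (NOT the LIPE implementation):
# Stage 1 learns an identity prior by fine-tuning on a few subject images;
# Stage 2 reuses the personalized model for a non-rigid edit.
import torch
from diffusers import StableDiffusionPipeline, DDPMScheduler

model_id = "runwayml/stable-diffusion-v1-5"          # assumed base model
pipe = StableDiffusionPipeline.from_pretrained(model_id)
sched = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")
opt = torch.optim.AdamW(pipe.unet.parameters(), lr=1e-5)

def stage1_step(latents, prompt="a photo of <sks> person"):
    """One personalization step on VAE latents of a subject image."""
    noise = torch.randn_like(latents)
    t = torch.randint(0, sched.config.num_train_timesteps, (latents.shape[0],))
    noisy = sched.add_noise(latents, noise, t)
    ids = pipe.tokenizer(prompt, return_tensors="pt", padding="max_length",
                         max_length=pipe.tokenizer.model_max_length).input_ids
    text_emb = pipe.text_encoder(ids)[0]
    pred = pipe.unet(noisy, t, encoder_hidden_states=text_emb).sample
    loss = torch.nn.functional.mse_loss(pred, noise)
    loss.backward(); opt.step(); opt.zero_grad()
    return loss.item()

# Stage 2: the learned identity prior keeps the subject consistent while the
# prompt requests a non-rigid (pose/expression) change.
edited = pipe("a photo of <sks> person jumping with both arms raised").images[0]
```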
Related papers
- Learning Feature-Preserving Portrait Editing from Generated Pairs [11.122956539965761]
We propose a training-based method leveraging auto-generated paired data to learn desired editing.
Our method achieves state-of-the-art quality, quantitatively and qualitatively.
arXiv Detail & Related papers (2024-07-29T23:19:42Z) - JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation [49.997839600988875]
Existing personalization methods rely on finetuning a text-to-image foundation model on a user's custom dataset.
We propose Joint-Image Diffusion (JeDi), an effective technique for learning a finetuning-free personalization model.
Our model achieves state-of-the-art generation quality, both quantitatively and qualitatively, significantly outperforming both the prior finetuning-based and finetuning-free personalization baselines.
arXiv Detail & Related papers (2024-07-08T17:59:02Z) - MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation [59.13765130528232]
We present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both faithful identity fidelity and flexible editability.
Specifically, MasterWeaver adopts an encoder to extract identity features and steers the image generation through additional introduced cross attention.
To improve editability while maintaining identity fidelity, we propose an editing direction loss for training, which aligns the editing directions of our MasterWeaver with those of the original T2I model.
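Purely as an illustration of what an editing-direction alignment term of this flavor could look like (the loss form, feature extractors, and cosine distance below are assumptions, not MasterWeaver's exact formulation):

```python
import torch.nn.functional as F

def editing_direction_loss(feats_personalized, feats_original, emb_base, emb_edit):
    """Encourage the personalized model to respond to a prompt edit in the
    same direction as the frozen original T2I model.
    feats_*: callables mapping a text embedding to intermediate features."""
    d_personal = feats_personalized(emb_edit) - feats_personalized(emb_base)
    d_original = feats_original(emb_edit) - feats_original(emb_base)
    cos = F.cosine_similarity(d_personal.flatten(1), d_original.flatten(1), dim=1)
    return (1.0 - cos).mean()
```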
arXiv Detail & Related papers (2024-05-09T14:42:16Z) - Instilling Multi-round Thinking to Text-guided Image Generation [72.2032630115201]
Single-round generation often overlooks crucial details, particularly in the realm of fine-grained changes like shoes or sleeves.
We introduce a new self-supervised regularization, i.e., multi-round regularization, which is compatible with existing methods.
It builds upon the observation that the modification order generally should not affect the final result.
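A minimal sketch of how that order-invariance observation can be turned into a regularizer, assuming a generic differentiable `edit(image, instruction)` operator and a simple L1 penalty (the paper's actual formulation may differ):

```python
import torch

def order_invariance_penalty(edit, image, instr_a, instr_b):
    """Applying edit A then edit B should match applying B then A."""
    out_ab = edit(edit(image, instr_a), instr_b)
    out_ba = edit(edit(image, instr_b), instr_a)
    return torch.mean(torch.abs(out_ab - out_ba))
```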
arXiv Detail & Related papers (2024-01-16T16:19:58Z) - AdapEdit: Spatio-Temporal Guided Adaptive Editing Algorithm for Text-Based Continuity-Sensitive Image Editing [24.9487669818162]
We propose a spatio-temporal guided adaptive editing algorithm, AdapEdit, which realizes adaptive image editing.
Our approach has a significant advantage in preserving model priors and does not require model training, fine-tuning, extra data, or optimization.
We present our results over a wide variety of raw images and editing instructions, demonstrating competitive performance and showing it significantly outperforms the previous approaches.
arXiv Detail & Related papers (2023-12-13T09:45:58Z) - PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models [19.519789922033034]
PhotoVerse is an innovative methodology that incorporates a dual-branch conditioning mechanism in both text and image domains.
After a single training phase, our approach enables generating high-quality images within only a few seconds.
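A rough sketch of what a dual-branch conditioner could look like, with an image branch projected into extra conditioning tokens alongside the prompt tokens; the dimensions, token count, and concatenation strategy are assumptions rather than PhotoVerse's actual design:

```python
import torch
import torch.nn as nn

class DualBranchConditioner(nn.Module):
    """Project a reference-image embedding into pseudo text tokens and
    concatenate them with the prompt tokens fed to cross-attention."""
    def __init__(self, img_dim=1024, txt_dim=768, n_img_tokens=4):
        super().__init__()
        self.proj = nn.Linear(img_dim, txt_dim * n_img_tokens)
        self.n_img_tokens, self.txt_dim = n_img_tokens, txt_dim

    def forward(self, text_tokens, image_embedding):
        # text_tokens: (B, 77, txt_dim); image_embedding: (B, img_dim)
        img_tokens = self.proj(image_embedding).view(-1, self.n_img_tokens, self.txt_dim)
        return torch.cat([text_tokens, img_tokens], dim=1)
```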
arXiv Detail & Related papers (2023-09-11T19:59:43Z) - Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model [22.975965453227477]
We introduce a new framework called Paste, Inpaint and Harmonize via Denoising (PhD).
In our experiments, we apply PhD to both subject-driven image editing tasks and explore text-driven scene generation given a reference subject.
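Illustration only: the paste-then-harmonize idea can be approximated with an off-the-shelf inpainting pipeline, as sketched below; the file names, coordinates, and prompt are placeholders, and the paper's PhD framework uses its own components rather than this shortcut.

```python
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

scene = Image.open("scene.png").convert("RGB").resize((512, 512))
subject = Image.open("subject.png").convert("RGBA").resize((160, 160))

# Paste: composite the reference subject into the target region.
scene.paste(subject, (180, 200), subject)

# Inpaint + harmonize via denoising around the pasted region so the subject
# blends with the background lighting and style.
mask = Image.new("L", scene.size, 0)
mask.paste(255, (170, 190, 350, 370))
pipe = StableDiffusionInpaintPipeline.from_pretrained("runwayml/stable-diffusion-inpainting")
result = pipe(prompt="a photo of the subject in the scene",
              image=scene, mask_image=mask).images[0]
```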
arXiv Detail & Related papers (2023-06-13T07:43:10Z) - Cones 2: Customizable Image Synthesis with Multiple Subjects [50.54010141032032]
We study how to efficiently represent a particular subject as well as how to appropriately compose different subjects.
By rectifying the activations in the cross-attention map, the layout designates and separates the locations of different subjects in the image.
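A small sketch of one plausible rectification rule (boost a subject's token inside its layout box and suppress it elsewhere, then renormalize); the exact rule Cones 2 uses is an assumption here:

```python
import torch

def rectify_cross_attention(attn, token_idx, region_mask, boost=1.5, suppress=0.25):
    """attn: (B, heads, H*W, n_tokens) cross-attention weights.
    region_mask: (H*W,) bool mask of the layout box assigned to one subject."""
    attn = attn.clone()
    attn[:, :, region_mask, token_idx] *= boost      # strengthen inside the box
    attn[:, :, ~region_mask, token_idx] *= suppress  # weaken everywhere else
    return attn / attn.sum(dim=-1, keepdim=True)     # renormalize over tokens
```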
arXiv Detail & Related papers (2023-05-30T18:00:06Z) - End-to-End Visual Editing with a Generatively Pre-Trained Artist [78.5922562526874]
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change.
We propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain.
We show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes required to the model architecture.
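As a toy sketch of the edit-simulation idea (the actual augmentation set and training setup in the paper differ; the crop box and jitter parameters below are placeholders):

```python
from PIL import Image
import torchvision.transforms as T

def simulate_edit_triplet(image_path, box=(64, 64, 192, 192)):
    """Build a self-supervised (source, driver, target) triplet from one image:
    the original is the source, an augmented crop is the driver specifying the
    desired change, and the source with the crop pasted back is the target."""
    source = Image.open(image_path).convert("RGB").resize((256, 256))
    driver = T.ColorJitter(brightness=0.5, hue=0.2)(source.crop(box))
    target = source.copy()
    target.paste(driver, box[:2])
    return source, driver, target
```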
arXiv Detail & Related papers (2022-05-03T17:59:30Z) - Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
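The underlying latent-space-navigation edit can be sketched in a couple of lines; the generator `G`, the learned attribute direction, and the strength value are hypothetical:

```python
import torch

def edit_latent(w, direction, strength=1.0):
    """Move a latent code along a learned, (ideally) disentangled attribute
    direction; negative strength reverses the edit."""
    return w + strength * direction / direction.norm()

# Hypothetical usage with a pre-trained generator G and a learned direction:
# edited_image = G(edit_latent(w, d_smile, strength=2.0))
```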
arXiv Detail & Related papers (2021-02-01T21:38:36Z)