Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent
Diffusion Models for Virtual Try-All
- URL: http://arxiv.org/abs/2401.13795v1
- Date: Wed, 24 Jan 2024 20:25:48 GMT
- Title: Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent
Diffusion Models for Virtual Try-All
- Authors: Mehmet Saygin Seyfioglu, Karim Bouyarmane, Suren Kumar, Amir Tavanaei,
Ismail B. Tutar
- Abstract summary: "Diffuse to Choose" is a novel diffusion-based inpainting model that efficiently balances fast inference with the retention of high-fidelity details.
We conduct extensive testing on both in-house and publicly available datasets, and show that Diffuse to Choose is superior to existing zero-shot diffusion inpainting methods.
- Score: 4.191273360964305
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: As online shopping is growing, the ability for buyers to virtually visualize
products in their settings-a phenomenon we define as "Virtual Try-All"-has
become crucial. Recent diffusion models inherently contain a world model,
rendering them suitable for this task within an inpainting context. However,
traditional image-conditioned diffusion models often fail to capture the
fine-grained details of products. In contrast, personalization-driven models
such as DreamPaint are good at preserving the item's details but they are not
optimized for real-time applications. We present "Diffuse to Choose," a novel
diffusion-based image-conditioned inpainting model that efficiently balances
fast inference with the retention of high-fidelity details in a given reference
item while ensuring accurate semantic manipulations in the given scene content.
Our approach is based on incorporating fine-grained features from the reference
image directly into the latent feature maps of the main diffusion model,
alongside with a perceptual loss to further preserve the reference item's
details. We conduct extensive testing on both in-house and publicly available
datasets, and show that Diffuse to Choose is superior to existing zero-shot
diffusion inpainting methods as well as few-shot diffusion personalization
algorithms like DreamPaint.
Related papers
- DiffBrush:Just Painting the Art by Your Hands [20.025612157376138]
Current AI painting ecosystem predominantly relies on text-driven diffusion models (T2I)
We introduce DiffBrush, which is compatible with T2I models and allows users to draw and edit images.
DiffBrush achieves control over the color, semantic, and instance of objects in images by continuously guiding the latent and instance-level attention map.
arXiv Detail & Related papers (2025-02-28T10:01:39Z) - PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z) - Improving Virtual Try-On with Garment-focused Diffusion Models [91.95830983115474]
Diffusion models have led to the revolutionizing of generative modeling in numerous image synthesis tasks.
We shape a new Diffusion model, namely GarDiff, which triggers the garment-focused diffusion process.
Experiments on VITON-HD and DressCode datasets demonstrate the superiority of our GarDiff when compared to state-of-the-art VTON approaches.
arXiv Detail & Related papers (2024-09-12T17:55:11Z) - ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z) - ZoDi: Zero-Shot Domain Adaptation with Diffusion-Based Image Transfer [13.956618446530559]
This paper proposes a zero-shot domain adaptation method based on diffusion models, called ZoDi.
First, we utilize an off-the-shelf diffusion model to synthesize target-like images by transferring the domain of source images to the target domain.
Secondly, we train the model using both source images and synthesized images with the original representations to learn domain-robust representations.
arXiv Detail & Related papers (2024-03-20T14:58:09Z) - Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z) - Prompt-Propose-Verify: A Reliable Hand-Object-Interaction Data
Generation Framework using Foundational Models [0.0]
Diffusion models when conditioned on text prompts, generate realistic-looking images with intricate details.
But most of these pre-trained models fail to generate accurate images when it comes to human features like hands, teeth, etc.
arXiv Detail & Related papers (2023-12-23T12:59:22Z) - Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional
Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z) - Taming the Power of Diffusion Models for High-Quality Virtual Try-On
with Appearance Flow [24.187109053871833]
Virtual try-on is a critical image synthesis task that aims to transfer clothes from one image to another while preserving the details of both humans and clothes.
We propose an exemplar-based inpainting approach that leverages a warping module to guide the diffusion model's generation effectively.
Our approach, namely Diffusion-based Conditional Inpainting for Virtual Try-ON (DCI-VTON), effectively utilizes the power of the diffusion model.
arXiv Detail & Related papers (2023-08-11T12:23:09Z) - Generating images of rare concepts using pre-trained diffusion models [32.5337654536764]
Text-to-image diffusion models can synthesize high-quality images, but they have various limitations.
We show that their limitation is partly due to the long-tail nature of their training data.
We show that rare concepts can be correctly generated by carefully selecting suitable generation seeds in the noise space.
arXiv Detail & Related papers (2023-04-27T20:55:38Z) - Diffusion Art or Digital Forgery? Investigating Data Replication in
Diffusion Models [53.03978584040557]
We study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated.
Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication.
arXiv Detail & Related papers (2022-12-07T18:58:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.