Color Conditional Generation with Sliced Wasserstein Guidance
- URL: http://arxiv.org/abs/2503.19034v1
- Date: Mon, 24 Mar 2025 18:06:03 GMT
- Title: Color Conditional Generation with Sliced Wasserstein Guidance
- Authors: Alexander Lobashev, Maria Larchenko, Dmitry Guskov,
- Abstract summary: SW-Guidance is a training-free approach for image generation conditioned on the color distribution of a reference image.<n>Our method outperforms state-of-the-art techniques for color-conditional generation in terms of color similarity to the reference.
- Score: 44.99833362998488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose SW-Guidance, a training-free approach for image generation conditioned on the color distribution of a reference image. While it is possible to generate an image with fixed colors by first creating an image from a text prompt and then applying a color style transfer method, this approach often results in semantically meaningless colors in the generated image. Our method solves this problem by modifying the sampling process of a diffusion model to incorporate the differentiable Sliced 1-Wasserstein distance between the color distribution of the generated image and the reference palette. Our method outperforms state-of-the-art techniques for color-conditional generation in terms of color similarity to the reference, producing images that not only match the reference colors but also maintain semantic coherence with the original text prompt. Our source code is available at https://github.com/alobashev/sw-guidance/.
Related papers
- Free-Lunch Color-Texture Disentanglement for Stylized Image Generation [58.406368812760256]
This paper introduces the first tuning-free approach to achieve free-lunch color-texture disentanglement in stylized T2I generation.<n>We develop techniques for separating and extracting Color-Texture Embeddings (CTE) from individual color and texture reference images.<n>To ensure that the color palette of the generated image aligns closely with the color reference, we apply a whitening and coloring transformation.
arXiv Detail & Related papers (2025-03-18T14:10:43Z) - Training-free Color-Style Disentanglement for Constrained Text-to-Image Synthesis [16.634138745034733]
We present the first training-free, test-time-only method to disentangle and condition text-to-image models on color and style attributes from reference image.
arXiv Detail & Related papers (2024-09-04T04:16:58Z) - Palette-based Color Transfer between Images [9.471264982229508]
We propose a new palette-based color transfer method that can automatically generate a new color scheme.
With a redesigned palette-based clustering method, pixels can be classified into different segments according to color distribution.
Our method exhibits significant advantages over peer methods in terms of natural realism, color consistency, generality, and robustness.
arXiv Detail & Related papers (2024-05-14T01:41:19Z) - Automatic Controllable Colorization via Imagination [55.489416987587305]
We propose a framework for automatic colorization that allows for iterative editing and modifications.
By understanding the content within a grayscale image, we utilize a pre-trained image generation model to generate multiple images that contain the same content.
These images serve as references for coloring, mimicking the process of human experts.
arXiv Detail & Related papers (2024-04-08T16:46:07Z) - Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z) - DiffColor: Toward High Fidelity Text-Guided Image Colorization with
Diffusion Models [12.897939032560537]
We propose a new method called DiffColor to recover vivid colors conditioned on a prompt text.
We first fine-tune a pre-trained text-to-image model to generate colorized images using a CLIP-based contrastive loss.
Then we try to obtain an optimized text embedding aligning the colorized image and the text prompt, and a fine-tuned diffusion model enabling high-quality image reconstruction.
Our method can produce vivid and diverse colors with a few iterations, and keep the structure and background intact while having colors well-aligned with the target language guidance.
arXiv Detail & Related papers (2023-08-03T09:38:35Z) - BiSTNet: Semantic Image Prior Guided Bidirectional Temporal Feature
Fusion for Deep Exemplar-based Video Colorization [70.14893481468525]
We present an effective BiSTNet to explore colors of reference exemplars and utilize them to help video colorization.
We first establish the semantic correspondence between each frame and the reference exemplars in deep feature space to explore color information from reference exemplars.
We develop a mixed expert block to extract semantic information for modeling the object boundaries of frames so that the semantic image prior can better guide the colorization process.
arXiv Detail & Related papers (2022-12-05T13:47:15Z) - Deep Line Art Video Colorization with a Few References [49.7139016311314]
We propose a deep architecture to automatically color line art videos with the same color style as the given reference images.
Our framework consists of a color transform network and a temporal constraint network.
Our model can achieve even better coloring results by fine-tuning the parameters with only a small amount of samples.
arXiv Detail & Related papers (2020-03-24T06:57:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.