Related papers: ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement

URL: http://arxiv.org/abs/2407.07197v1
Date: Tue, 9 Jul 2024 19:26:34 GMT
Title: ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
Authors: Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral, Joost van de Weijer
Abstract summary: We propose to learn specific color prompts tailored to user-selected colors. Our method, denoted as ColorPeel, successfully assists the T2I models to peel off the novel color prompts. Our findings represent a significant step towards improving precision and versatility of T2I models.
Score: 20.45850285936787
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-to-Image (T2I) generation has made significant advancements with the advent of diffusion models. These models exhibit remarkable abilities to produce images based on textual prompts. Current T2I models allow users to specify object colors using linguistic color names. However, these labels encompass broad color ranges, making it difficult to achieve precise color matching. To tackle this challenging task, named color prompt learning, we propose to learn specific color prompts tailored to user-selected colors. Existing T2I personalization methods tend to result in color-shape entanglement. To overcome this, we generate several basic geometric objects in the target color, allowing for color and shape disentanglement during the color prompt learning. Our method, denoted as ColorPeel, successfully assists the T2I models to peel off the novel color prompts from these colored shapes. In the experiments, we demonstrate the efficacy of ColorPeel in achieving precise color generation with T2I models. Furthermore, we generalize ColorPeel to effectively learn abstract attribute concepts, including textures, materials, etc. Our findings represent a significant step towards improving precision and versatility of T2I models, offering new opportunities for creative applications and design tasks. Our project is available at https://moatifbutt.github.io/colorpeel/.
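
The abstract mentions generating several basic geometric objects in the target color, so that the color varies independently of the shape during prompt learning. The following is only a minimal sketch of that data-generation idea, not the authors' pipeline: the function names (`render_shape`, `build_color_set`, `to_ppm`), the choice of 2D shapes, the 64-pixel canvas, the white background, and the example color are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]  # rows of pixels, each pixel an (R, G, B) tuple


def render_shape(shape: str, color: RGB, size: int = 64,
                 background: RGB = (255, 255, 255)) -> Image:
    """Rasterize one centered shape filled with `color` on a plain background."""
    c = (size - 1) / 2.0   # canvas center
    r = size * 0.35        # half-extent of the shape (assumed value)
    img = [[background] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            dx, dy = x - c, y - c
            if shape == "circle":
                inside = dx * dx + dy * dy <= r * r
            elif shape == "square":
                inside = abs(dx) <= r and abs(dy) <= r
            elif shape == "triangle":
                # Upward-pointing triangle: apex at dy = -r, base at dy = +r.
                inside = -r <= dy <= r and abs(dx) <= (dy + r) / 2.0
            else:
                raise ValueError(f"unknown shape: {shape}")
            if inside:
                img[y][x] = color
    return img


def build_color_set(color: RGB,
                    shapes: Tuple[str, ...] = ("circle", "square", "triangle"),
                    size: int = 64) -> Dict[str, Image]:
    """Same target color, different shapes: the shape varies, the color does not."""
    return {s: render_shape(s, color, size) for s in shapes}


def to_ppm(img: Image) -> str:
    """Serialize an image as plain-text PPM (P3) so it can be saved without extra libraries."""
    h, w = len(img), len(img[0])
    rows = [" ".join(f"{r} {g} {b}" for r, g, b in row) for row in img]
    return f"P3\n{w} {h}\n255\n" + "\n".join(rows) + "\n"


if __name__ == "__main__":
    target = (18, 97, 128)  # hypothetical user-selected color
    for name, img in build_color_set(target).items():
        with open(f"{name}.ppm", "w") as f:
            f.write(to_ppm(img))
```

Because every image in the set shares the exact same fill color while the silhouette changes, a learned color token would have no consistent shape to latch onto, which is the intuition behind the disentanglement described in the abstract.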

Related papers

MagicColor: Multi-Instance Sketch Colorization [44.72374445094054]
MagicColor is a diffusion-based framework for multi-instance sketch colorization. Our model critically automates the colorization process with zero manual adjustments.
arXiv Detail & Related papers (2025-03-21T08:53:14Z)
Free-Lunch Color-Texture Disentanglement for Stylized Image Generation [58.406368812760256]
This paper introduces the first tuning-free approach to achieve free-lunch color-texture disentanglement in stylized T2I generation. We develop techniques for separating and extracting Color-Texture Embeddings (CTE) from individual color and texture reference images. To ensure that the color palette of the generated image aligns closely with the color reference, we apply a whitening and coloring transformation.
arXiv Detail & Related papers (2025-03-18T14:10:43Z)
ColorFlow: Retrieval-Augmented Image Sequence Colorization [65.93834649502898]
We propose a three-stage diffusion-based framework tailored for image sequence colorization in industrial applications. Unlike existing methods that require per-ID finetuning or explicit ID embedding extraction, we propose a novel Retrieval Augmented Colorization pipeline. Our pipeline also features a dual-branch design: one branch for color identity extraction and the other for colorization.
arXiv Detail & Related papers (2024-12-16T14:32:49Z)
Paint Bucket Colorization Using Anime Character Color Design Sheets [72.66788521378864]
We introduce inclusion matching, which allows the network to understand the relationships between segments. Our network's training pipeline significantly improves performance in both colorization and consecutive frame colorization. To support our network's training, we have developed a unique dataset named PaintBucket-Character.
arXiv Detail & Related papers (2024-10-25T09:33:27Z)
L-C4: Language-Based Video Colorization for Creative and Consistent Color [59.069498113050436]
We present Language-based video colorization for Creative and Consistent Colors (L-C4). Our model is built upon a pre-trained cross-modality generative model. We propose temporally deformable attention to prevent flickering or color shifts, and cross-clip fusion to maintain long-term color consistency.
arXiv Detail & Related papers (2024-10-07T12:16:21Z)
Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation [150.57983348059528]
PRISM is an algorithm that automatically identifies human-interpretable and transferable prompts. It can effectively generate desired concepts given only black-box access to T2I models. Our experiments demonstrate the versatility and effectiveness of PRISM in generating accurate prompts for objects, styles and images.
arXiv Detail & Related papers (2024-03-28T02:35:53Z)
Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model. We present an effective way to encode user strokes to enable precise local color manipulation. We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z)
Fine-Tuning InstructPix2Pix for Advanced Image Colorization [3.4975669723257035]
This paper presents a novel approach to human image colorization by fine-tuning the InstructPix2Pix model. We fine-tune the model using the IMDB-WIKI dataset, pairing black-and-white images with a diverse set of colorization prompts generated by ChatGPT. After finetuning, our model outperforms the original InstructPix2Pix model on multiple metrics quantitatively.
arXiv Detail & Related papers (2023-12-08T01:36:49Z)
Language-based Photo Color Adjustment for Graphic Designs [38.43984897069872]
We introduce an interactive language-based approach for photo recoloring. Our model can predict the source colors and the target regions, and then recolor the target regions with the source colors based on the given language-based instruction.
arXiv Detail & Related papers (2023-08-06T08:53:49Z)
DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models [12.897939032560537]
We propose a new method called DiffColor to recover vivid colors conditioned on a prompt text. We first fine-tune a pre-trained text-to-image model to generate colorized images using a CLIP-based contrastive loss. Then we try to obtain an optimized text embedding aligning the colorized image and the text prompt, and a fine-tuned diffusion model enabling high-quality image reconstruction. Our method can produce vivid and diverse colors with a few iterations, and keep the structure and background intact while having colors well-aligned with the target language guidance.
arXiv Detail & Related papers (2023-08-03T09:38:35Z)
L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors [62.80068955192816]
We propose a unified model to perform language-based colorization with any-level descriptions. We leverage the pretrained cross-modality generative model for its robust language understanding and rich color priors. With the proposed novel sampling strategy, our model achieves instance-aware colorization in diverse and complex scenarios.
arXiv Detail & Related papers (2023-05-24T14:57:42Z)
Improved Diffusion-based Image Colorization via Piggybacked Models [19.807766482434563]
We introduce a colorization model piggybacking on the existing powerful T2I diffusion model. A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model. A lightness-aware VQVAE will then generate the colorized result with pixel-perfect alignment to the given grayscale image.
arXiv Detail & Related papers (2023-04-21T16:23:24Z)
Color Counting for Fashion, Art, and Design [0.0]
The first step in color modelling is to estimate the number of colors in the item or object. We propose a novel color counting method based on a cumulative color histogram. This work is the first of its kind to address the problem of machine color counting.
arXiv Detail & Related papers (2021-10-13T12:42:15Z)
Semantic-driven Colorization [78.88814849391352]
Recent colorization works implicitly predict the semantic information while learning to colorize black-and-white images. In this study, we simulate that human-like action to let our network first learn to understand the photo, then colorize it.
arXiv Detail & Related papers (2020-06-13T08:13:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.