ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
- URL: http://arxiv.org/abs/2407.07197v1
- Date: Tue, 9 Jul 2024 19:26:34 GMT
- Title: ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
- Authors: Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral, Joost van de Weijer
- Abstract summary: We propose to learn specific color prompts tailored to user-selected colors.
Our method, denoted as ColorPeel, successfully assists T2I models in peeling off novel color prompts from colored shapes.
Our findings represent a significant step towards improving the precision and versatility of T2I models.
- Score: 20.45850285936787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-Image (T2I) generation has made significant advancements with the advent of diffusion models. These models exhibit remarkable abilities to produce images based on textual prompts. Current T2I models allow users to specify object colors using linguistic color names. However, these labels encompass broad color ranges, making it difficult to achieve precise color matching. To tackle this challenging task, named color prompt learning, we propose to learn specific color prompts tailored to user-selected colors. Existing T2I personalization methods tend to result in color-shape entanglement. To overcome this, we generate several basic geometric objects in the target color, allowing for color and shape disentanglement during the color prompt learning. Our method, denoted as ColorPeel, successfully assists the T2I models to peel off the novel color prompts from these colored shapes. In the experiments, we demonstrate the efficacy of ColorPeel in achieving precise color generation with T2I models. Furthermore, we generalize ColorPeel to effectively learn abstract attribute concepts, including textures, materials, etc. Our findings represent a significant step towards improving the precision and versatility of T2I models, offering new opportunities for creative applications and design tasks. Our project is available at https://moatifbutt.github.io/colorpeel/.
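The core trick described in the abstract, rendering simple geometric objects in the exact target color so the color token can be learned independently of any one shape, is easy to picture in code. Below is a minimal sketch of that data-generation step, assuming a textual-inversion-style prompt-learning pipeline; the token name `<c*>`, the shape set, and the prompt template are illustrative, not the paper's exact choices.
```python
# Minimal sketch of the data-generation step: render a few basic shapes in
# one user-picked RGB color, paired with prompts that tie the color to a
# new token. Training the token on shapes that share only their color is
# what lets the optimization bind "<c*>" to hue rather than to geometry.
from PIL import Image, ImageDraw

def make_color_training_set(rgb, size=512):
    """Render a circle, square, and triangle filled with `rgb`."""
    images, prompts = [], []
    box = (size // 4, size // 4, 3 * size // 4, 3 * size // 4)
    for shape in ("circle", "square", "triangle"):
        img = Image.new("RGB", (size, size), "white")
        draw = ImageDraw.Draw(img)
        if shape == "circle":
            draw.ellipse(box, fill=rgb)
        elif shape == "square":
            draw.rectangle(box, fill=rgb)
        else:
            draw.polygon([(size // 2, size // 4),
                          (size // 4, 3 * size // 4),
                          (3 * size // 4, 3 * size // 4)], fill=rgb)
        images.append(img)
        prompts.append(f"a {shape} in <c*> color")  # "<c*>" is illustrative
    return images, prompts

images, prompts = make_color_training_set((212, 17, 89))  # exact user-picked color
```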
Related papers
- Paint Bucket Colorization Using Anime Character Color Design Sheets [72.66788521378864] (2024-10-25)
We introduce inclusion matching, which allows the network to understand the relationships between segments.
Our network's training pipeline significantly improves performance in both colorization and consecutive frame colorization.
To support our network's training, we have developed a unique dataset named PaintBucket-Character.
- L-C4: Language-Based Video Colorization for Creative and Consistent Color [59.069498113050436] (2024-10-07)
We present Language-based video Colorization for Creative and Consistent Colors (L-C4).
Our model is built upon a pre-trained cross-modality generative model.
We propose temporally deformable attention to prevent flickering or color shifts, and cross-clip fusion to maintain long-term color consistency.
- Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation [150.57983348059528] (2024-03-28)
PRISM is an algorithm that automatically identifies human-interpretable and transferable prompts.
It can effectively generate desired concepts given only black-box access to T2I models.
Our experiments demonstrate the versatility and effectiveness of PRISM in generating accurate prompts for objects, styles and images.
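The abstract describes a black-box search: propose candidate prompts, generate with the T2I model, and keep whichever prompt scores best. A generic sketch of that loop follows; the `propose` and `score` callables are stand-ins for PRISM's actual LLM-based proposal and judging components, which the abstract does not specify.
```python
import random

def refine_prompt(seed_prompt, propose, score, iters=20):
    """Keep whichever candidate prompt the black-box pipeline scores best.
    `propose(p)` mutates a prompt; `score(p)` generates with the T2I model
    and rates the output (e.g. similarity to a reference image). Both are
    stand-ins, not PRISM's actual components."""
    best, best_score = seed_prompt, score(seed_prompt)
    for _ in range(iters):
        candidate = propose(best)
        s = score(candidate)
        if s > best_score:                  # greedy hill-climbing step
            best, best_score = candidate, s
    return best

# Toy stand-ins, just to show the loop runs end to end:
propose = lambda p: p + " " + random.choice(["photo", "painting", "sketch"])
score = lambda p: -abs(len(p) - 40)         # pretend: prefer ~40-char prompts
print(refine_prompt("a red ceramic vase", propose, score))
```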
- Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644] (2024-02-16)
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
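Encoding user strokes for local color control is commonly done by rasterizing them into a sparse color-hint map plus a validity mask that the model is conditioned on. The sketch below shows that generic encoding; it is an assumption about the usual technique, not Ctrl Color's exact stroke representation.
```python
import numpy as np

def encode_strokes(strokes, height, width):
    """Rasterize user strokes into a dense color-hint map plus a validity
    mask, giving an (H, W, 4) conditioning tensor: RGB hints where the user
    painted, zeros elsewhere, and a 0/1 mask channel marking painted pixels.
    `strokes` is a list of ((row, col), (r, g, b)) samples along each stroke."""
    hint = np.zeros((height, width, 3), dtype=np.float32)
    mask = np.zeros((height, width, 1), dtype=np.float32)
    for (r, c), rgb in strokes:
        hint[r, c] = np.asarray(rgb, dtype=np.float32) / 255.0
        mask[r, c] = 1.0
    return np.concatenate([hint, mask], axis=-1)
```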
- Fine-Tuning InstructPix2Pix for Advanced Image Colorization [3.4975669723257035] (2023-12-08)
This paper presents a novel approach to human image colorization by fine-tuning the InstructPix2Pix model.
We fine-tune the model using the IMDB-WIKI dataset, pairing black-and-white images with a diverse set of colorization prompts generated by ChatGPT.
After fine-tuning, our model quantitatively outperforms the original InstructPix2Pix model on multiple metrics.
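The described fine-tuning data is straightforward to assemble: each color photo yields a grayscale edit input, a colorization instruction, and the original photo as the target. A minimal sketch follows; the prompt strings stand in for the ChatGPT-generated instruction set, and the pairing logic is an assumption about the setup rather than the paper's released code.
```python
import random
from PIL import Image, ImageOps

# Stand-ins for the ChatGPT-generated instruction set described above.
COLORIZE_PROMPTS = [
    "Colorize this photograph naturally",
    "Add realistic, lifelike colors to this black-and-white portrait",
]

def make_training_triplet(color_image_path):
    """One InstructPix2Pix-style example: grayscale edit input, a
    colorization instruction, and the original color photo as target."""
    target = Image.open(color_image_path).convert("RGB")
    source = ImageOps.grayscale(target).convert("RGB")  # 3-channel grayscale
    instruction = random.choice(COLORIZE_PROMPTS)
    return source, instruction, target
```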
- Language-based Photo Color Adjustment for Graphic Designs [38.43984897069872] (2023-08-06)
We introduce an interactive language-based approach for photo recoloring.
Our model can predict the source colors and the target regions, and then recolor the target regions with the source colors based on the given language-based instruction.
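Once a source color and a target region are predicted, the recoloring step can be illustrated with a simple luminance-preserving substitution: masked pixels take on the target hue while keeping their own shading. The sketch below is a hand-rolled stand-in for the paper's learned recoloring model, useful only to make the operation concrete.
```python
import numpy as np

LUMA = np.array([0.2126, 0.7152, 0.0722])  # Rec. 709 luminance weights

def recolor_region(rgb, mask, target_rgb):
    """Push masked pixels toward `target_rgb` while preserving each pixel's
    relative luminance, so shading survives the recolor.
    `rgb`: float HxWx3 in [0, 1]; `mask`: boolean HxW; `target_rgb`: 0-255."""
    target = np.asarray(target_rgb, dtype=np.float64) / 255.0
    scale = (rgb @ LUMA) / max(float(target @ LUMA), 1e-6)  # per-pixel shading
    recolored = np.clip(scale[..., None] * target, 0.0, 1.0)
    out = rgb.copy()
    out[mask] = recolored[mask]
    return out
```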
- DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models [12.897939032560537] (2023-08-03)
We propose a new method called DiffColor to recover vivid colors conditioned on a prompt text.
We first fine-tune a pre-trained text-to-image model to generate colorized images using a CLIP-based contrastive loss.
Then we obtain an optimized text embedding that aligns the colorized image with the text prompt, and a fine-tuned diffusion model that enables high-quality image reconstruction.
Our method can produce vivid and diverse colors with a few iterations, and keep the structure and background intact while having colors well-aligned with the target language guidance.
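The CLIP-based contrastive loss mentioned above has the general shape of a symmetric InfoNCE objective over paired image and text embeddings. The sketch below shows that shape in NumPy, assuming the embeddings already come from a CLIP image/text encoder; DiffColor's exact formulation and weighting are not given in the abstract.
```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over row-matched (image, text) embedding batches;
    `img_emb` and `txt_emb` are assumed to come from a CLIP encoder."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # cosine similarity matrix
    idx = np.arange(len(logits))             # i-th image matches i-th text

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(l) / np.exp(l).sum(axis=1, keepdims=True)
        return -np.log(probs[idx, idx]).mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```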
- L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors [62.80068955192816] (2023-05-24)
We propose a unified model to perform language-based colorization with any-level descriptions.
We leverage the pretrained cross-modality generative model for its robust language understanding and rich color priors.
With the proposed novel sampling strategy, our model achieves instance-aware colorization in diverse and complex scenarios.
- Improved Diffusion-based Image Colorization via Piggybacked Models [19.807766482434563] (2023-04-21)
We introduce a colorization model piggybacking on the existing powerful T2I diffusion model.
A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model.
A lightness-aware VQVAE then generates the colorized result with pixel-perfect alignment to the given grayscale image.
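The pixel-perfect-alignment property can be made concrete with the classic Lab-space trick of overwriting the output's lightness channel with the input grayscale. The paper builds this into a lightness-aware VQVAE rather than post-processing, so the sketch below only illustrates the alignment idea.
```python
import numpy as np
from skimage.color import rgb2lab, lab2rgb

def force_lightness(colorized_rgb, gray):
    """Overwrite the colorized output's L channel in Lab space with the
    input grayscale image, making lightness pixel-aligned with the input.
    `colorized_rgb`: float HxWx3 in [0, 1]; `gray`: float HxW in [0, 1]."""
    lab = rgb2lab(colorized_rgb)
    lab[..., 0] = gray * 100.0               # Lab lightness spans [0, 100]
    return np.clip(lab2rgb(lab), 0.0, 1.0)
```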
- Color Counting for Fashion, Art, and Design [0.0] (2021-10-13)
The first step in color modeling is to estimate the number of colors in the item or object.
We propose a novel color counting method based on the cumulative color histogram.
This work is the first of its kind to address the problem of machine-based color counting.
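A cumulative-histogram color counter can be read as: quantize the image into color bins, sort the bins by pixel mass, and count how many bins are needed to cover most of the image. The sketch below implements that reading; the bin granularity and coverage threshold are illustrative guesses, not the paper's calibrated procedure.
```python
import numpy as np

def count_colors(rgb, bins=8, coverage=0.95):
    """Smallest number of quantized color bins whose cumulative pixel mass
    covers `coverage` of the image. `rgb` is a uint8 HxWx3 array."""
    quantized = (rgb.astype(np.int64) * bins) // 256   # per-channel level
    flat = (quantized[..., 0] * bins + quantized[..., 1]) * bins + quantized[..., 2]
    hist = np.bincount(flat.ravel(), minlength=bins ** 3)
    mass = np.sort(hist)[::-1] / hist.sum()            # largest bins first
    return int(np.searchsorted(np.cumsum(mass), coverage) + 1)
```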
- Semantic-driven Colorization [78.88814849391352] (2020-06-13)
Recent colorization works implicitly predict the semantic information while learning to colorize black-and-white images.
In this study, we simulate this human-like behavior: our network first learns to understand the photo, then colorizes it.