Control Color: Multimodal Diffusion-based Interactive Image Colorization
- URL: http://arxiv.org/abs/2402.10855v1
- Date: Fri, 16 Feb 2024 17:51:13 GMT
- Title: Control Color: Multimodal Diffusion-based Interactive Image Colorization
- Authors: Zhexin Liang, Zhaochen Li, Shangchen Zhou, Chongyi Li, Chen Change Loy
- Abstract summary: Control Color (CtrlColor) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
- Score: 81.68817300796644
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the existence of numerous colorization methods, several limitations
still exist, such as lack of user interaction, inflexibility in local
colorization, unnatural color rendering, insufficient color variation, and
color overflow. To solve these issues, we introduce Control Color (CtrlColor),
a multi-modal colorization method that leverages the pre-trained Stable
Diffusion (SD) model, offering promising capabilities in highly controllable
interactive image colorization. While several diffusion-based methods have been
proposed, supporting colorization in multiple modalities remains non-trivial.
In this study, we aim to tackle both unconditional and conditional image
colorization (text prompts, strokes, exemplars) and address color overflow and
incorrect color within a unified framework. Specifically, we present an
effective way to encode user strokes to enable precise local color manipulation
and employ a practical way to constrain the color distribution to be similar to
that of the exemplar. Together with text-prompt conditioning, these designs add
versatility to our approach. We also introduce a novel module based on
self-attention and a content-guided deformable autoencoder to address the
long-standing issues of color overflow and inaccurate coloring. Extensive
comparisons show that our model outperforms state-of-the-art image colorization
methods both qualitatively and quantitatively.
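The abstract does not spell out how user strokes are encoded. A minimal sketch of one common scheme is given below: strokes are rasterized into a sparse color-hint map plus a binary mask, which would be concatenated with the grayscale input (or its latent) as extra conditioning channels. The helper name and the 4-channel layout are illustrative assumptions, not CtrlColor's actual interface.

```python
import numpy as np

def encode_strokes(height, width, strokes):
    """Rasterize user strokes into a sparse color-hint map plus a binary mask.

    `strokes` is a list of (row, col, (r, g, b)) annotations -- a simplified
    stand-in for brush strokes. How CtrlColor actually encodes strokes is not
    specified in the abstract; this only illustrates the general idea.
    """
    hint = np.zeros((height, width, 3), dtype=np.float32)
    mask = np.zeros((height, width, 1), dtype=np.float32)
    for row, col, rgb in strokes:
        hint[row, col] = np.asarray(rgb, dtype=np.float32) / 255.0
        mask[row, col] = 1.0
    # 4-channel conditioning tensor: RGB hint + "where hints exist" mask.
    return np.concatenate([hint, mask], axis=-1)

# Two single-pixel "strokes": a red hint and a blue hint.
cond = encode_strokes(64, 64, [(10, 12, (200, 30, 30)), (40, 50, (20, 120, 220))])
print(cond.shape)  # (64, 64, 4)
```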
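Likewise, the exemplar constraint ("constrain the color distribution to be similar to that of the exemplar") is only named in the abstract. A Reinhard-style statistics match on the chrominance channels is one simple way to realize such a constraint and is sketched here purely for illustration, not as the paper's formulation.

```python
import numpy as np

def match_color_statistics(output_ab, exemplar_ab):
    """Shift and scale the chrominance (ab) channels of `output_ab` so their
    per-channel mean/std match the exemplar's. Illustrative stand-in only."""
    out = np.asarray(output_ab, dtype=np.float32)
    ref = np.asarray(exemplar_ab, dtype=np.float32)
    mu_out, std_out = out.mean(axis=(0, 1)), out.std(axis=(0, 1)) + 1e-6
    mu_ref, std_ref = ref.mean(axis=(0, 1)), ref.std(axis=(0, 1)) + 1e-6
    return (out - mu_out) / std_out * std_ref + mu_ref

matched = match_color_statistics(np.random.rand(32, 32, 2), np.random.rand(32, 32, 2))
print(matched.shape)  # (32, 32, 2)
```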
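The "content-guided deformable autoencoder" is also only named here. The sketch below shows one plausible reading, in which the sampling offsets of a deformable convolution are predicted from content (grayscale) features so that color features are resampled along object boundaries, curbing color overflow. Module names and shapes are assumptions made for the example, not the paper's exact module.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class ContentGuidedDeformBlock(nn.Module):
    """Illustrative decoder block: offsets come from content features so the
    color features are realigned to object boundaries."""
    def __init__(self, color_ch: int, content_ch: int, k: int = 3):
        super().__init__()
        # 2 * k * k offset values (x, y per kernel tap), predicted from content.
        self.offset_pred = nn.Conv2d(content_ch, 2 * k * k, kernel_size=3, padding=1)
        self.deform = DeformConv2d(color_ch, color_ch, kernel_size=k, padding=k // 2)

    def forward(self, color_feat: torch.Tensor, content_feat: torch.Tensor) -> torch.Tensor:
        offset = self.offset_pred(content_feat)   # (N, 2*k*k, H, W)
        return self.deform(color_feat, offset)    # content-aligned resampling

block = ContentGuidedDeformBlock(color_ch=64, content_ch=32)
out = block(torch.randn(1, 64, 32, 32), torch.randn(1, 32, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```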
Related papers
- MangaNinja: Line Art Colorization with Precise Reference Following [84.2001766692797]
MangaNinja specializes in the task of reference-guided line art colorization.
We incorporate two thoughtful designs to ensure precise character detail transcription: a patch shuffling module that facilitates correspondence learning between the reference color image and the target line art, and a point-driven control scheme that enables fine-grained color matching.
arXiv Detail & Related papers (2025-01-14T18:59:55Z) - ColorFlow: Retrieval-Augmented Image Sequence Colorization [65.93834649502898]
We propose a three-stage diffusion-based framework tailored for image sequence colorization in industrial applications.
Unlike existing methods that require per-ID finetuning or explicit ID embedding extraction, we propose a novel Retrieval Augmented Colorization pipeline.
Our pipeline also features a dual-branch design: one branch for color identity extraction and the other for colorization.
arXiv Detail & Related papers (2024-12-16T14:32:49Z) - Paint Bucket Colorization Using Anime Character Color Design Sheets [72.66788521378864]
We introduce inclusion matching, which allows the network to understand the relationships between segments.
Our network's training pipeline significantly improves performance in both colorization and consecutive frame colorization.
To support our network's training, we have developed a unique dataset named PaintBucket-Character.
arXiv Detail & Related papers (2024-10-25T09:33:27Z) - L-C4: Language-Based Video Colorization for Creative and Consistent Color [59.069498113050436]
We present Language-based video colorization for Creative and Consistent Colors (L-C4).
Our model is built upon a pre-trained cross-modality generative model.
We propose temporally deformable attention to prevent flickering or color shifts, and cross-clip fusion to maintain long-term color consistency.
arXiv Detail & Related papers (2024-10-07T12:16:21Z) - MultiColor: Image Colorization by Learning from Multiple Color Spaces [4.738828630428634]
MultiColor is a new learning-based approach to automatically colorize grayscale images.
We employ a set of dedicated colorization modules for individual color spaces.
With these predicted color channels representing various color spaces, a complementary network is designed to exploit the complementarity and generate pleasing and reasonable colorized images.
arXiv Detail & Related papers (2024-08-08T02:34:41Z) - Automatic Controllable Colorization via Imagination [55.489416987587305]
We propose a framework for automatic colorization that allows for iterative editing and modifications.
By understanding the content within a grayscale image, we utilize a pre-trained image generation model to generate multiple images that contain the same content.
These images serve as references for coloring, mimicking the process of human experts.
arXiv Detail & Related papers (2024-04-08T16:46:07Z) - Diffusing Colors: Image Colorization with Text Guided Diffusion [11.727899027933466]
We present a novel image colorization framework that utilizes image diffusion techniques with granular text prompts.
Our method provides a balance between automation and control, outperforming existing techniques in terms of visual quality and semantic coherence.
Our approach holds potential particularly for color enhancement and historical image colorization.
arXiv Detail & Related papers (2023-12-07T08:59:20Z) - DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models [12.897939032560537]
We propose a new method called DiffColor to recover vivid colors conditioned on a prompt text.
We first fine-tune a pre-trained text-to-image model to generate colorized images using a CLIP-based contrastive loss.
Then we obtain an optimized text embedding that aligns the colorized image with the text prompt, and a fine-tuned diffusion model that enables high-quality image reconstruction.
Our method can produce vivid and diverse colors with a few iterations, and keep the structure and background intact while having colors well-aligned with the target language guidance.
arXiv Detail & Related papers (2023-08-03T09:38:35Z) - UniColor: A Unified Framework for Multi-Modal Colorization with Transformer [23.581502129504287]
We introduce a two-stage colorization framework for incorporating various conditions into a single model.
In the first stage, multi-modal conditions are converted into a common representation of hint points.
In the second stage, we propose a Transformer-based network composed of Chroma-VQGAN and Hybrid-Transformer to generate diverse and high-quality colorization results conditioned on hint points.
arXiv Detail & Related papers (2022-09-22T17:59:09Z)
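As a small illustration of the hint-point representation that UniColor's first stage converts all conditions into, the snippet below collapses a dense stroke map into a few (location, color) hints on a coarse grid. The grid size and sampling rule are assumptions made for the example, not UniColor's published procedure.

```python
import numpy as np

def strokes_to_hint_points(hint_rgb, mask, cell=16):
    """For each `cell` x `cell` grid cell that contains a stroke, keep one
    (row, col, mean color) hint point. Illustrative only."""
    h, w, _ = hint_rgb.shape
    points = []
    for top in range(0, h, cell):
        for left in range(0, w, cell):
            m = mask[top:top + cell, left:left + cell, 0]
            if m.any():
                colors = hint_rgb[top:top + cell, left:left + cell][m > 0]
                points.append((top + cell // 2, left + cell // 2, colors.mean(axis=0)))
    return points

hint = np.zeros((64, 64, 3), dtype=np.float32)
mask = np.zeros((64, 64, 1), dtype=np.float32)
hint[5:9, 5:9] = (1.0, 0.2, 0.2)  # a small red stroke
mask[5:9, 5:9] = 1.0
print(strokes_to_hint_points(hint, mask))  # one hint point near (8, 8)
```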