Related papers: Improved Diffusion-based Image Colorization via Piggybacked Models

URL: http://arxiv.org/abs/2304.11105v1
Date: Fri, 21 Apr 2023 16:23:24 GMT
Title: Improved Diffusion-based Image Colorization via Piggybacked Models
Authors: Hanyuan Liu, Jinbo Xing, Minshan Xie, Chengze Li, Tien-Tsin Wong
Abstract summary: We introduce a colorization model piggybacking on the existing powerful T2I diffusion model. A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model. A lightness-aware VQVAE will then generate the colorized result with pixel-perfect alignment to the given grayscale image.
Score: 19.807766482434563
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Image colorization has been attracting the research interests of the community for decades. However, existing methods still struggle to provide satisfactory colorized results given grayscale images due to a lack of human-like global understanding of colors. Recently, large-scale Text-to-Image (T2I) models have been exploited to transfer the semantic information from the text prompts to the image domain, where text provides a global control for semantic objects in the image. In this work, we introduce a colorization model piggybacking on the existing powerful T2I diffusion model. Our key idea is to exploit the color prior knowledge in the pre-trained T2I diffusion model for realistic and diverse colorization. A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model to output a latent color prior that conforms to the visual semantics of the grayscale input. A lightness-aware VQVAE will then generate the colorized result with pixel-perfect alignment to the given grayscale image. Our model can also achieve conditional colorization with additional inputs (e.g. user hints and texts). Extensive experiments show that our method achieves state-of-the-art performance in terms of perceptual quality.
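
Illustrative note: the abstract asks for a colorized result that stays pixel-aligned with the grayscale input. The paper does this with a learned lightness-aware VQVAE, which is not reproduced here. The sketch below swaps in a much simpler, non-learned stand-in to show what "aligned to the given lightness" means numerically: each pixel of a predicted color image is shifted so that its BT.601 luma equals the grayscale value. The function names `luma` and `align_to_grayscale`, the choice of BT.601 weights, and the random test arrays are all assumptions made for illustration.

```python
import numpy as np

# BT.601 luma weights (an assumption for this sketch); they sum to 1,
# which the per-pixel shift below relies on.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def luma(rgb):
    """Return the BT.601 luma of an (H, W, 3) float RGB array."""
    return rgb @ LUMA_WEIGHTS


def align_to_grayscale(predicted_rgb, gray):
    """Shift each pixel of predicted_rgb so that its luma equals gray.

    Adding the same offset to R, G and B changes the luma by exactly that
    offset and leaves the channel differences (e.g. R - B) unchanged.
    Values are not clipped, so the result may fall outside [0, 1].
    """
    offset = gray - luma(predicted_rgb)
    return predicted_rgb + offset[..., None]


# Hypothetical inputs: a random grayscale image and a random color prediction.
rng = np.random.default_rng(0)
gray = rng.random((4, 4))
predicted = rng.random((4, 4, 3))
aligned = align_to_grayscale(predicted, gray)
```

In this toy setting the luma of `aligned` matches `gray` up to floating-point error, while the color differences of the prediction are preserved. Clipping the output to a valid range would break that exact match for some pixels, which is one reason a learned decoder may be preferable in practice.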

Related papers

Leveraging the Powerful Attention of a Pre-trained Diffusion Model for Exemplar-based Image Colorization [4.233370898095789]
Exemplar-based image colorization aims to colorize a grayscale image using a reference color image. We propose a novel, fine-tuning-free approach based on a pre-trained diffusion model. Our experimental results demonstrate that our method outperforms existing techniques in terms of image quality and fidelity to the reference.
arXiv Detail & Related papers (2025-05-21T17:59:40Z)
Free-Lunch Color-Texture Disentanglement for Stylized Image Generation [58.406368812760256]
This paper introduces the first tuning-free approach to achieve free-lunch color-texture disentanglement in stylized T2I generation. We develop techniques for separating and extracting Color-Texture Embeddings (CTE) from individual color and texture reference images. To ensure that the color palette of the generated image aligns closely with the color reference, we apply a whitening and coloring transformation.
arXiv Detail & Related papers (2025-03-18T14:10:43Z)
Leveraging Semantic Attribute Binding for Free-Lunch Color Control in Diffusion Models [53.73253164099701]
We introduce ColorWave, a training-free approach that achieves exact RGB-level color control in diffusion models without fine-tuning. We demonstrate that ColorWave establishes a new paradigm for structured, color-consistent diffusion-based image synthesis.
arXiv Detail & Related papers (2025-03-12T21:49:52Z)
ColorFlow: Retrieval-Augmented Image Sequence Colorization [65.93834649502898]
We propose a three-stage diffusion-based framework tailored for image sequence colorization in industrial applications. Unlike existing methods that require per-ID finetuning or explicit ID embedding extraction, we propose a novel Retrieval Augmented Colorization pipeline. Our pipeline also features a dual-branch design: one branch for color identity extraction and the other for colorization.
arXiv Detail & Related papers (2024-12-16T14:32:49Z)
PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework. We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences. Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z)
Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior [15.188673173327658]
We leverage the extraordinary generative ability of the diffusion prior to synthesize color with plausible semantics. We adopt multimodal high-level semantic priors to help the model understand the image content and deliver saturated colors. A luminance-aware decoder is designed to restore details and enhance overall visual quality.
arXiv Detail & Related papers (2024-04-25T15:28:22Z)
Direct Consistency Optimization for Compositional Text-to-Image Personalization [73.94505688626651]
Text-to-image (T2I) diffusion models, when fine-tuned on a few personal images, are able to generate visuals with a high degree of consistency. We propose to fine-tune the T2I model by maximizing consistency to reference images, while penalizing the deviation from the pretrained model.
arXiv Detail & Related papers (2024-02-19T09:52:41Z)
Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model. We present an effective way to encode user strokes to enable precise local color manipulation. We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z)
Incorporating Ensemble and Transfer Learning For An End-To-End Auto-Colorized Image Detection Model [0.0]
This paper presents a novel approach that combines the advantages of transfer and ensemble learning approaches to help reduce training time and resource requirements. The proposed model shows promising results, with accuracy ranging from 94.55% to 99.13%.
arXiv Detail & Related papers (2023-09-25T19:22:57Z)
DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models [12.897939032560537]
We propose a new method called DiffColor to recover vivid colors conditioned on a prompt text. We first fine-tune a pre-trained text-to-image model to generate colorized images using a CLIP-based contrastive loss. Then we try to obtain an optimized text embedding aligning the colorized image and the text prompt, and a fine-tuned diffusion model enabling high-quality image reconstruction. Our method can produce vivid and diverse colors with a few iterations, and keep the structure and background intact while having colors well-aligned with the target language guidance.
arXiv Detail & Related papers (2023-08-03T09:38:35Z)
L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors [62.80068955192816]
We propose a unified model to perform language-based colorization with any-level descriptions. We leverage the pretrained cross-modality generative model for its robust language understanding and rich color priors. With the proposed novel sampling strategy, our model achieves instance-aware colorization in diverse and complex scenarios.
arXiv Detail & Related papers (2023-05-24T14:57:42Z)
TIC: Text-Guided Image Colorization [24.317541784957285]
We propose a novel deep network that takes two inputs (the grayscale image and the respective encoded text description) and tries to predict the relevant color gamut. As the respective textual descriptions contain color information of the objects present in the scene, the text encoding helps to improve the overall quality of the predicted colors. We have evaluated our proposed model using different metrics and found that it outperforms the state-of-the-art colorization algorithms both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-08-04T18:40:20Z)
Color2Style: Real-Time Exemplar-Based Image Colorization with Self-Reference Learning and Deep Feature Modulation [29.270149925368674]
We present a deep exemplar-based image colorization approach named Color2Style to resurrect grayscale image media by filling them with vibrant colors. Our method exploits a simple yet effective deep feature modulation (DFM) module, which injects the color embeddings extracted from the reference image into the deep representations of the input grayscale image.
arXiv Detail & Related papers (2021-06-15T10:05:58Z)
Semantic-driven Colorization [78.88814849391352]
Recent colorization works implicitly predict the semantic information while learning to colorize black-and-white images. In this study, we simulate that human-like action to let our network first learn to understand the photo, then colorize it.
arXiv Detail & Related papers (2020-06-13T08:13:30Z)
Instance-aware Image Colorization [51.12040118366072]
In this paper, we propose a method for achieving instance-aware colorization. Our network architecture leverages an off-the-shelf object detector to obtain cropped object images. We use a similar network to extract the full-image features and apply a fusion module to predict the final colors.
arXiv Detail & Related papers (2020-05-21T17:59:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.