TIC: Text-Guided Image Colorization
- URL: http://arxiv.org/abs/2208.02843v1
- Date: Thu, 4 Aug 2022 18:40:20 GMT
- Title: TIC: Text-Guided Image Colorization
- Authors: Subhankar Ghosh, Prasun Roy, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
- Abstract summary: We propose a novel deep network that takes two inputs (the grayscale image and the respective encoded text description) and predicts the relevant color gamut.
As the textual descriptions contain color information about the objects present in the scene, the text encoding helps to improve the overall quality of the predicted colors.
We evaluated the proposed model using different metrics and found that it outperforms state-of-the-art colorization algorithms both qualitatively and quantitatively.
- Score: 24.317541784957285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image colorization is a well-known problem in computer vision. However, due
to the ill-posed nature of the task, image colorization is inherently
challenging. Though researchers have made several attempts to automate the
colorization pipeline, these processes often produce unrealistic results due to
a lack of conditioning. In this work, we integrate textual descriptions as an
auxiliary condition, alongside the grayscale image to be colorized, to improve
the fidelity of the colorization process. To the best of our knowledge, this is
one of the first attempts to incorporate textual conditioning into the
colorization pipeline. To do so, we propose a novel deep network that takes two
inputs (the grayscale image and the respective encoded text description) and
predicts the relevant color gamut. As the textual descriptions contain color
information about the objects present in the scene, the text encoding helps to
improve the overall quality of the predicted colors. We evaluated the proposed
model using different metrics and found that it outperforms state-of-the-art
colorization algorithms both qualitatively and quantitatively.
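To make the described two-input design concrete, here is a minimal sketch of such a network in PyTorch: a grayscale (L-channel) encoder fused with a projected text embedding, decoded to the two chrominance (ab) channels. The module names, layer sizes, and broadcast-add fusion are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a text-conditioned colorization network in PyTorch.
# Module names, layer sizes, and the broadcast-add fusion are illustrative
# assumptions, not the authors' exact architecture.
import torch
import torch.nn as nn

class TextGuidedColorizer(nn.Module):
    def __init__(self, text_dim=512, feat_dim=256):
        super().__init__()
        # Encode the single-channel grayscale (L) input.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Project the pre-encoded text description to the feature width.
        self.text_proj = nn.Linear(text_dim, feat_dim)
        # Decode the fused features to the two chrominance (ab) channels.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(feat_dim, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 2, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, gray, text_emb):
        feat = self.encoder(gray)             # (B, feat_dim, H/4, W/4)
        cond = self.text_proj(text_emb)       # (B, feat_dim)
        feat = feat + cond[:, :, None, None]  # fuse text condition into image features
        return self.decoder(feat)             # predicted ab channels in [-1, 1]

gray = torch.randn(1, 1, 64, 64)   # grayscale input image
text_emb = torch.randn(1, 512)     # pre-encoded text description
ab = TextGuidedColorizer()(gray, text_emb)
print(ab.shape)                    # torch.Size([1, 2, 64, 64])
```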
Related papers
- Automatic Controllable Colorization via Imagination [55.489416987587305]
We propose a framework for automatic colorization that allows for iterative editing and modifications.
After understanding the content of a grayscale image, we use a pre-trained image generation model to synthesize multiple images containing the same content.
These images serve as references for coloring, mimicking the process of human experts.
arXiv Detail & Related papers (2024-04-08T16:46:07Z) - Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z) - Audio-Infused Automatic Image Colorization by Exploiting Audio Scene
Semantics [54.980359694044566]
This paper utilizes the corresponding audio, which naturally contains extra semantic information about the same scene.
Experiments demonstrate that audio guidance can effectively improve the performance of automatic colorization.
arXiv Detail & Related papers (2024-01-24T07:22:05Z) - Diffusing Colors: Image Colorization with Text Guided Diffusion [11.727899027933466]
We present a novel image colorization framework that utilizes image diffusion techniques with granular text prompts.
Our method provides a balance between automation and control, outperforming existing techniques in terms of visual quality and semantic coherence.
Our approach holds particular potential for color enhancement and historical image colorization.
arXiv Detail & Related papers (2023-12-07T08:59:20Z) - DiffColor: Toward High Fidelity Text-Guided Image Colorization with
Diffusion Models [12.897939032560537]
We propose a new method called DiffColor to recover vivid colors conditioned on a text prompt.
We first fine-tune a pre-trained text-to-image model to generate colorized images using a CLIP-based contrastive loss (a sketch of such a loss follows this list).
We then obtain an optimized text embedding that aligns the colorized image with the text prompt, and a fine-tuned diffusion model that enables high-quality image reconstruction.
Our method can produce vivid and diverse colors with a few iterations, and keep the structure and background intact while having colors well-aligned with the target language guidance.
arXiv Detail & Related papers (2023-08-03T09:38:35Z) - MMC: Multi-Modal Colorization of Images using Textual Descriptions [22.666387184216678]
We propose a deep network that takes two inputs (the grayscale image and the respective encoded text description) and predicts the relevant color components.
We also predict each object in the image and colorize it with its individual description to incorporate its specific attributes into the colorization process.
The proposed method outperforms existing colorization techniques on the LPIPS, PSNR, and SSIM metrics.
arXiv Detail & Related papers (2023-04-24T10:53:13Z) - Improved Diffusion-based Image Colorization via Piggybacked Models [19.807766482434563]
We introduce a colorization model that piggybacks on an existing powerful text-to-image (T2I) diffusion model.
A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model.
A lightness-aware VQVAE then generates the colorized result with pixel-perfect alignment to the given grayscale image.
arXiv Detail & Related papers (2023-04-21T16:23:24Z) - Semantic-Sparse Colorization Network for Deep Exemplar-based
Colorization [23.301799487207035]
Exemplar-based colorization approaches rely on a reference image to provide plausible colors for the target grayscale image.
We propose the Semantic-Sparse Colorization Network (SSCN) to transfer both the global image style and semantic-related colors to the grayscale image.
Our network can perfectly balance the global and local colors while alleviating the ambiguous matching problem.
arXiv Detail & Related papers (2021-12-02T15:35:10Z) - Image Colorization: A Survey and Dataset [94.59768013860668]
This article presents a comprehensive survey of state-of-the-art deep learning-based image colorization techniques.
It categorizes the existing colorization techniques into seven classes and discusses important factors governing their performance.
We perform an extensive experimental evaluation of existing image colorization methods using both existing datasets and our proposed one.
arXiv Detail & Related papers (2020-08-25T01:22:52Z) - Semantic-driven Colorization [78.88814849391352]
Recent colorization works implicitly predict semantic information while learning to colorize black-and-white images.
In this study, we simulate this human-like behavior by letting our network first learn to understand the photo and then colorize it.
arXiv Detail & Related papers (2020-06-13T08:13:30Z) - Learning to Structure an Image with Few Colors [59.34619548026885]
We propose a color quantization network, ColorCNN, which learns to structure images from the classification loss in an end-to-end manner (a rough sketch follows this list).
With only a 1-bit color space (i.e., two colors), the proposed network achieves 82.1% top-1 accuracy on the CIFAR10 dataset.
For applications, when encoded with PNG, the proposed color quantization shows superiority over other image compression methods in the extremely low bit-rate regime.
arXiv Detail & Related papers (2020-03-17T17:56:15Z)