Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior
- URL: http://arxiv.org/abs/2404.16678v1
- Date: Thu, 25 Apr 2024 15:28:22 GMT
- Title: Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior
- Authors: Han Wang, Xinning Chai, Yiwen Wang, Yuhong Zhang, Rong Xie, Li Song
- Abstract summary: We leverage the extraordinary generative ability of the diffusion prior to synthesize color with plausible semantics.
We adopt multimodal high-level semantic priors to help the model understand the image content and deliver saturated colors.
A luminance-aware decoder is designed to restore details and enhance overall visual quality.
- Score: 15.188673173327658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Colorizing grayscale images offers an engaging visual experience. Existing automatic colorization methods often fail to generate satisfactory results due to incorrect semantic colors and unsaturated colors. In this work, we propose an automatic colorization pipeline to overcome these challenges. We leverage the extraordinary generative ability of the diffusion prior to synthesize color with plausible semantics. To overcome the artifacts introduced by the diffusion prior, we apply luminance-conditional guidance. Moreover, we adopt multimodal high-level semantic priors to help the model understand the image content and deliver saturated colors. In addition, a luminance-aware decoder is designed to restore details and enhance overall visual quality. The proposed pipeline synthesizes saturated colors while maintaining plausible semantics. Experiments indicate that our proposed method considers both diversity and fidelity, surpassing previous methods in terms of perceptual realism and gaining the most human preference.
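The abstract describes conditioning on luminance so that the colorized output stays faithful to the grayscale input. The paper's guidance and decoder are learned components whose details are not given here; the sketch below only illustrates the underlying constraint in a simple, non-learned form: take the chroma from a generated color image and the luma from the grayscale input. The function names (`rgb_to_ycbcr`, `ycbcr_to_rgb`, `merge_luminance`) and the choice of BT.601 YCbCr are assumptions made for illustration, not the authors' method.

```python
import numpy as np

# BT.601 luma weights. This color space is an assumption for the sketch;
# the abstract does not say which luminance representation the paper uses.
_LUMA = np.array([0.299, 0.587, 0.114])


def rgb_to_ycbcr(rgb):
    """Split float RGB in [0, 1], shape (H, W, 3), into Y, Cb, Cr planes."""
    y = rgb @ _LUMA
    cb = (rgb[..., 2] - y) / 1.772
    cr = (rgb[..., 0] - y) / 1.402
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr; returns an (H, W, 3) RGB array (not clipped)."""
    r = y + 1.402 * cr
    b = y + 1.772 * cb
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return np.stack([r, g, b], axis=-1)


def merge_luminance(gray, generated_rgb):
    """Keep the chroma of generated_rgb but replace its luma with gray.

    gray: (H, W) float array in [0, 1], the original grayscale input.
    generated_rgb: (H, W, 3) float array in [0, 1], e.g. a generative model's output.
    Values that fall outside the RGB gamut are clipped to [0, 1].
    """
    _, cb, cr = rgb_to_ycbcr(generated_rgb)
    return np.clip(ycbcr_to_rgb(gray, cb, cr), 0.0, 1.0)
```

Wherever no clipping occurs, the luma of the merged image equals the grayscale input exactly, so spatial detail comes from the input and only color comes from the generated image.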
Related papers
- Automatic Controllable Colorization via Imagination [55.489416987587305]
We propose a framework for automatic colorization that allows for iterative editing and modifications.
By understanding the content within a grayscale image, we utilize a pre-trained image generation model to generate multiple images that contain the same content.
These images serve as references for coloring, mimicking the process of human experts.
arXiv Detail & Related papers (2024-04-08T16:46:07Z)
- Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z)
- Diffusing Colors: Image Colorization with Text Guided Diffusion [11.727899027933466]
We present a novel image colorization framework that utilizes image diffusion techniques with granular text prompts.
Our method provides a balance between automation and control, outperforming existing techniques in terms of visual quality and semantic coherence.
Our approach holds potential particularly for color enhancement and historical image colorization.
arXiv Detail & Related papers (2023-12-07T08:59:20Z)
- Improved Diffusion-based Image Colorization via Piggybacked Models [19.807766482434563]
We introduce a colorization model piggybacking on the existing powerful T2I diffusion model.
A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model.
A lightness-aware VQVAE will then generate the colorized result with pixel-perfect alignment to the given grayscale image.
arXiv Detail & Related papers (2023-04-21T16:23:24Z)
- DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders [19.560271615736212]
DDColor is an end-to-end method with dual decoders for image colorization.
Our approach includes a pixel decoder and a query-based color decoder.
Our two decoders work together to establish correlations between color and multi-scale semantic representations.
arXiv Detail & Related papers (2022-12-22T11:17:57Z)
- PalGAN: Image Colorization with Palette Generative Adversarial Networks [51.59276436217957]
We propose a new GAN-based colorization approach PalGAN, integrated with palette estimation and chromatic attention.
PalGAN outperforms state-of-the-art methods in quantitative evaluation and visual comparison, delivering notably diverse, contrastive, and edge-preserving appearances.
arXiv Detail & Related papers (2022-10-20T12:28:31Z)
- Towards Vivid and Diverse Image Colorization with Generative Color Prior [17.087464490162073]
Recent deep-learning-based methods could automatically colorize images at a low cost.
We aim to recover vivid colors by leveraging the rich and diverse color priors encapsulated in a pretrained Generative Adversarial Network (GAN).
Thanks to the powerful generative color prior and delicate designs, our method could produce vivid colors with a single forward pass.
arXiv Detail & Related papers (2021-08-19T17:49:21Z)
- Guided Colorization Using Mono-Color Image Pairs [6.729108277517129]
Monochrome images usually have a better signal-to-noise ratio (SNR) and richer textures due to their higher quantum efficiency.
We propose a mono-color image enhancement algorithm that colorizes the monochrome image with the color one.
Experimental results show that our algorithm can efficiently restore color images with higher SNR and richer details from the mono-color image pairs.
arXiv Detail & Related papers (2021-08-17T07:00:28Z)
- Underwater Image Enhancement via Medium Transmission-Guided Multi-Color Space Embedding [88.46682991985907]
We present an underwater image enhancement network via medium transmission-guided multi-color space embedding, called Ucolor.
Our network can effectively improve the visual quality of underwater images by exploiting multi-color space embedding.
arXiv Detail & Related papers (2021-04-27T07:35:30Z)
- Degrade is Upgrade: Learning Degradation for Low-light Image Enhancement [52.49231695707198]
We investigate the intrinsic degradation and relight the low-light image while refining the details and color in two steps.
Inspired by the color image formulation, we first estimate the degradation from low-light inputs to simulate the distortion of environment illumination color, and then refine the content to recover the loss of diffuse illumination color.
Our proposed method surpasses the SOTA by 0.95 dB in PSNR on the LOL1000 dataset and by 3.18% in mAP on the ExDark dataset.
arXiv Detail & Related papers (2021-03-19T04:00:27Z)
- Semantic-driven Colorization [78.88814849391352]
Recent colorization works implicitly predict the semantic information while learning to colorize black-and-white images.
In this study, we simulate that human-like action to let our network first learn to understand the photo, then colorize it.
arXiv Detail & Related papers (2020-06-13T08:13:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.