Temporally Consistent Video Colorization with Deep Feature Propagation
and Self-regularization Learning
- URL: http://arxiv.org/abs/2110.04562v1
- Date: Sat, 9 Oct 2021 13:00:14 GMT
- Title: Temporally Consistent Video Colorization with Deep Feature Propagation
and Self-regularization Learning
- Authors: Yihao Liu and Hengyuan Zhao and Kelvin C.K. Chan and Xintao Wang and
Chen Change Loy and Yu Qiao and Chao Dong
- Abstract summary: We propose a novel temporally consistent video colorization framework (TCVC).
TCVC effectively propagates frame-level deep features in a bidirectional way to enhance the temporal consistency of colorization.
Experiments demonstrate that our method can not only obtain visually pleasing colorized video, but also achieve clearly better temporal consistency than state-of-the-art methods.
- Score: 90.38674162878496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video colorization is a challenging and highly ill-posed problem. Although
recent years have witnessed remarkable progress in single image colorization,
there is relatively less research effort on video colorization and existing
methods always suffer from severe flickering artifacts (temporal inconsistency)
or unsatisfying colorization performance. We address this problem from a new
perspective, by jointly considering colorization and temporal consistency in a
unified framework. Specifically, we propose a novel temporally consistent video
colorization framework (TCVC). TCVC effectively propagates frame-level deep
features in a bidirectional way to enhance the temporal consistency of
colorization. Furthermore, TCVC introduces a self-regularization learning (SRL)
scheme to minimize the prediction difference obtained with different time
steps. SRL does not require any ground-truth color videos for training and can
further improve temporal consistency. Experiments demonstrate that our method
can not only obtain visually pleasing colorized video, but also achieve clearly
better temporal consistency than state-of-the-art methods.
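For intuition, here is a minimal PyTorch-style sketch of the two ideas above: bidirectional propagation of frame-level deep features, and an SRL-style term that compares predictions for the same frame obtained from different propagation passes. The `fuse` callable, the simple averaging of the two directions, and the L1 form of the loss are illustrative assumptions, not the paper's actual modules or objective.

```python
import torch.nn.functional as F

def self_regularization_loss(pred_a, pred_b):
    # SRL-style term (illustrative): penalize disagreement between two
    # colorizations of the same frame produced at different time steps /
    # propagation directions; no ground-truth color video is required.
    return F.l1_loss(pred_a, pred_b)

def bidirectional_propagate(feats, fuse):
    # feats: list of per-frame deep feature tensors, e.g. shape [C, H, W].
    # fuse:  any callable that merges the current frame's feature with the
    #        propagated state (a placeholder for the paper's fusion module).
    forward, state = [], None
    for f in feats:                      # past -> future pass
        state = f if state is None else fuse(f, state)
        forward.append(state)
    backward, state = [], None
    for f in reversed(feats):            # future -> past pass
        state = f if state is None else fuse(f, state)
        backward.append(state)
    backward.reverse()
    # Average the two directions (an assumption; any fusion would do here).
    return [0.5 * (fw + bw) for fw, bw in zip(forward, backward)]
```

A decoder would then map each fused feature back to chrominance channels, and the SRL term can be applied between such decoded predictions of the same frame.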
Related papers
- L-C4: Language-Based Video Colorization for Creative and Consistent Color [59.069498113050436]
We present Language-based video colorization for Creative and Consistent Colors (L-C4).
Our model is built upon a pre-trained cross-modality generative model.
We propose temporally deformable attention to prevent flickering or color shifts, and cross-clip fusion to maintain long-term color consistency.
arXiv Detail & Related papers (2024-10-07T12:16:21Z)
- LatentColorization: Latent Diffusion-Based Speaker Video Colorization [1.2641141743223379]
We introduce a novel solution for achieving temporal consistency in video colorization.
We demonstrate strong improvements on established image quality metrics compared to other existing methods.
Our dataset encompasses a combination of conventional datasets and videos from television/movies.
arXiv Detail & Related papers (2024-05-09T12:06:06Z)
- Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z)
- Histogram-guided Video Colorization Structure with Spatial-Temporal Connection [10.059070138875038]
We present a Histogram-guided Video Colorization with Spatial-Temporal connection structure (named ST-HVC).
To fully exploit the chroma and motion information, a joint flow and histogram module is tailored to integrate the histogram and flow features.
We show that the developed method achieves excellent performance both quantitatively and qualitatively on two video datasets.
arXiv Detail & Related papers (2023-08-09T11:59:18Z)
- Improving Video Colorization by Test-Time Tuning [79.67548221384202]
We propose an effective method, which aims to enhance video colorization through test-time tuning.
By exploiting the reference to construct additional training samples during testing, our approach achieves a performance boost of 13 dB in PSNR on average.
arXiv Detail & Related papers (2023-06-25T05:36:40Z)
- Video Colorization with Pre-trained Text-to-Image Diffusion Models [19.807766482434563]
We present ColorDiffuser, an adaptation of a pre-trained text-to-image latent diffusion model for video colorization.
We propose two novel techniques to enhance the temporal coherence and maintain the vividness of colorization across frames.
arXiv Detail & Related papers (2023-06-02T17:58:00Z)
- Temporal Consistent Automatic Video Colorization via Semantic Correspondence [12.107878178519128]
We propose a novel video colorization framework, which combines semantic correspondence into automatic video colorization.
In the NTIRE 2023 Video Colorization Challenge, our method ranks 3rd in the Color Distribution Consistency (CDC) Optimization track.
arXiv Detail & Related papers (2023-05-13T12:06:09Z)
- BiSTNet: Semantic Image Prior Guided Bidirectional Temporal Feature Fusion for Deep Exemplar-based Video Colorization [70.14893481468525]
We present an effective BiSTNet to explore colors of reference exemplars and utilize them to help video colorization.
We first establish the semantic correspondence between each frame and the reference exemplars in deep feature space to explore color information from reference exemplars.
We develop a mixed expert block to extract semantic information for modeling the object boundaries of frames so that the semantic image prior can better guide the colorization process.
arXiv Detail & Related papers (2022-12-05T13:47:15Z)
- Blind Video Temporal Consistency via Deep Video Prior [61.062900556483164]
We present a novel and general approach for blind video temporal consistency.
Our method is trained directly on a single pair of original and processed videos.
We show that temporal consistency can be achieved by training a convolutional network on a video with the Deep Video Prior (a minimal sketch follows this list).
arXiv Detail & Related papers (2020-10-22T16:19:20Z)
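For readers unfamiliar with the Deep Video Prior idea referenced above, the sketch below shows its general recipe: fit a convolutional network to map the original video to its per-frame processed (e.g. colorized) result, relying on the network's implicit regularization to suppress flicker. The tiny network, the L1 loss, and the optimizer settings are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

# Tiny stand-in network; the original work uses a larger fully convolutional model.
net = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

def fit_deep_video_prior(original_frames, processed_frames, steps=5000):
    # original_frames / processed_frames: lists of [1, 3, H, W] tensors for ONE video.
    # Training on this single pair is the whole procedure; no external dataset is used.
    for step in range(steps):
        i = step % len(original_frames)
        pred = net(original_frames[i])
        loss = torch.mean(torch.abs(pred - processed_frames[i]))  # L1 to the flickery target
        opt.zero_grad()
        loss.backward()
        opt.step()
    # The fitted network's outputs are typically more temporally consistent
    # than the per-frame processed targets it was trained to imitate.
    return [net(f).detach() for f in original_frames]
```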