SVCNet: Scribble-based Video Colorization Network with Temporal
Aggregation
- URL: http://arxiv.org/abs/2303.11591v2
- Date: Fri, 4 Aug 2023 14:15:39 GMT
- Title: SVCNet: Scribble-based Video Colorization Network with Temporal
Aggregation
- Authors: Yuzhi Zhao, Lai-Man Po, Kangcheng Liu, Xuehui Wang, Wing-Yin Yu,
Pengfei Xian, Yujia Zhang, Mengyang Liu
- Abstract summary: SVCNet can colorize monochrome videos based on different user-given color scribbles.
It addresses three common issues in the scribble-based video colorization area: colorization vividness, temporal consistency, and color bleeding.
The experimental results demonstrate that SVCNet produces both higher-quality and more temporally consistent videos.
- Score: 19.566913227894997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a scribble-based video colorization network with
temporal aggregation called SVCNet. It can colorize monochrome videos based on
different user-given color scribbles. It addresses three common issues in the
scribble-based video colorization area: colorization vividness, temporal
consistency, and color bleeding. To improve the colorization quality and
strengthen the temporal consistency, we adopt two sequential sub-networks in
SVCNet for precise colorization and temporal smoothing, respectively. The first
stage includes a pyramid feature encoder to incorporate color scribbles with a
grayscale frame, and a semantic feature encoder to extract semantics. The
second stage finetunes the output from the first stage by aggregating the
information of neighboring colorized frames (as short-range connections) and
the first colorized frame (as a long-range connection). To alleviate the color
bleeding artifacts, we learn video colorization and segmentation
simultaneously. Furthermore, we perform the majority of operations at a fixed small
image resolution and use a Super-resolution Module at the tail of SVCNet to
recover the original size. This allows SVCNet to handle different image resolutions
at inference. Finally, we evaluate the proposed SVCNet on DAVIS and Videvo
benchmarks. The experimental results demonstrate that SVCNet produces both
higher-quality and more temporally consistent videos than other well-known
video colorization approaches. The code and models are available at
https://github.com/zhaoyuzhi/SVCNet.
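To make the two-stage pipeline described in the abstract concrete, below is a minimal PyTorch sketch of that structure: a first sub-network that fuses scribbles with the grayscale frame and jointly predicts color and segmentation, and a second sub-network that aggregates short-range (neighboring frames) and long-range (first frame) information before a super-resolution tail. The module names, channel widths, segmentation class count, 2x super-resolution factor, and 128x128 working resolution are illustrative assumptions, not the authors' released implementation (see the GitHub link above for that).

```python
# Hypothetical sketch of an SVCNet-style pipeline; layer sizes are assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, stride=1):
    """3x3 conv + ReLU, a stand-in for the paper's encoder/decoder blocks."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, stride, 1), nn.ReLU(inplace=True))


class Stage1Colorization(nn.Module):
    """First sub-network: fuses a grayscale frame with user scribbles and semantics,
    predicting per-frame color (ab channels) plus an auxiliary segmentation map
    (colorization and segmentation are learned jointly to curb color bleeding)."""

    def __init__(self, num_classes=21):
        super().__init__()
        # Pyramid feature encoder: grayscale frame (1 ch) + scribble map (2 ch, ab).
        self.pyramid_encoder = nn.Sequential(conv_block(3, 32), conv_block(32, 64, 2))
        # Semantic feature encoder: extracts semantics from the grayscale frame.
        self.semantic_encoder = nn.Sequential(conv_block(1, 32), conv_block(32, 64, 2))
        self.decoder = nn.Sequential(
            conv_block(128, 64),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False))
        self.color_head = nn.Conv2d(64, 2, 3, padding=1)          # ab color channels
        self.seg_head = nn.Conv2d(64, num_classes, 3, padding=1)  # auxiliary segmentation

    def forward(self, gray, scribble):
        feat = torch.cat([self.pyramid_encoder(torch.cat([gray, scribble], dim=1)),
                          self.semantic_encoder(gray)], dim=1)
        feat = self.decoder(feat)
        return self.color_head(feat), self.seg_head(feat)


class Stage2TemporalAggregation(nn.Module):
    """Second sub-network: refines the current frame by aggregating neighboring
    colorized frames (short-range) and the first colorized frame (long-range),
    then restores the original resolution with a super-resolution tail."""

    def __init__(self, sr_factor=2):
        super().__init__()
        # Current ab + neighbor ab + first-frame ab = 6 input channels.
        self.fusion = nn.Sequential(conv_block(6, 64), conv_block(64, 64))
        self.refine = nn.Conv2d(64, 2, 3, padding=1)
        # Super-resolution module at the tail recovers the original frame size.
        self.sr = nn.Sequential(nn.Conv2d(2, 2 * sr_factor ** 2, 3, padding=1),
                                nn.PixelShuffle(sr_factor))

    def forward(self, cur_ab, neighbor_ab, first_ab):
        fused = self.fusion(torch.cat([cur_ab, neighbor_ab, first_ab], dim=1))
        return self.sr(cur_ab + self.refine(fused))  # residual refinement, then upsample


if __name__ == "__main__":
    # Toy run on random data at the assumed fixed small working resolution.
    stage1, stage2 = Stage1Colorization(), Stage2TemporalAggregation()
    gray = torch.rand(1, 1, 128, 128)        # monochrome frame
    scribble = torch.zeros(1, 2, 128, 128)   # sparse user color scribbles (ab)
    ab, seg = stage1(gray, scribble)
    out = stage2(ab, ab.detach(), ab.detach())  # neighbors/first frame reused for brevity
    print(ab.shape, seg.shape, out.shape)    # (1,2,128,128) (1,21,128,128) (1,2,256,256)
```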
Related papers
- L-C4: Language-Based Video Colorization for Creative and Consistent Color [59.069498113050436]
We present L-C4, a language-based video colorization method for creative and consistent color.
Our model is built upon a pre-trained cross-modality generative model.
We propose temporally deformable attention to prevent flickering or color shifts, and cross-clip fusion to maintain long-term color consistency.
arXiv Detail & Related papers (2024-10-07T12:16:21Z) - Improving Video Colorization by Test-Time Tuning [79.67548221384202]
We propose an effective method, which aims to enhance video colorization through test-time tuning.
By exploiting the reference to construct additional training samples during testing, our approach achieves a performance boost of 13 dB in PSNR on average.
arXiv Detail & Related papers (2023-06-25T05:36:40Z) - FlowChroma -- A Deep Recurrent Neural Network for Video Colorization [1.0499611180329804]
We develop an automated video colorization framework that minimizes the flickering of colors across frames.
We show that recurrent neural networks can be successfully used to improve color consistency in video colorization.
arXiv Detail & Related papers (2023-05-23T05:41:53Z) - Temporal Consistent Automatic Video Colorization via Semantic
Correspondence [12.107878178519128]
We propose a novel video colorization framework that incorporates semantic correspondence into automatic video colorization.
In the NTIRE 2023 Video Colorization Challenge, our method ranks 3rd in the Color Distribution Consistency (CDC) Optimization track.
arXiv Detail & Related papers (2023-05-13T12:06:09Z) - BiSTNet: Semantic Image Prior Guided Bidirectional Temporal Feature
Fusion for Deep Exemplar-based Video Colorization [70.14893481468525]
We present an effective BiSTNet to explore colors of reference exemplars and utilize them to help video colorization.
We first establish the semantic correspondence between each frame and the reference exemplars in deep feature space to explore color information from reference exemplars.
We develop a mixed expert block to extract semantic information for modeling the object boundaries of frames so that the semantic image prior can better guide the colorization process.
arXiv Detail & Related papers (2022-12-05T13:47:15Z) - Temporally Consistent Video Colorization with Deep Feature Propagation
and Self-regularization Learning [90.38674162878496]
We propose a novel temporally consistent video colorization framework (TCVC).
TCVC effectively propagates frame-level deep features in a bidirectional way to enhance the temporal consistency of colorization.
Experiments demonstrate that our method can not only obtain visually pleasing colorized video, but also achieve clearly better temporal consistency than state-of-the-art methods.
arXiv Detail & Related papers (2021-10-09T13:00:14Z) - End-to-End Dense Video Captioning with Parallel Decoding [53.34238344647624]
We propose a simple yet effective framework for end-to-end dense video captioning with parallel decoding (PDVC).
PDVC precisely segments the video into a number of event pieces based on a holistic understanding of the video content.
Experiments on ActivityNet Captions and YouCook2 show that PDVC produces high-quality captioning results.
arXiv Detail & Related papers (2021-08-17T17:39:15Z) - Video Abnormal Event Detection by Learning to Complete Visual Cloze
Tests [50.1446994599891]
Video abnormal event detection (VAD) is a vital semi-supervised task that requires learning with only roughly labeled normal videos.
We propose a novel approach named visual cloze completion (VCC), which performs VAD by learning to complete "visual cloze tests" (VCTs).
We show that VCC achieves state-of-the-art VAD performance.
arXiv Detail & Related papers (2021-08-05T04:05:36Z) - VCGAN: Video Colorization with Hybrid Generative Adversarial Network [22.45196398040388]
Video Colorization with Hybrid Generative Adversarial Network (VCGAN) is an improved approach to colorization using end-to-end learning.
Experimental results demonstrate that VCGAN produces higher-quality and temporally more consistent colorful videos than existing approaches.
arXiv Detail & Related papers (2021-04-26T05:50:53Z)