Learned Video Compression for YUV 4:2:0 Content Using Flow-based
Conditional Inter-frame Coding
- URL: http://arxiv.org/abs/2210.08225v1
- Date: Sat, 15 Oct 2022 08:36:01 GMT
- Authors: Yung-Han Ho, Chih-Hsuan Lin, Peng-Yu Chen, Mu-Jung Chen, Chih-Peng
Chang, Wen-Hsiao Peng, Hsueh-Ming Hang
- Abstract summary: This paper proposes a learning-based video compression framework for variable-rate coding on YUV 4:2:0 content.
We introduce a conditional flow-based inter-frame coder to improve inter-frame coding efficiency.
Experimental results show that our model performs better than x265 on UVG and MCL-JCV datasets.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes a learning-based video compression framework for
variable-rate coding on YUV 4:2:0 content. Most existing learning-based video
compression models adopt the traditional hybrid-based coding architecture,
which involves temporal prediction followed by residual coding. However, recent
studies have shown that residual coding is sub-optimal from the
information-theoretic perspective. In addition, most existing models are
optimized with respect to RGB content. Furthermore, they require separate
models for variable-rate coding. To address these issues, this work presents an
attempt to incorporate conditional inter-frame coding for YUV 4:2:0
content. We introduce a conditional flow-based inter-frame coder to improve the
inter-frame coding efficiency. To adapt our codec to YUV 4:2:0 content, we
adopt a simple strategy of using space-to-depth and depth-to-space conversions.
Lastly, we employ a rate-adaption net to achieve variable-rate coding without
training multiple models. Experimental results show that our model performs
better than x265 on UVG and MCL-JCV datasets in terms of PSNR-YUV. However, on
the more challenging datasets from ISCAS'22 GC, there is still ample room for
improvement. This shortfall is due to limited inter-frame coding capability
at large GOP sizes and can be mitigated by increasing the model capacity and
applying an error-propagation-aware training strategy.
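The space-to-depth strategy mentioned above is a standard way to feed YUV 4:2:0 data to a codec backbone designed for same-resolution channels: the full-resolution Y plane is rearranged into four half-resolution sub-planes and concatenated with the half-resolution U and V planes. A minimal numpy sketch (function names are illustrative, not from the paper's code):

```python
import numpy as np

def space_to_depth(y, block=2):
    """Rearrange an (H, W) plane into (H/block, W/block, block*block)."""
    h, w = y.shape
    return (y.reshape(h // block, block, w // block, block)
             .transpose(0, 2, 1, 3)
             .reshape(h // block, w // block, block * block))

def depth_to_space(z, block=2):
    """Inverse of space_to_depth: (H/b, W/b, b*b) -> (H, W)."""
    h, w, _ = z.shape
    return (z.reshape(h, w, block, block)
             .transpose(0, 2, 1, 3)
             .reshape(h * block, w * block))

def pack_yuv420(y, u, v):
    """Pack Y (H, W) and U, V (H/2, W/2) into one (H/2, W/2, 6) tensor."""
    y4 = space_to_depth(y)  # four half-resolution Y sub-planes
    return np.concatenate([y4, u[..., None], v[..., None]], axis=-1)
```

The resulting 6-channel tensor aligns all planes at the chroma resolution, so an RGB-style network can consume it unchanged; at the decoder, `depth_to_space` on the first four channels restores the full-resolution luma losslessly.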
Related papers
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [112.44822009714461]
Cross-Modality Video Coding (CMVC) is a pioneering approach that explores multimodal representations and video generative models for video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z) - Hierarchical Patch Diffusion Models for High-Resolution Video Generation [50.42746357450949]
We develop deep context fusion, which propagates context information from low-scale to high-scale patches in a hierarchical manner.
We also propose adaptive computation, which allocates more network capacity and computation towards coarse image details.
The resulting model sets a new state-of-the-art FVD score of 66.32 and Inception Score of 87.68 in class-conditional video generation.
arXiv Detail & Related papers (2024-06-12T01:12:53Z) - Hierarchical B-frame Video Coding Using Two-Layer CANF without Motion Coding [17.998825368770635]
We propose a novel B-frame coding architecture based on two-layer Conditional Augmented Normalizing Flows (CANF).
Our proposed idea of video compression without motion coding offers a new direction for learned video coding.
The rate-distortion performance of our scheme is slightly lower than that of the state-of-the-art learned B-frame coding scheme, B-CANF, but outperforms other learned B-frame coding schemes.
arXiv Detail & Related papers (2023-04-05T18:36:28Z) - Learned Hierarchical B-frame Coding with Adaptive Feature Modulation for YUV 4:2:0 Content [13.289507865388863]
This paper introduces a learned hierarchical B-frame coding scheme in response to the Grand Challenge on Neural Network-based Video Coding at ISCAS 2023.
We address three issues specifically: (1) B-frame coding, (2) YUV 4:2:0 coding, and (3) content-adaptive variable-rate coding with a single model.
arXiv Detail & Related papers (2022-12-29T06:22:52Z) - CANF-VC: Conditional Augmented Normalizing Flows for Video Compression [81.41594331948843]
CANF-VC is an end-to-end learning-based video compression system.
It is based on conditional augmented normalizing flows (ANF).
arXiv Detail & Related papers (2022-07-12T04:53:24Z) - A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs).
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z) - A Combined Deep Learning based End-to-End Video Coding Architecture for YUV Color Space [14.685161934404123]
Most of the existing deep learning based end-to-end video coding (DLEC) architectures are designed specifically for RGB color format.
This paper introduces a new DLEC architecture for video coding to effectively support YUV 4:2:0 and compares its performance against the HEVC standard.
arXiv Detail & Related papers (2021-04-01T23:41:06Z) - Transform Network Architectures for Deep Learning based End-to-End Image/Video Coding in Subsampled Color Spaces [16.83399026040147]
This paper investigates various DLEC designs to support YUV 4:2:0 format.
A new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data.
arXiv Detail & Related papers (2021-02-27T06:47:27Z) - Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
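The core idea, coding a frame's latents under a probability model conditioned on the previous frame, can be illustrated with a small numpy sketch. This is a toy illustration, not the paper's model: it estimates the bits needed for quantized latents under a discretized Gaussian whose mean and scale are assumed to come from a context predictor.

```python
import math
import numpy as np

def gaussian_cdf(x):
    """Standard normal CDF, vectorized via math.erf."""
    erf = np.vectorize(math.erf)
    return 0.5 * (1.0 + erf(x / math.sqrt(2.0)))

def conditional_bits(latent, mean, scale):
    """Estimated bits to entropy-code round(latent) under a discretized
    Gaussian N(mean, scale^2); mean/scale stand in for the output of a
    context model conditioned on the previous frame."""
    q = np.round(latent)
    upper = gaussian_cdf((q + 0.5 - mean) / scale)
    lower = gaussian_cdf((q - 0.5 - mean) / scale)
    p = np.clip(upper - lower, 1e-9, 1.0)
    return float(-np.log2(p).sum())
```

When the condition predicts the current latents well (mean near the latent, small scale), the estimated rate is far lower than under an uninformative prior, which is exactly the saving conditional entropy coding exploits.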
arXiv Detail & Related papers (2020-08-20T20:01:59Z) - Variable Rate Video Compression using a Hybrid Recurrent Convolutional Learning Framework [1.9290392443571382]
This paper presents PredEncoder, a hybrid video compression framework based on the concept of predictive auto-encoding.
A variable-rate block encoding scheme is proposed that yields remarkably high quality-to-bitrate ratios.
arXiv Detail & Related papers (2020-04-08T20:49:25Z) - Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
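The joint multi-frame training idea can be sketched with a toy P-frame codec (purely illustrative, not the paper's system): each frame's reconstruction, not the original, is fed forward as the next reference, so the training loss sees exactly the error accumulation that occurs at test time across a GOP.

```python
import numpy as np

def toy_codec(frame, reference, step=0.5):
    """Toy P-frame codec: quantize the residual against `reference` with a
    uniform quantizer; coarser `step` means fewer bits but more error."""
    residual = frame - reference
    q = np.round(residual / step) * step
    rate = np.abs(q / step).sum()  # crude proxy for coded bits
    return reference + q, rate

def gop_loss(frames, lam=0.01, step=0.5):
    """Rate-distortion loss accumulated over a GOP, propagating each
    frame's reconstruction forward as the next coding reference."""
    ref, loss = frames[0], 0.0
    for frame in frames[1:]:
        recon, rate = toy_codec(frame, ref, step)
        loss += ((frame - recon) ** 2).mean() + lam * rate
        ref = recon  # error propagates through the reconstruction chain
    return loss
```

Optimizing this multi-frame objective penalizes error build-up across the GOP, which a single-frame loss never observes; this is the same motivation behind the error-propagation-aware strategy suggested for the main paper's large-GOP weakness.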