Hierarchical B-frame Video Coding for Long Group of Pictures
- URL: http://arxiv.org/abs/2406.16544v1
- Date: Mon, 24 Jun 2024 11:29:52 GMT
- Title: Hierarchical B-frame Video Coding for Long Group of Pictures
- Authors: Ivan Kirillov, Denis Parkhomenko, Kirill Chernyshev, Alexander Pletnev, Yibo Shi, Kai Lin, Dmitry Babin,
- Abstract summary: We present an end-to-end learned video for random access that combines training on long sequences of frames, rate allocation and content adaptation on inference.
Under common test conditions, it achieves results comparable to VTM in terms of YUV-PSNR BD-Rate on some classes of videos.
On average it surpasses open LD and RA end-to-end solutions in terms of VMAF and YUV BD-Rates.
- Score: 42.229439873835254
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Learned video compression methods already outperform VVC in the low-delay (LD) case, but the random-access (RA) scenario remains challenging. Most works on learned RA video compression either use HEVC as an anchor or compare it to VVC in specific test conditions, using RGB-PSNR metric instead of Y-PSNR and avoiding comprehensive evaluation. Here, we present an end-to-end learned video codec for random access that combines training on long sequences of frames, rate allocation designed for hierarchical coding and content adaptation on inference. We show that under common test conditions (JVET-CTC), it achieves results comparable to VTM (VVC reference software) in terms of YUV-PSNR BD-Rate on some classes of videos, and outperforms it on almost all test sets in terms of VMAF BD-Rate. On average it surpasses open LD and RA end-to-end solutions in terms of VMAF and YUV BD-Rates.
Related papers
- On the Computation of BD-Rate over a Set of Videos for Fair Assessment of Performance of Learned Video Codecs [7.714092783675679]
The Bjontegaard Delta (BD) measure is widely employed to evaluate and quantify the variations in the rate-distortion(RD) performance across different codecs.
We claim that the current practice in the learned video compression community of computing the average BD value over a dataset based on the average RD curve of multiple videos can lead to misleading conclusions.
arXiv Detail & Related papers (2024-09-13T12:30:15Z) - NVRC: Neural Video Representation Compression [13.131842990481038]
We propose a novel INR-based video compression framework, Neural Video Representation Compression (NVRC)
NVRC, for the first time, is able to optimize an INR-based video in a fully end-to-end manner.
Our experiments show that NVRC outperforms many conventional and learning-based benchmark entropy.
arXiv Detail & Related papers (2024-09-11T16:57:12Z) - Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration [11.016119119250765]
This paper conducts a comparative study of state-of-the-art conventional and learned video coding methods based on a low delay configuration.
To allow a fair and meaningful comparison, the evaluation was performed on test sequences defined in the AOM and MPEG common test conditions in the YCbCr 4:2:0 color space.
The evaluation results show that the JVET ECM codecs offer the best overall coding performance among all codecs tested.
arXiv Detail & Related papers (2024-08-09T12:55:23Z) - HiNeRV: Video Compression with Hierarchical Encoding-based Neural
Representation [14.088444622391501]
Implicit Representations (INRs) have previously been used to represent and compress image and video content.
Existing INR-based methods have failed to deliver rate quality performance comparable with the state of the art in video compression.
We propose HiNeRV, an INR that combines light weight layers with hierarchical positional encodings.
arXiv Detail & Related papers (2023-06-16T12:59:52Z) - CANF-VC: Conditional Augmented Normalizing Flows for Video Compression [81.41594331948843]
CANF-VC is an end-to-end learning-based video compression system.
It is based on conditional augmented normalizing flows (ANF)
arXiv Detail & Related papers (2022-07-12T04:53:24Z) - Efficient VVC Intra Prediction Based on Deep Feature Fusion and
Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on standard database demonstrate the superiority of proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z) - End-to-End Rate-Distortion Optimized Learned Hierarchical Bi-Directional
Video Compression [10.885590093103344]
Learned VC allows end-to-end rate-distortion (R-D) optimized training of nonlinear transform, motion and entropy model simultaneously.
This paper proposes a learned hierarchical bi-directional video (LHBDC) that combines the benefits of hierarchical motion-sampling and end-to-end optimization.
arXiv Detail & Related papers (2021-12-17T14:30:22Z) - Perceptual Learned Video Compression with Recurrent Conditional GAN [158.0726042755]
We propose a Perceptual Learned Video Compression (PLVC) approach with recurrent conditional generative adversarial network.
PLVC learns to compress video towards good perceptual quality at low bit-rate.
The user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches.
arXiv Detail & Related papers (2021-09-07T13:36:57Z) - DynaVSR: Dynamic Adaptive Blind Video Super-Resolution [60.154204107453914]
DynaVSR is a novel meta-learning-based framework for real-world video SR.
We train a multi-frame downscaling module with various types of synthetic blur kernels, which is seamlessly combined with a video SR network for input-aware adaptation.
Experimental results show that DynaVSR consistently improves the performance of the state-of-the-art video SR models by a large margin.
arXiv Detail & Related papers (2020-11-09T15:07:32Z) - Learning for Video Compression with Recurrent Auto-Encoder and Recurrent
Probability Model [164.7489982837475]
This paper proposes a Recurrent Learned Video Compression (RLVC) approach with the Recurrent Auto-Encoder (RAE) and Recurrent Probability Model ( RPM)
The RAE employs recurrent cells in both the encoder and decoder to exploit the temporal correlation among video frames.
Our approach achieves the state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM.
arXiv Detail & Related papers (2020-06-24T08:46:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.