UCVC: A Unified Contextual Video Compression Framework with Joint
P-frame and B-frame Coding
- URL: http://arxiv.org/abs/2402.01289v1
- Date: Fri, 2 Feb 2024 10:25:39 GMT
- Title: UCVC: A Unified Contextual Video Compression Framework with Joint
P-frame and B-frame Coding
- Authors: Jiayu Yang, Wei Jiang, Yongqi Zhai, Chunhui Yang, Ronggang Wang
- Abstract summary: This paper presents a learned video compression method in response to the video compression track of the 6th Challenge on Learned Image Compression (CLIC).
We propose a unified contextual video compression framework (UCVC) for joint P-frame and B-frame coding.
- Score: 29.44234507064189
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents a learned video compression method in response to the video compression track of the 6th Challenge on Learned Image Compression (CLIC) at DCC 2024. Specifically, we propose a unified contextual video compression framework (UCVC) for joint P-frame and B-frame coding. Each non-intra frame refers to two neighboring decoded frames, which can be either both from the past for P-frame compression, or one from the past and one from the future for B-frame compression. In the training stage, the model parameters are jointly optimized over both P-frames and B-frames. Benefiting from these designs, the framework supports both P-frame and B-frame coding and achieves compression efficiency comparable to frameworks designed specifically for either one. For the challenge submission, we report the best compression efficiency by selecting the appropriate frame type for each test sequence. Our team name is PKUSZ-LVC.
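The abstract does not include code, so here is a minimal Python sketch of the reference-selection rule it describes: every non-intra frame is coded against two decoded references, drawn either both from the past (P-frame) or from the past and the future (B-frame), with one shared model for both. All names are hypothetical, and the actual reference distances in UCVC depend on its GOP structure.

```python
# Sketch of UCVC's unified reference selection (illustrative names only).

def select_references(decoded, t, mode):
    """Pick two decoded reference frames for the frame at index t.

    decoded: dict mapping frame index -> decoded frame
    mode:    "P" (low-delay) or "B" (random access)
    """
    if mode == "P":
        return decoded[t - 2], decoded[t - 1]   # both references from the past
    if mode == "B":
        return decoded[t - 1], decoded[t + 1]   # one past, one future reference
    raise ValueError(f"unknown frame type: {mode}")

# A single codec consumes the two references regardless of frame type,
# so P-frames and B-frames share the same model parameters:
#   recon, bits = codec(current_frame, refs=select_references(decoded, t, mode))
```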
Related papers
- Improved Video VAE for Latent Video Diffusion Model [55.818110540710215]
Video Autoencoder (VAE) aims to compress pixel data into low-dimensional latent space, playing an important role in OpenAI's Sora.
Most existing VAEs inflate a pretrained image VAE into a 3D causal structure for temporal-spatial compression.
We propose a new KTC architecture and a group causal convolution (GCConv) module to further improve video VAE (IV-VAE).
arXiv Detail & Related papers (2024-11-10T12:43:38Z)
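The summary above names a group causal convolution (GCConv) module without giving its details. Below is a minimal PyTorch sketch of the general idea, a grouped 3D convolution made temporally causal by padding only toward the past; the class name and layer sizes are assumptions, and the actual GCConv design in IV-VAE may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalGroupConv3d(nn.Module):
    """Grouped 3D convolution that is causal along the time axis:
    the output at time t never sees frames after t."""

    def __init__(self, channels, kernel=(3, 3, 3), groups=4):
        super().__init__()
        self.t_pad = kernel[0] - 1  # pad with past frames only
        self.conv = nn.Conv3d(
            channels, channels, kernel,
            padding=(0, kernel[1] // 2, kernel[2] // 2),  # same-size in space
            groups=groups,
        )

    def forward(self, x):  # x: (N, C, T, H, W)
        # F.pad order for 5D input: (W_l, W_r, H_l, H_r, T_l, T_r)
        x = F.pad(x, (0, 0, 0, 0, self.t_pad, 0))
        return self.conv(x)

y = CausalGroupConv3d(16)(torch.randn(1, 16, 8, 32, 32))
print(y.shape)  # torch.Size([1, 16, 8, 32, 32])
```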
- IBVC: Interpolation-driven B-frame Video Compression [68.18440522300536]
B-frame video compression adopts bi-directional motion estimation and motion compensation (MEMC) coding for middle-frame reconstruction.
Previous learned approaches often directly extend neural P-frame codecs to B-frame coding, relying on bi-directional optical-flow estimation.
We propose a simple yet effective structure called Interpolation-B-frame Video Compression (IBVC) to address these issues.
arXiv Detail & Related papers (2023-09-25T02:45:51Z)
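As a rough illustration of the interpolation-driven idea in the IBVC entry above: synthesize a prediction of the middle frame by interpolating the two references, then code the current frame conditioned on that prediction instead of estimating bi-directional optical flow inside the codec. The function and the stand-in modules below are hypothetical scaffolding, not IBVC's actual networks.

```python
import torch

def code_b_frame(x_t, ref_past, ref_future, interp_net, cond_codec):
    """Predict the middle frame by interpolation, then code x_t
    conditioned on the prediction."""
    pred = interp_net(ref_past, ref_future)          # frame interpolation
    recon, bits = cond_codec(x_t, condition=pred)    # conditional coding
    return recon, bits

# Trivial stand-ins so the sketch runs end to end:
interp = lambda a, b: 0.5 * (a + b)                  # naive linear blend
codec = lambda x, condition: (condition + (x - condition), 0.1 * x.numel())

x = torch.rand(1, 3, 64, 64)
recon, bits = code_b_frame(x, torch.rand_like(x), torch.rand_like(x), interp, codec)
```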
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation [93.18163456287164]
This paper proposes a novel text-guided video-to-video translation framework to adapt image models to videos.
Our framework achieves global style and local texture temporal consistency at a low cost.
arXiv Detail & Related papers (2023-06-13T17:52:23Z)
- Advancing Learned Video Compression with In-loop Frame Prediction [177.67218448278143]
In this paper, we propose an Advanced Learned Video Compression (ALVC) approach with an in-loop frame prediction module.
The predicted frame can serve as a better reference than the previously compressed frame, and therefore it benefits the compression performance.
The experiments show the state-of-the-art performance of our ALVC approach in learned video compression.
arXiv Detail & Related papers (2022-11-13T19:53:14Z)
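For the in-loop frame prediction described in the ALVC entry above, a deliberately simple stand-in: extrapolate the current frame from the two previous decoded frames and use the result, rather than the last decoded frame itself, as the coding reference. ALVC's actual prediction module is a learned network; linear-motion extrapolation here only illustrates the data flow.

```python
import torch

def inloop_prediction(x_tm2, x_tm1):
    # Linear-motion extrapolation of frame t from decoded frames t-2 and t-1,
    # standing in for ALVC's learned in-loop prediction network.
    return (2.0 * x_tm1 - x_tm2).clamp(0.0, 1.0)

ref = inloop_prediction(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
# `ref` replaces the previously decoded frame as the reference for coding frame t.
```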
- B-CANF: Adaptive B-frame Coding with Conditional Augmented Normalizing Flows [11.574465203875342]
This work introduces a novel B-frame coding framework, termed B-CANF, that exploits conditional augmented normalizing flows for B-frame coding.
B-CANF additionally features two novel elements: frame-type adaptive coding and B*-frames.
arXiv Detail & Related papers (2022-09-05T05:28:19Z)
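B-CANF's conditional augmented normalizing flows are built from invertible transforms steered by reference-frame information. The sketch below shows a generic conditional affine coupling layer, the standard building block of such flows; it is not B-CANF's architecture, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class ConditionalCoupling(nn.Module):
    """Affine coupling layer whose scale/shift depend on a condition
    (e.g. features derived from the two reference frames)."""

    def __init__(self, ch, cond_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch // 2 + cond_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, ch, 3, padding=1),  # emits scale and shift
        )

    def forward(self, x, cond):
        x1, x2 = x.chunk(2, dim=1)
        scale, shift = self.net(torch.cat([x1, cond], 1)).chunk(2, dim=1)
        return torch.cat([x1, x2 * torch.exp(torch.tanh(scale)) + shift], 1)

    def inverse(self, y, cond):
        y1, y2 = y.chunk(2, dim=1)
        scale, shift = self.net(torch.cat([y1, cond], 1)).chunk(2, dim=1)
        return torch.cat([y1, (y2 - shift) * torch.exp(-torch.tanh(scale))], 1)

layer = ConditionalCoupling(ch=8, cond_ch=4)
x, cond = torch.randn(1, 8, 16, 16), torch.randn(1, 4, 16, 16)
assert torch.allclose(layer.inverse(layer(x, cond), cond), x, atol=1e-5)
```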
- Inter-Frame Compression for Dynamic Point Cloud Geometry Coding [14.79613731546357]
We propose a lossy compression scheme that predicts the latent representation of the current frame using the previous frame.
The proposed network utilizes convolutions with hierarchical multiscale 3D feature learning to encode the current frame.
The proposed method achieves more than 88% BD-Rate (Bjontegaard Delta Rate) reduction against G-PCCv20 Octree.
arXiv Detail & Related papers (2022-07-25T22:17:19Z)
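The 88% figure above is a BD-Rate (Bjontegaard Delta Rate) number. For reference, a common way to compute it is to fit log-bitrate as a cubic polynomial in quality (e.g. PSNR or, for point clouds, D1/D2 PSNR) for each codec and compare the integrals over the overlapping quality range. A minimal NumPy sketch:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate change (%) of the test codec vs. the anchor at
    equal quality; four rate-distortion points per codec are typical."""
    p_a = np.polyfit(psnr_anchor, np.log10(rate_anchor), 3)
    p_t = np.polyfit(psnr_test, np.log10(rate_test), 3)
    lo = max(min(psnr_anchor), min(psnr_test))   # overlapping quality range
    hi = min(max(psnr_anchor), max(psnr_test))
    ia, it = np.polyint(p_a), np.polyint(p_t)
    avg = (np.polyval(it, hi) - np.polyval(it, lo)
           - np.polyval(ia, hi) + np.polyval(ia, lo)) / (hi - lo)
    return (10.0 ** avg - 1.0) * 100.0           # negative => test saves bits

print(bd_rate([100, 200, 400, 800], [30, 33, 36, 39],
              [60, 120, 240, 480], [30, 33, 36, 39]))  # about -40
```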
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework with a heterogeneous deformable compensation strategy (HDCVC) to tackle unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
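HDCVC's HetDeform kernels are its own design, but the underlying operation, deformable convolution driven by predicted offsets, is available in torchvision. A minimal sketch with random offsets standing in for HDCVC's learned offset estimator:

```python
import torch
from torchvision.ops import deform_conv2d

# Reference-frame features and a 3x3 deformable kernel; in HDCVC the
# offsets would come from a network fed with the two adjacent frames.
feat = torch.randn(1, 32, 16, 16)
weight = torch.randn(32, 32, 3, 3)
offsets = 0.5 * torch.randn(1, 2 * 3 * 3, 16, 16)  # (dy, dx) per kernel tap

compensated = deform_conv2d(feat, offsets, weight, padding=1)
print(compensated.shape)  # torch.Size([1, 32, 16, 16])
```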
- Extending Neural P-frame Codecs for B-frame Coding [15.102346715690755]
Our B-frame solution is based on existing P-frame methods.
Our results show that using the proposed method with an existing P-frame codec can lead to a 28.5% saving in bit-rate on the UVG dataset.
arXiv Detail & Related papers (2021-03-30T21:25:35Z)
- Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z)
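A minimal sketch of the conditional-entropy idea in this last entry: a small network predicts per-element Gaussian parameters for the current frame's latent from the previous frame's latent, and the cross-entropy of the quantized latent under that model approximates the bitrate. The architecture below is a generic assumption, not the paper's exact model.

```python
import torch
import torch.nn as nn

class ConditionalEntropyModel(nn.Module):
    """Bits for the current latent, conditioned on the previous latent."""

    def __init__(self, ch):
        super().__init__()
        self.net = nn.Conv2d(ch, 2 * ch, 3, padding=1)  # predicts mean, log-scale

    def bits(self, y_cur, y_prev):
        mean, log_scale = self.net(y_prev).chunk(2, dim=1)
        gauss = torch.distributions.Normal(mean, log_scale.exp())
        # Probability mass of each quantized symbol on its unit bin:
        p = gauss.cdf(y_cur + 0.5) - gauss.cdf(y_cur - 0.5)
        return (-torch.log2(p.clamp_min(1e-9))).sum()

model = ConditionalEntropyModel(64)
y_prev = torch.randn(1, 64, 8, 8)
y_cur = torch.round(torch.randn(1, 64, 8, 8))   # quantized latent
print(float(model.bits(y_cur, y_prev)))          # estimated bits
```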