Extreme Video Compression with Pre-trained Diffusion Models
- URL: http://arxiv.org/abs/2402.08934v1
- Date: Wed, 14 Feb 2024 04:23:05 GMT
- Title: Extreme Video Compression with Pre-trained Diffusion Models
- Authors: Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, and Deniz Gündüz
- Abstract summary: We present a novel approach to extreme video compression leveraging the predictive power of diffusion-based generative models at the decoder.
The entire video is sequentially encoded to achieve a visually pleasing reconstruction, considering perceptual quality metrics.
Results showcase the potential of exploiting the temporal relations in video data using generative models.
- Score: 11.898317376595697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have achieved remarkable success in generating high-quality
image and video data. More recently, they have also been used for image
compression with high perceptual quality. In this paper, we present a novel
approach to extreme video compression leveraging the predictive power of
diffusion-based generative models at the decoder. The conditional diffusion
model takes several neural compressed frames and generates subsequent frames.
When the reconstruction quality drops below the desired level, new frames are
encoded to restart prediction. The entire video is sequentially encoded to
achieve a visually pleasing reconstruction, considering perceptual quality
metrics such as the learned perceptual image patch similarity (LPIPS) and the
Fréchet video distance (FVD), at bit rates as low as 0.02 bits per pixel (bpp).
Experimental results demonstrate the effectiveness of the proposed scheme
compared to standard codecs such as H.264 and H.265 in the low bpp regime. The
results showcase the potential of exploiting the temporal relations in video
data using generative models. Code is available at:
https://github.com/ElesionKyrie/Extreme-Video-Compression-With-Prediction-Using-Pre-trainded-Diffusion-Models-
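The predict-then-restart rule described in the abstract can be sketched as a simple loop. This is a minimal Python illustration, not the authors' implementation. The names `encode_sequence`, `predict_next`, `distortion`, `threshold`, `context_len`, and `bits_per_pixel` are hypothetical. Frames are toy lists of pixel values, transmitted frames are copied losslessly (the paper uses a neural codec), a linear extrapolator stands in for the conditional diffusion model, and mean absolute error stands in for a perceptual metric such as LPIPS. The sketch also assumes the encoder can reproduce the decoder's prediction exactly.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values


def encode_sequence(
    frames: Sequence[Frame],
    predict_next: Callable[[List[Frame]], Frame],
    distortion: Callable[[Frame, Frame], float],
    threshold: float,
    context_len: int = 2,
) -> Tuple[List[bool], List[Frame]]:
    """Return per-frame 'transmitted' flags and the decoder-side reconstruction."""
    sent: List[bool] = []
    recon: List[Frame] = []
    to_send = context_len  # the first context_len frames are always transmitted
    for frame in frames:
        if to_send > 0:
            # Transmit this frame (assumption: lossless copy instead of a neural codec).
            recon.append(list(frame))
            sent.append(True)
            to_send -= 1
            continue
        # Predict the frame from the most recent reconstructed frames.
        guess = predict_next(recon[-context_len:])
        if distortion(frame, guess) <= threshold:
            recon.append(guess)  # prediction is good enough: nothing is sent
            sent.append(False)
        else:
            # Quality dropped below the target: transmit fresh frames to restart.
            recon.append(list(frame))
            sent.append(True)
            to_send = context_len - 1
    return sent, recon


def bits_per_pixel(sent: Sequence[bool], bits_per_sent_frame: float,
                   pixels_per_frame: int) -> float:
    """Average rate over the whole sequence; predicted frames cost zero bits."""
    return sum(sent) * bits_per_sent_frame / (len(sent) * pixels_per_frame)


def linear_extrapolation(context: List[Frame]) -> Frame:
    """Hypothetical predictor: extrapolate each pixel from the last two frames."""
    prev, last = context[-2], context[-1]
    return [2.0 * b - a for a, b in zip(prev, last)]


def mean_abs_error(a: Frame, b: Frame) -> float:
    """Hypothetical quality measure standing in for LPIPS."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy one-pixel "video" with a scene jump at frame 4.
video = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0]]
sent, recon = encode_sequence(video, linear_extrapolation, mean_abs_error, threshold=0.5)
print(sent)
```

In this toy run, frames 0, 1, 4, and 5 are transmitted and the remaining frames are predicted, because the jump at frame 4 pushes the prediction error above the threshold. Since predicted frames cost no bits, the average rate falls as the fraction of predicted frames grows, which is how a scheme of this kind can reach very low bpp.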
Related papers
- Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder [49.01721042973929]
This paper presents a diffusion-based image compression method that employs a privileged end-to-end decoder model as correction.
Experiments demonstrate the superiority of our method in both distortion and perception compared with previous perceptual compression methods.
arXiv Detail & Related papers (2024-04-07T10:57:54Z)
- Predictive Coding For Animation-Based Video Compression [13.161311799049978]
We propose a predictive coding scheme which uses image animation as a predictor, and codes the residual with respect to the actual target frame.
Our experiments indicate a significant gain, in excess of 70% compared to the HEVC video standard and over 30% compared to VVC.
arXiv Detail & Related papers (2023-07-09T14:40:54Z)
- Video Coding Using Learned Latent GAN Compression [1.6058099298620423]
We leverage the generative capacity of GANs such as StyleGAN to represent and compress a video.
Each frame is inverted in the latent space of StyleGAN, from which the optimal compression is learned.
arXiv Detail & Related papers (2022-07-09T19:07:43Z)
- Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We show that this improves restoration accuracy compared to prior compression correction methods.
We condition our model on quantization data which is readily available in the bitstream.
arXiv Detail & Related papers (2022-01-31T18:56:04Z)
- Perceptual Learned Video Compression with Recurrent Conditional GAN [158.0726042755]
We propose a Perceptual Learned Video Compression (PLVC) approach with a recurrent conditional generative adversarial network.
PLVC learns to compress video towards good perceptual quality at low bit-rate.
The user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches.
arXiv Detail & Related papers (2021-09-07T13:36:57Z)
- Overfitting for Fun and Profit: Instance-Adaptive Data Compression [20.764189960709164]
Neural data compression has been shown to outperform classical methods in terms of rate-distortion ($RD$) performance.
In this paper we take this concept to the extreme, adapting the full model to a single video, and sending model updates along with the latent representation.
We demonstrate that full-model adaptation improves $RD$ performance by 1 dB, with respect to encoder-only finetuning.
arXiv Detail & Related papers (2021-01-21T15:58:58Z)
- Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z)
- Learning for Video Compression with Recurrent Auto-Encoder and Recurrent Probability Model [164.7489982837475]
This paper proposes a Recurrent Learned Video Compression (RLVC) approach with the Recurrent Auto-Encoder (RAE) and Recurrent Probability Model (RPM).
The RAE employs recurrent cells in both the encoder and decoder to exploit the temporal correlation among video frames.
Our approach achieves the state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM.
arXiv Detail & Related papers (2020-06-24T08:46:33Z)
- Variable Rate Video Compression using a Hybrid Recurrent Convolutional Learning Framework [1.9290392443571382]
This paper presents PredEncoder, a hybrid video compression framework based on the concept of predictive auto-encoding.
A variable-rate block encoding scheme has been proposed in the paper that leads to remarkably high quality to bit-rate ratios.
arXiv Detail & Related papers (2020-04-08T20:49:25Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.