Generative Video Compression: Towards 0.01% Compression Rate for Video Transmission
- URL: http://arxiv.org/abs/2512.24300v1
- Date: Tue, 30 Dec 2025 15:41:33 GMT
- Title: Generative Video Compression: Towards 0.01% Compression Rate for Video Transmission
- Authors: Xiangyu Chen, Jixiang Luo, Jingyu Xu, Fangqiu Yi, Chi Zhang, Xuelong Li
- Abstract summary: We introduce Generative Video Compression (GVC), a new framework that redefines the limits of video compression. GVC encodes video into compact representations and delegates content reconstruction to the receiver. Within the AI Flow framework, GVC opens new possibilities for video communication in bandwidth- and resource-constrained environments.
- Score: 43.034936334574155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Can a video be compressed at an extreme compression rate as low as 0.01%? Toward this goal, we achieve a compression rate of 0.02% in some cases by introducing Generative Video Compression (GVC), a new framework that redefines the limits of video compression by leveraging modern generative video models to achieve extreme compression rates while preserving a perception-centric, task-oriented communication paradigm, corresponding to Level C of the Shannon-Weaver model. How can we trade computation for compression rate or bandwidth? GVC answers this question by shifting the burden from transmission to inference: it encodes video into extremely compact representations and delegates content reconstruction to the receiver, where powerful generative priors synthesize high-quality video from minimal transmitted information. Is GVC practical and deployable? To ensure practical deployment, we propose a compression-computation trade-off strategy that enables fast inference on consumer-grade GPUs. Within the AI Flow framework, GVC opens new possibilities for video communication in bandwidth- and resource-constrained environments such as emergency rescue, remote surveillance, and mobile edge computing. Through empirical validation, we demonstrate that GVC offers a viable path toward a new effective, efficient, scalable, and practical video communication paradigm.
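To give a rough sense of scale for the 0.01% target in the abstract, the arithmetic below computes the transmitted-bitstream budget such a rate implies. This is an illustrative sketch only: the clip parameters (1080p, 30 fps, 10 seconds) and the raw 8-bit YUV 4:2:0 baseline are hypothetical assumptions, not figures from the paper.

```python
def compressed_size_bytes(width, height, fps, seconds, rate):
    """Bitstream budget implied by a given compression rate.

    Raw size assumes 8-bit YUV 4:2:0 (1.5 bytes per pixel) --
    an illustrative baseline, not the paper's measurement setup.
    """
    raw_bytes = width * height * 1.5 * fps * seconds
    return raw_bytes * rate

# Hypothetical 10-second 1080p clip at 30 fps:
raw = 1920 * 1080 * 1.5 * 30 * 10               # ~933 MB uncompressed
budget = compressed_size_bytes(1920, 1080, 30, 10, 0.0001)  # 0.01% rate
print(f"raw: {raw / 1e6:.0f} MB, transmitted: {budget / 1e3:.0f} KB")
```

At a 0.01% rate, the whole clip must fit in roughly 93 KB, which is why the framework can only transmit extremely compact representations and must delegate reconstruction to generative priors on the receiver side.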
Related papers
- Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
Currently, video transmission serves not only the Human Visual System (HVS) for viewing but also machine perception for analysis. We propose a Compression Distortion Embedding (CDRE) framework, which extracts machine-perception-related distortion representation and embeds it into downstream models. Our framework can effectively boost the rate-task performance of existing codecs with minimal overhead in execution time and parameter count.
arXiv Detail & Related papers (2025-03-27T13:01:53Z) - REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder [52.698595889988766]
We present a novel perspective on learning video embedders for generative modeling. Rather than requiring an exact reproduction of an input video, an effective embedder should focus on visually plausible reconstructions. We propose replacing the conventional encoder-decoder video embedder with an encoder-generator framework.
arXiv Detail & Related papers (2025-03-11T17:51:07Z) - Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence [19.137109044483545]
The Pleno-Generation (PGen) framework prioritizes high-fidelity reconstruction over pursuing a compact bitstream. We show that the proposed framework allows a greater space of flexibility for coding applications. In comparison with the latest Versatile Video Coding (VVC), the proposed scheme achieves competitive Bjontegaard-delta-rate savings.
arXiv Detail & Related papers (2025-02-24T12:03:30Z) - Accelerating Learned Video Compression via Low-Resolution Representation Learning [18.399027308582596]
We introduce an efficiency-optimized framework for learned video compression that focuses on low-resolution representation learning.
Our method achieves performance levels on par with the low-delay P configuration of the H.266 reference software VTM.
arXiv Detail & Related papers (2024-07-23T12:02:57Z) - Foveation-based Deep Video Compression without Motion Search [43.70396515286677]
Foveation protocols are desirable since only a small portion of a video viewed in VR may be visible as a user gazes in any given direction.
We implement foveation by introducing a Foveation Generator Unit (FGU) that generates foveation masks which direct the allocation of bits.
Our new compression model, which we call the Foveated MOtionless VIdeo Codec (Foveated MOVI-Codec), is able to efficiently compress videos without computing motion.
arXiv Detail & Related papers (2022-03-30T17:30:17Z) - Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We show that this improves restoration accuracy compared to prior compression correction methods.
We condition our model on quantization data which is readily available in the bitstream.
arXiv Detail & Related papers (2022-01-31T18:56:04Z) - Perceptual Learned Video Compression with Recurrent Conditional GAN [158.0726042755]
We propose a Perceptual Learned Video Compression (PLVC) approach with recurrent conditional generative adversarial network.
PLVC learns to compress video towards good perceptual quality at low bit-rate.
The user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches.
arXiv Detail & Related papers (2021-09-07T13:36:57Z) - Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site. This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.