UAR-NVC: A Unified AutoRegressive Framework for Memory-Efficient Neural Video Compression
- URL: http://arxiv.org/abs/2503.02733v1
- Date: Tue, 04 Mar 2025 15:54:57 GMT
- Title: UAR-NVC: A Unified AutoRegressive Framework for Memory-Efficient Neural Video Compression
- Authors: Jia Wang, Xinfeng Zhang, Gai Zhang, Jun Zhu, Lv Tang, Li Zhang
- Abstract summary: Implicit Neural Representations (INRs) have demonstrated significant potential in video compression by representing videos as neural networks. We present a novel understanding of INR models from an autoregressive (AR) perspective and introduce a Unified AutoRegressive Framework for memory-efficient Neural Video Compression (UAR-NVC). UAR-NVC integrates timeline-based and INR-based neural video compression under a unified autoregressive paradigm.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Implicit Neural Representations (INRs) have demonstrated significant potential in video compression by representing videos as neural networks. However, as the number of frames increases, the memory consumption for training and inference increases substantially, posing challenges in resource-constrained scenarios. Inspired by the success of traditional video compression frameworks, which process video frame by frame and can efficiently compress long videos, we adopt this modeling strategy for INRs to decrease memory consumption, while aiming to unify the frameworks from the perspective of timeline-based autoregressive modeling. In this work, we present a novel understanding of INR models from an autoregressive (AR) perspective and introduce a Unified AutoRegressive Framework for memory-efficient Neural Video Compression (UAR-NVC). UAR-NVC integrates timeline-based and INR-based neural video compression under a unified autoregressive paradigm. It partitions videos into several clips and processes each clip with a separate INR model instance, leveraging the advantages of both compression frameworks while allowing seamless adaptation to either form. To further reduce temporal redundancy between clips, we design two modules to optimize the initialization, training, and compression of these model parameters. UAR-NVC supports adjustable latencies by varying the clip length. Extensive experimental results demonstrate that UAR-NVC, with its flexible video clip setting, can adapt to resource-constrained environments and significantly improve performance compared to various baseline models.
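To make the clip-wise autoregressive modeling described above concrete, here is a minimal PyTorch sketch based only on the abstract. The `TinyINR` model, the warm-start initialization, and the training loop are illustrative assumptions, not the authors' implementation.

```python
import copy
import torch
import torch.nn as nn


class TinyINR(nn.Module):
    """Toy frame-index-conditioned INR: maps a scalar time t to one frame."""
    def __init__(self, c=3, h=32, w=32, hidden=256):
        super().__init__()
        self.out_shape = (c, h, w)
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, c * h * w),
        )

    def forward(self, t):
        return self.net(torch.tensor([[float(t)]])).view(self.out_shape)


def fit_clip(clip, init=None, steps=200):
    """Fit one INR instance to a clip; `init` warm-starts from the previous
    clip's weights, the AR-style link that reduces inter-clip redundancy."""
    model = TinyINR(*clip.shape[1:])
    if init is not None:
        model.load_state_dict(init.state_dict())
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        t = torch.randint(len(clip), (1,)).item()
        loss = nn.functional.mse_loss(model(t / len(clip)), clip[t])
        opt.zero_grad(); loss.backward(); opt.step()
    return model


def compress_video(frames, clip_len=8):
    """Process clips sequentially; only one model instance is resident at a
    time, which is where the memory saving comes from."""
    models, prev = [], None
    for i in range(0, len(frames), clip_len):
        prev = fit_clip(frames[i:i + clip_len], init=prev)
        models.append(copy.deepcopy(prev))
    return models  # in practice each model would be quantized/entropy-coded


video = torch.rand(16, 3, 32, 32)  # dummy 16-frame video
clip_models = compress_video(video)
```

Because only one clip-level model is resident in memory at a time, peak memory stays roughly constant as the video grows, and latency can be tuned through `clip_len`, matching the adjustable-latency property the abstract describes.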
Related papers
- Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on invertible transform to overcome the limitations of existing variable-rate methods.
Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations.
Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z) - CANeRV: Content Adaptive Neural Representation for Video Compression [89.35616046528624]
We propose Content Adaptive Neural Representation for Video Compression (CANeRV). CANeRV is an innovative INR-based video compression network that adaptively conducts structure optimisation based on the specific content of each video sequence. We show that CANeRV can outperform both H.266/VVC and state-of-the-art INR-based video compression techniques across diverse video datasets.
arXiv Detail & Related papers (2025-02-10T06:21:16Z) - VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression [59.14355576912495]
NeRF-based video has revolutionized visual media by delivering photorealistic Free-Viewpoint Video (FVV) experiences. However, the substantial data volumes pose significant challenges for storage and transmission. We propose VRVVC, a novel end-to-end joint variable-rate framework for volumetric video compression.
arXiv Detail & Related papers (2024-12-16T01:28:04Z) - Releasing the Parameter Latency of Neural Representation for High-Efficiency Video Compression [18.769136361963472]
The implicit neural representation (INR) technique models entire videos as basic units, automatically capturing intra-frame and inter-frame correlations.
In this paper, we show that our method significantly enhances the rate-distortion performance of INR video compression.
arXiv Detail & Related papers (2024-10-02T15:19:31Z) - NVRC: Neural Video Representation Compression [13.131842990481038]
We propose a novel INR-based video compression framework, Neural Video Representation Compression (NVRC).
NVRC, for the first time, is able to optimize an INR-based video codec in a fully end-to-end manner.
Our experiments show that NVRC outperforms many conventional and learning-based benchmark codecs.
arXiv Detail & Related papers (2024-09-11T16:57:12Z) - Boosting Neural Representations for Videos with a Conditional Decoder [28.073607937396552]
Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing.
This paper introduces a universal boosting framework for current implicit video representation approaches.
arXiv Detail & Related papers (2024-02-28T08:32:19Z) - Scene Matters: Model-based Deep Video Compression [13.329074811293292]
We propose a model-based video compression (MVC) framework that regards scenes as the fundamental units for video sequences.
Our proposed MVC directly models the intensity variation of the entire video sequence within a scene, seeking non-redundant representations instead of reducing redundancy.
Our method achieves up to a 20% bitrate reduction compared to the latest video coding standard H.266 and is more efficient in decoding than existing video coding strategies.
arXiv Detail & Related papers (2023-03-08T13:15:19Z) - Recurrent Video Restoration Transformer with Guided Deformable Attention [116.1684355529431]
We propose RVRT, which processes local neighboring frames in parallel within a globally recurrent framework.
RVRT achieves state-of-the-art performance on benchmark datasets with balanced model size, testing memory and runtime.
arXiv Detail & Related papers (2022-06-05T10:36:09Z) - Slimmable Video Codec [24.460763016660685]
We propose a slimmable video codec (SlimVC) by integrating a slimmable temporal entropy model in a slimmable autoencoder.
Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency.
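To make the "slimmable" mechanism concrete, below is a minimal PyTorch sketch of a convolution whose active width is chosen at run time, so one weight tensor serves several compute and latency budgets. The class and width schedule are illustrative assumptions, not the SlimVC architecture.

```python
import torch
import torch.nn as nn


class SlimmableConv2d(nn.Conv2d):
    """Conv layer whose active output width can be shrunk at inference."""
    def __init__(self, max_in, max_out, k):
        super().__init__(max_in, max_out, k, padding=k // 2)
        self.width = 1.0  # fraction of output channels currently active

    def forward(self, x):
        cout = int(self.out_channels * self.width)
        cin = x.shape[1]  # follow however many channels arrive
        return nn.functional.conv2d(
            x, self.weight[:cout, :cin], self.bias[:cout],
            padding=self.padding,
        )


net = nn.Sequential(SlimmableConv2d(3, 64, 3), nn.GELU(),
                    SlimmableConv2d(64, 64, 3))
x = torch.rand(1, 3, 32, 32)
for width in (0.25, 0.5, 1.0):  # one weight set, three budgets
    for m in net.modules():
        if isinstance(m, SlimmableConv2d):
            m.width = width
    print(width, net(x).shape)
```

Switching `width` trades rate/quality against memory, compute, and latency without retraining, which is the control mechanism the summary refers to.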
arXiv Detail & Related papers (2022-05-13T16:37:27Z) - Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
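As a rough sketch of what "modeling the conditional entropy between frames" can look like, the toy PyTorch module below predicts a per-element Gaussian over the current frame's latent code from the previous one and scores the resulting rate in bits. The sizes and the Gaussian-with-unit-bin convention are common learned-codec assumptions made here for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn


class CondEntropyModel(nn.Module):
    """Predicts p(y_t | y_prev) element-wise and measures the rate in bits."""
    def __init__(self, ch=192):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.GELU(),
            nn.Conv2d(ch, 2 * ch, 3, padding=1),  # per-element mean, scale
        )

    def rate(self, y_t, y_prev):
        mean, scale = self.net(y_prev).chunk(2, dim=1)
        scale = nn.functional.softplus(scale) + 1e-6
        dist = torch.distributions.Normal(mean, scale)
        # Probability mass of the integer quantization bin around y_t.
        p = dist.cdf(y_t + 0.5) - dist.cdf(y_t - 0.5)
        return -torch.log2(p.clamp_min(1e-9)).sum()


model = CondEntropyModel()
y_prev = torch.round(torch.randn(1, 192, 16, 16))  # toy quantized latents
y_t = torch.round(torch.randn(1, 192, 16, 16))
print(f"estimated rate for y_t: {model.rate(y_t, y_prev):.0f} bits")
```

The better the predictor, the lower the cross-entropy and hence the fewer bits the arithmetic coder needs for each frame's latent code.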
arXiv Detail & Related papers (2020-08-20T20:01:59Z) - Learning for Video Compression with Recurrent Auto-Encoder and Recurrent Probability Model [164.7489982837475]
This paper proposes a Recurrent Learned Video Compression (RLVC) approach with the Recurrent Auto-Encoder (RAE) and Recurrent Probability Model (RPM).
The RAE employs recurrent cells in both the encoder and decoder to exploit the temporal correlation among video frames.
Our approach achieves the state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM.
arXiv Detail & Related papers (2020-06-24T08:46:33Z)
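To illustrate the recurrent auto-encoder idea in the RLVC entry above, here is a toy PyTorch sketch in which a GRU cell carries hidden state across frames between a convolutional encoder and decoder. The per-position GRU shortcut and all layer sizes are assumptions for illustration; the actual RAE/RPM design differs.

```python
import torch
import torch.nn as nn


class RecurrentAE(nn.Module):
    """Toy recurrent autoencoder: hidden state links consecutive frames."""
    def __init__(self, ch=64, latent=128):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=4), nn.GELU(),
            nn.Conv2d(ch, latent, 4, stride=4),
        )
        # Recurrent cell applied per spatial position of the latent map,
        # exploiting temporal correlation among video frames.
        self.cell = nn.GRUCell(latent, latent)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(latent, ch, 4, stride=4), nn.GELU(),
            nn.ConvTranspose2d(ch, 3, 4, stride=4),
        )

    def forward(self, frames):
        h, recons = None, []
        for x in frames:                      # frames: (T, 3, H, W)
            y = self.enc(x.unsqueeze(0))      # (1, latent, h, w)
            b, c, hh, ww = y.shape
            flat = y.permute(0, 2, 3, 1).reshape(-1, c)
            h = self.cell(flat, h)            # hidden state spans frames
            y = h.reshape(b, hh, ww, c).permute(0, 3, 1, 2)
            recons.append(self.dec(y).squeeze(0))
        return torch.stack(recons)


model = RecurrentAE()
video = torch.rand(5, 3, 64, 64)
print(model(video).shape)                     # torch.Size([5, 3, 64, 64])
```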
This list is automatically generated from the titles and abstracts of the papers on this site.