Neural Video Compression with Diverse Contexts
- URL: http://arxiv.org/abs/2302.14402v1
- Date: Tue, 28 Feb 2023 08:35:50 GMT
- Title: Neural Video Compression with Diverse Contexts
- Authors: Jiahao Li, Bin Li, Yan Lu
- Abstract summary: This paper proposes increasing the context diversity in both temporal and spatial dimensions.
Experiments show that our codes obtains 23.5% saving over previous SOTA NVC.
- Score: 25.96187914295921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For any video codecs, the coding efficiency highly relies on whether the
current signal to be encoded can find the relevant contexts from the previous
reconstructed signals. Traditional codec has verified more contexts bring
substantial coding gain, but in a time-consuming manner. However, for the
emerging neural video codec (NVC), its contexts are still limited, leading to
low compression ratio. To boost NVC, this paper proposes increasing the context
diversity in both temporal and spatial dimensions. First, we guide the model to
learn hierarchical quality patterns across frames, which enriches long-term and
yet high-quality temporal contexts. Furthermore, to tap the potential of
optical flow-based coding framework, we introduce a group-based offset
diversity where the cross-group interaction is proposed for better context
mining. In addition, this paper also adopts a quadtree-based partition to
increase spatial context diversity when encoding the latent representation in
parallel. Experiments show that our codec obtains 23.5% bitrate saving over
previous SOTA NVC. Better yet, our codec has surpassed the under-developing
next generation traditional codec/ECM in both RGB and YUV420 colorspaces, in
terms of PSNR. The codes are at https://github.com/microsoft/DCVC.
Related papers
- PNVC: Towards Practical INR-based Video Compression [14.088444622391501]
We propose a novel INR-based coding framework, PNVC, which innovatively combines autoencoder-based and overfitted solutions.
PNVC achieves nearly 35%+ BD-rate savings against HEVC HM 18.0 (LD) - almost 10% more compared to one of the state-of-the-art INR-based codecs.
arXiv Detail & Related papers (2024-09-02T05:31:11Z) - When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [118.72266141321647]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z) - Boosting Neural Representations for Videos with a Conditional Decoder [28.073607937396552]
Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing.
This paper introduces a universal boosting framework for current implicit video representation approaches.
arXiv Detail & Related papers (2024-02-28T08:32:19Z) - Progressive Fourier Neural Representation for Sequential Video
Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z) - HiNeRV: Video Compression with Hierarchical Encoding-based Neural
Representation [14.088444622391501]
Implicit Representations (INRs) have previously been used to represent and compress image and video content.
Existing INR-based methods have failed to deliver rate quality performance comparable with the state of the art in video compression.
We propose HiNeRV, an INR that combines light weight layers with hierarchical positional encodings.
arXiv Detail & Related papers (2023-06-16T12:59:52Z) - Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior arts, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds their encoding quality as 34.07rightarrow$34.57 (measured with the PSNR metric)
arXiv Detail & Related papers (2022-10-13T08:15:08Z) - CANF-VC: Conditional Augmented Normalizing Flows for Video Compression [81.41594331948843]
CANF-VC is an end-to-end learning-based video compression system.
It is based on conditional augmented normalizing flows (ANF)
arXiv Detail & Related papers (2022-07-12T04:53:24Z) - Efficient VVC Intra Prediction Based on Deep Feature Fusion and
Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on standard database demonstrate the superiority of proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z) - A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs)
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z) - Adaptation and Attention for Neural Video Coding [23.116987835862314]
We propose an end-to-end learned video that introduces several architectural novelties as well as training novelties.
As one architectural novelty, we propose to train the inter-frame model to adapt the motion estimation process based on the resolution of the input video.
A second architectural novelty is a new neural block that combines concepts from split-attention based neural networks and from DenseNets.
arXiv Detail & Related papers (2021-12-16T10:25:49Z) - Variable Rate Video Compression using a Hybrid Recurrent Convolutional
Learning Framework [1.9290392443571382]
This paper presents PredEncoder, a hybrid video compression framework based on the concept of predictive auto-encoding.
A variable-rate block encoding scheme has been proposed in the paper that leads to remarkably high quality to bit-rate ratios.
arXiv Detail & Related papers (2020-04-08T20:49:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.