CANF-VC: Conditional Augmented Normalizing Flows for Video Compression
- URL: http://arxiv.org/abs/2207.05315v1
- Date: Tue, 12 Jul 2022 04:53:24 GMT
- Title: CANF-VC: Conditional Augmented Normalizing Flows for Video Compression
- Authors: Yung-Han Ho, Chih-Peng Chang, Peng-Yu Chen, Alessandro Gnutti,
Wen-Hsiao Peng
- Abstract summary: CANF-VC is an end-to-end learning-based video compression system.
It is based on conditional augmented normalizing flows (ANF).
- Score: 81.41594331948843
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents an end-to-end learning-based video compression system,
termed CANF-VC, based on conditional augmented normalizing flows (ANF). Most
learned video compression systems adopt the same hybrid-based coding
architecture as traditional codecs. Recent research on conditional coding
has shown the sub-optimality of hybrid-based coding and opened up
opportunities for deep generative models to take a key role in creating new
coding frameworks. CANF-VC represents a new attempt that leverages the
conditional ANF to learn a video generative model for conditional inter-frame
coding. We choose ANF because it is a special type of generative model, which
includes the variational autoencoder as a special case and is able to achieve
better expressiveness. CANF-VC also extends the idea of conditional coding to
motion coding, forming a purely conditional coding framework. Extensive
experimental results on commonly used datasets confirm the superiority of
CANF-VC over state-of-the-art methods.
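The abstract's core idea, coding a frame conditioned on information derived from previously decoded frames via an invertible flow, can be illustrated with a toy sketch. This is not the paper's model: the coupling function `net` below is a hypothetical stand-in for a learned conditioning network, and the signals are random vectors rather than frames. It shows only the structural property that a conditional coupling step is exactly invertible given the conditioning signal, which is what lets the decoder recover the latent-to-frame mapping.

```python
import numpy as np

rng = np.random.default_rng(0)

def net(x_c):
    # Hypothetical stand-in for a learned network that maps the
    # conditioning signal (e.g. a motion-compensated prediction)
    # to an elementwise scale and shift.
    scale = 1.0 + 0.1 * np.tanh(x_c)   # kept away from zero so it is invertible
    shift = 0.5 * x_c
    return scale, shift

def couple_forward(x, x_c):
    # Conditional affine coupling: transform the current frame x
    # into a latent z, conditioned on x_c.
    s, t = net(x_c)
    return x * s + t

def couple_inverse(z, x_c):
    # Exact inverse of the coupling, available to any decoder
    # that holds the same conditioning signal x_c.
    s, t = net(x_c)
    return (z - t) / s

x = rng.normal(size=8)      # toy "current frame" signal
x_c = rng.normal(size=8)    # toy conditioning signal

z = couple_forward(x, x_c)
x_hat = couple_inverse(z, x_c)
print(np.allclose(x, x_hat))  # → True: the flow itself is lossless
```

In an actual codec the latent `z` would additionally be quantized and entropy-coded, which is where the rate-distortion trade-off enters; the coupling structure above only guarantees invertibility of the transform.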
Related papers
- Beyond GFVC: A Progressive Face Video Compression Framework with Adaptive Visual Tokens [28.03183316628635]
This paper proposes a novel Progressive Face Video Compression framework, namely PFVC, that utilizes adaptive visual tokens to realize exceptional trade-offs between reconstruction quality and bandwidth.
Experimental results demonstrate that the proposed PFVC framework can achieve better coding flexibility and superior rate-distortion performance in comparison with the latest Versatile Video Coding (VVC) and the state-of-the-art Generative Face Video Compression (GFVC) algorithms.
arXiv Detail & Related papers (2024-10-11T03:24:21Z)
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [112.44822009714461]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that the TT2V (text-to-video) mode achieves effective semantic reconstruction, while the IT2V (image-text-to-video) mode exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z)
- Boosting Neural Representations for Videos with a Conditional Decoder [28.073607937396552]
Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing.
This paper introduces a universal boosting framework for current implicit video representation approaches.
arXiv Detail & Related papers (2024-02-28T08:32:19Z)
- Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z)
- A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs).
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z)
- CAESR: Conditional Autoencoder and Super-Resolution for Learned Spatial Scalability [13.00115213941287]
We present CAESR, a learning-based coding approach for spatial scalability based on the versatile video coding (VVC) standard.
Our framework considers a low-resolution signal encoded with VVC intra-mode as a base-layer (BL), and a deep conditional autoencoder with hyperprior (AE-HP) as an enhancement-layer (EL) model.
Our solution is competitive with the VVC full-resolution intra coding while being scalable.
arXiv Detail & Related papers (2022-02-01T13:59:43Z)
- Perceptual Learned Video Compression with Recurrent Conditional GAN [158.0726042755]
We propose a Perceptual Learned Video Compression (PLVC) approach with recurrent conditional generative adversarial network.
PLVC learns to compress video towards good perceptual quality at low bit-rate.
The user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches.
arXiv Detail & Related papers (2021-09-07T13:36:57Z)
- Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
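The premise behind this line of work is an information-theoretic fact: the conditional entropy H(X|Y) of a frame's symbols given a previously decoded frame is never larger than the unconditional entropy H(X), so a conditional entropy model needs no more bits than an independent one. A minimal numeric illustration, using a hypothetical joint distribution over binary symbols (not data from the paper):

```python
import numpy as np

# Hypothetical joint distribution p(x, y) for a current-frame symbol X
# and a previous-frame symbol Y; the strong diagonal reflects temporal
# correlation between consecutive frames.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

p_x = joint.sum(axis=1)                      # marginal p(x)
p_y = joint.sum(axis=0)                      # marginal p(y)

h_x = -np.sum(p_x * np.log2(p_x))            # H(X): bits without conditioning
h_y = -np.sum(p_y * np.log2(p_y))            # H(Y)
h_xy = -np.sum(joint * np.log2(joint))       # joint entropy H(X, Y)
h_x_given_y = h_xy - h_y                     # chain rule: H(X|Y) = H(X,Y) - H(Y)

# Conditioning on the previous frame strictly reduces the bit cost here.
print(h_x, h_x_given_y)
assert h_x_given_y <= h_x
```

With this joint distribution, H(X) is 1 bit while H(X|Y) is roughly 0.72 bits, so an ideal conditional coder spends about 28% fewer bits; the stronger the inter-frame correlation, the larger the gap.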
arXiv Detail & Related papers (2020-08-20T20:01:59Z)
- Variable Rate Video Compression using a Hybrid Recurrent Convolutional Learning Framework [1.9290392443571382]
This paper presents PredEncoder, a hybrid video compression framework based on the concept of predictive auto-encoding.
A variable-rate block encoding scheme has been proposed in the paper that leads to remarkably high quality-to-bitrate ratios.
arXiv Detail & Related papers (2020-04-08T20:49:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.