Implicit Neural Video Compression
- URL: http://arxiv.org/abs/2112.11312v1
- Date: Tue, 21 Dec 2021 15:59:00 GMT
- Title: Implicit Neural Video Compression
- Authors: Yunfan Zhang, Ties van Rozendaal, Johann Brehmer, Markus Nagel, Taco Cohen
- Abstract summary: We propose a method to compress full-resolution video sequences with implicit neural representations.
Each frame is represented as a neural network that maps coordinate positions to pixel values.
We use a separate implicit network to modulate the coordinate inputs, which enables efficient motion compensation between frames.
- Score: 17.873088127087605
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a method to compress full-resolution video sequences with implicit
neural representations. Each frame is represented as a neural network that maps
coordinate positions to pixel values. We use a separate implicit network to
modulate the coordinate inputs, which enables efficient motion compensation
between frames. Together with a small residual network, this allows us to
efficiently compress P-frames relative to the previous frame. We further lower
the bitrate by storing the network weights with learned integer quantization.
Our method, which we call implicit pixel flow (IPF), offers several
simplifications over established neural video codecs: it does not require the
receiver to have access to a pretrained neural network, does not use expensive
interpolation-based warping operations, and does not require a separate
training dataset. We demonstrate the feasibility of neural implicit compression
on image and video data.
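The P-frame construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the networks below are tiny, randomly initialised, and untrained, the sine activation is an assumption, the quantization step is fixed rather than learned, and all function names (`make_mlp`, `p_frame_pixel`, `quantize`, and so on) are hypothetical. The sketch only shows how a flow network can shift the coordinates fed to the previous frame's network, so that no interpolation-based warping is needed, and how weights could be stored as integers.

```python
import math
import random

def make_mlp(sizes, seed):
    """Build a tiny MLP as a list of (weights, biases) layers with random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.uniform(-1, 1) / math.sqrt(n_in) for _ in range(n_in)]
             for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((w, b))
    return layers

def mlp_forward(layers, x):
    """Evaluate the MLP at input vector x; hidden layers use sine (an assumption)."""
    for i, (w, b) in enumerate(layers):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
        if i < len(layers) - 1:
            x = [math.sin(v) for v in x]
    return x

def p_frame_pixel(prev_frame, flow_net, residual_net, x, y):
    """Pixel of the current frame: previous frame queried at shifted coordinates,
    plus a residual correction. The shift comes from a separate implicit network."""
    dx, dy = mlp_forward(flow_net, [x, y])
    warped = mlp_forward(prev_frame, [x + dx, y + dy])
    residual = mlp_forward(residual_net, [x, y])
    return [w + r for w, r in zip(warped, residual)]

def quantize(layers, scale):
    """Round every weight and bias to an integer multiple of a fixed step `scale`."""
    return [([[round(v / scale) for v in row] for row in w],
             [round(v / scale) for v in b]) for w, b in layers]

def dequantize(q_layers, scale):
    """Map stored integers back to floating-point weights."""
    return [([[v * scale for v in row] for row in w],
             [v * scale for v in b]) for w, b in q_layers]

# Usage: coordinates in, RGB-like triple out (values are meaningless here
# because the networks are untrained).
prev_frame = make_mlp([2, 16, 3], seed=0)    # (x, y) -> 3 channels
flow_net = make_mlp([2, 8, 2], seed=1)       # (x, y) -> (dx, dy)
residual_net = make_mlp([2, 8, 3], seed=2)   # (x, y) -> 3-channel correction
stored = quantize(flow_net, scale=0.01)      # integers that would be transmitted
pixel = p_frame_pixel(prev_frame, dequantize(stored, 0.01), residual_net, 0.25, -0.5)
print(pixel)
```

Because the previous frame is a continuous function of its coordinates, motion compensation in this sketch reduces to adding `(dx, dy)` to the query position; in a real codec the networks would be fitted to the video and the integer weights entropy-coded.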
Related papers
- NERV++: An Enhanced Implicit Neural Video Representation [11.25130799452367]
We introduce NeRV++, an enhanced implicit neural video representation.
NeRV++ is a more straightforward yet effective enhancement over the original NeRV decoder architecture.
We evaluate our method on the UVG, MCL-JVC, and Bunny datasets, achieving competitive results for video compression with INRs.
arXiv Detail & Related papers (2024-02-28T13:00:32Z) - Progressive Fourier Neural Representation for Sequential Video Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z) - Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long and/or a large number of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z) - Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains 2 times faster (in less than 5 minutes) but also improves encoding quality from 34.07 to 34.57 (measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z) - Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z) - COIN++: Data Agnostic Neural Compression [55.27113889737545]
COIN++ is a neural compression framework that seamlessly handles a wide range of data modalities.
We demonstrate the effectiveness of our method by compressing various data modalities.
arXiv Detail & Related papers (2022-01-30T20:12:04Z) - Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [67.33850633281803]
We present a versatile new input encoding that permits the use of a smaller network without sacrificing quality.
A small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through gradient descent.
We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds.
arXiv Detail & Related papers (2022-01-16T07:22:47Z) - Improved CNN-based Learning of Interpolation Filters for Low-Complexity Inter Prediction in Video Coding [5.46121027847413]
This paper introduces a novel explainable neural network-based inter-prediction scheme.
A novel training framework enables each network branch to resemble a specific fractional shift.
When implemented in the context of the Versatile Video Coding (VVC) test model, 0.77%, 1.27% and 2.25% BD-rate savings can be achieved.
arXiv Detail & Related papers (2021-06-16T16:48:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.