NeRV: Neural Representations for Videos
- URL: http://arxiv.org/abs/2110.13903v1
- Date: Tue, 26 Oct 2021 17:56:23 GMT
- Title: NeRV: Neural Representations for Videos
- Authors: Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav
Shrivastava
- Abstract summary: We propose a novel neural representation for videos (NeRV) which encodes videos in neural networks.
Encoding a video with NeRV simply means fitting a neural network to the video frames, and decoding is a simple feedforward operation.
With such a representation, we can treat videos as neural networks, simplifying several video-related tasks.
- Score: 36.00198388959609
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose a novel neural representation for videos (NeRV) which encodes
videos in neural networks. Unlike conventional representations that treat
videos as frame sequences, we represent videos as neural networks taking frame
index as input. Given a frame index, NeRV outputs the corresponding RGB image.
Video encoding in NeRV is simply fitting a neural network to video frames, and
the decoding process is a simple feedforward operation. As an image-wise implicit
representation, NeRV outputs the whole image and shows great efficiency compared
to pixel-wise implicit representations, improving the encoding speed by 25x to
70x and the decoding speed by 38x to 132x, while achieving better video quality.
With such a representation, we can treat videos as neural networks, simplifying
several video-related tasks. For example, conventional video compression
methods are restricted by a long and complex pipeline, specifically designed
for the task. In contrast, with NeRV, we can use any neural network compression
method as a proxy for video compression, and achieve comparable performance to
traditional frame-based video compression approaches (H.264, HEVC, etc.).
Besides compression, we demonstrate the generalization of NeRV for video
denoising. The source code and pre-trained model can be found at
https://github.com/haochen-rye/NeRV.git.
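The idea in the abstract (frame index in, whole image out; encoding as fitting; decoding as one forward pass) can be illustrated with a toy sketch. This is not the authors' architecture: a sin/cos embedding of the frame index followed by a single linear layer stands in for the real network, and the names `positional_encoding`, `encode`, and `decode`, as well as the values of `num_freqs`, `base`, `steps`, and `lr`, are hypothetical choices made only for this example.

```python
import math
import random


def positional_encoding(t, num_freqs=4, base=1.25):
    """Embed a normalized frame index t as a bias term plus sin/cos features.

    num_freqs and base are illustrative assumptions, not values from the source.
    """
    feats = [1.0]  # constant bias feature
    for i in range(num_freqs):
        angle = (base ** i) * math.pi * t
        feats.extend((math.sin(angle), math.cos(angle)))
    return feats


def forward(weights, feats):
    """Single linear layer: one output value per pixel channel of the frame."""
    return [sum(w * x for w, x in zip(row, feats)) for row in weights]


def decode(weights, t):
    """Decoding: one feedforward pass that emits the whole frame at index t."""
    return forward(weights, positional_encoding(t))


def encode(frames, steps=2000, lr=0.1):
    """Encoding: fit the weights to the frames with full-batch gradient descent."""
    num_frames, num_values = len(frames), len(frames[0])
    feats = [positional_encoding(i / num_frames) for i in range(num_frames)]
    weights = [[0.0] * len(feats[0]) for _ in range(num_values)]
    for _ in range(steps):
        # Residuals of every frame under the current weights.
        errs = [[p - y for p, y in zip(forward(weights, x), frame)]
                for x, frame in zip(feats, frames)]
        # Gradient step on the summed squared error.
        for v, row in enumerate(weights):
            for j in range(len(row)):
                row[j] -= lr * sum(errs[f][v] * feats[f][j]
                                   for f in range(num_frames))
    return weights


# A toy "video": 4 frames of 2x2 RGB pixels, flattened to 12 values each.
rng = random.Random(0)
frames = [[rng.random() for _ in range(2 * 2 * 3)] for _ in range(4)]

weights = encode(frames)  # the video is now stored as network weights
max_err = max(abs(p - y)
              for i, frame in enumerate(frames)
              for p, y in zip(decode(weights, i / len(frames)), frame))
print(f"max reconstruction error: {max_err:.2e}")
```

In this toy setting the whole video lives in `weights`, so shrinking that list (for example by pruning or quantizing it) is what would stand in for video compression.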
Related papers
- Fast Encoding and Decoding for Implicit Video Representation [88.43612845776265]
We introduce NeRV-Enc, a transformer-based hyper-network for fast encoding; and NeRV-Dec, a parallel decoder for efficient video loading.
NeRV-Enc achieves an impressive speed-up of $\mathbf{10^4\times}$ by eliminating gradient-based optimization.
NeRV-Dec simplifies video decoding, outperforming conventional codecs with a loading speed $\mathbf{11\times}$ faster.
arXiv Detail & Related papers (2024-09-28T18:21:52Z)
- NERV++: An Enhanced Implicit Neural Video Representation [11.25130799452367]
We introduce NeRV++, an enhanced implicit neural video representation.
NeRV++ is a more straightforward yet effective enhancement over the original NeRV decoder architecture.
We evaluate our method on the UVG, MCL-JCV, and Bunny datasets, achieving competitive results for video compression with INRs.
arXiv Detail & Related papers (2024-02-28T13:00:32Z)
- HNeRV: A Hybrid Neural Representation for Videos [56.492309149698606]
Implicit neural representations store videos as neural networks.
We propose a Hybrid Neural Representation for Videos (HNeRV).
With content-adaptive embeddings and re-designed architecture, HNeRV outperforms implicit methods in video regression tasks.
arXiv Detail & Related papers (2023-04-05T17:55:04Z)
- Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long and/or a large number of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- CNeRV: Content-adaptive Neural Representation for Visual Data [54.99373641890767]
We propose Neural Visual Representation with Content-adaptive Embedding (CNeRV), which combines the generalizability of autoencoders with the simplicity and compactness of implicit representation.
We match the performance of NeRV, a state-of-the-art implicit neural representation, on the reconstruction task for frames seen during training, while far surpassing it on frames that are skipped during training (unseen images).
With the same latent code length and similar model size, CNeRV outperforms autoencoders on reconstruction of both seen and unseen images.
arXiv Detail & Related papers (2022-11-18T18:35:43Z)
- Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds their encoding quality, improving it from 34.07 to 34.57 (measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z)
- PS-NeRV: Patch-wise Stylized Neural Representations for Videos [13.14511356472246]
PS-NeRV represents videos as a function of patches and the corresponding patch coordinate.
It naturally inherits the advantages of image-wise methods, and achieves excellent reconstruction performance with fast decoding speed.
arXiv Detail & Related papers (2022-08-07T14:45:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.