CNeRV: Content-adaptive Neural Representation for Visual Data
- URL: http://arxiv.org/abs/2211.10421v1
- Date: Fri, 18 Nov 2022 18:35:43 GMT
- Title: CNeRV: Content-adaptive Neural Representation for Visual Data
- Authors: Hao Chen, Matt Gwilliam, Bo He, Ser-Nam Lim, Abhinav Shrivastava
- Abstract summary: We propose Neural Visual Representation with Content-adaptive Embedding (CNeRV), which combines the generalizability of autoencoders with the simplicity and compactness of implicit representation.
We match the performance of NeRV, a state-of-the-art implicit neural representation, on the reconstruction task for frames seen during training, while far surpassing it on frames that are skipped during training (unseen images).
With the same latent code length and similar model size, CNeRV outperforms autoencoders on reconstruction of both seen and unseen images.
- Score: 54.99373641890767
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Compression and reconstruction of visual data have been widely studied in the
computer vision community, even before the popularization of deep learning.
More recently, some have used deep learning to improve or refine existing
pipelines, while others have proposed end-to-end approaches, including
autoencoders and implicit neural representations, such as SIREN and NeRV. In
this work, we propose Neural Visual Representation with Content-adaptive
Embedding (CNeRV), which combines the generalizability of autoencoders with the
simplicity and compactness of implicit representation. We introduce a novel
content-adaptive embedding that is unified, concise, and internally
(within-video) generalizable, that complements a powerful decoder with a
single-layer encoder. We match the performance of NeRV, a state-of-the-art
implicit neural representation, on the reconstruction task for frames seen
during training while far surpassing it for frames that are skipped during
training (unseen images). To achieve similar reconstruction quality on unseen
images, NeRV needs 120x more time to overfit per-frame due to its lack of
internal generalization. With the same latent code length and similar model
size, CNeRV outperforms autoencoders on reconstruction of both seen and unseen
images. We also show promising results for visual data compression. More
details can be found in the project page: https://haochen-rye.github.io/CNeRV/
Related papers
- VQ-NeRV: A Vector Quantized Neural Representation for Videos [3.6662666629446043]
Implicit neural representations (INR) excel in encoding videos within neural networks, showcasing promise in computer vision tasks like video compression and denoising.
We introduce an advanced U-shaped architecture, Vector Quantized-NeRV (VQ-NeRV), which integrates a novel component--the VQ-NeRV Block.
This block incorporates a codebook mechanism to discretize the network's shallow residual features and inter-frame residual information effectively.
arXiv Detail & Related papers (2024-03-19T03:19:07Z)
- NERV++: An Enhanced Implicit Neural Video Representation [11.25130799452367]
We introduce neural representations for videos NeRV++, an enhanced implicit neural video representation.
NeRV++ is a more straightforward yet effective enhancement over the original NeRV decoder architecture.
We evaluate our method on UVG, MCL JVC, and Bunny datasets, achieving competitive results for video compression with INRs.
arXiv Detail & Related papers (2024-02-28T13:00:32Z)
- Progressive Fourier Neural Representation for Sequential Video Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z)
- HNeRV: A Hybrid Neural Representation for Videos [56.492309149698606]
Implicit neural representations store videos as neural networks.
We propose a Hybrid Neural Representation for Videos (HNeRV).
With content-adaptive embeddings and re-designed architecture, HNeRV outperforms implicit methods in video regression tasks.
arXiv Detail & Related papers (2023-04-05T17:55:04Z)
- Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long and/or a large number of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds their encoding quality (34.07 → 34.57, measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z)
- NeRV: Neural Representations for Videos [36.00198388959609]
We propose a novel neural representation for videos (NeRV) which encodes videos in neural networks.
Encoding a video with NeRV simply means fitting a neural network to its frames, and decoding is a simple feedforward operation.
With such a representation, we can treat videos as neural networks, simplifying several video-related tasks.
arXiv Detail & Related papers (2021-10-26T17:56:23Z)
- Neural Rays for Occlusion-aware Image-based Rendering [108.34004858785896]
We present a new neural representation, called Neural Ray (NeuRay), for the novel view synthesis (NVS) task with multi-view images as input.
NeuRay can quickly generate high-quality novel view rendering images of unseen scenes with little finetuning.
arXiv Detail & Related papers (2021-07-28T15:09:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.