Progressive Fourier Neural Representation for Sequential Video
Compilation
- URL: http://arxiv.org/abs/2306.11305v3
- Date: Wed, 7 Feb 2024 01:52:30 GMT
- Title: Progressive Fourier Neural Representation for Sequential Video
Compilation
- Authors: Haeyong Kang, Jaehong Yoon, DaHyun Kim, Sung Ju Hwang, and Chang D. Yoo
- Abstract summary: Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Implicit Representation (NIR) has recently gained significant
attention due to its remarkable ability to encode complex and high-dimensional
data into representation space and easily reconstruct it through a trainable
mapping function. However, NIR methods assume a one-to-one mapping between the
target data and representation models regardless of data relevancy or
similarity. This results in poor generalization over multiple complex data and
limits their efficiency and scalability. Motivated by continual learning, this
work investigates how to accumulate and transfer neural implicit
representations for multiple complex video data over sequential encoding
sessions. To overcome the limitation of NIR, we propose a novel method,
Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive
and compact sub-module in Fourier space to encode videos in each training
session. This sparsified neural encoding allows the neural network to hold free
weights, enabling an improved adaptation for future videos. In addition, when
learning a representation for a new video, PFNR transfers the representation of
previous videos with frozen weights. This design allows the model to
continuously accumulate high-quality neural representations for multiple videos
while ensuring lossless decoding that perfectly preserves the learned
representations for previous videos. We validate our PFNR method on the UVG8/17
and DAVIS50 video sequence benchmarks and achieve impressive performance gains
over strong continual learning baselines. The PFNR code is available at
https://github.com/ihaeyong/PFNR.git.
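The abstract's core mechanism, claiming a sparse sub-module of a shared network per video and freezing it so earlier videos decode losslessly, can be illustrated with a toy sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a plain linear layer in place of PFNR's Fourier-space sub-module, claims random masks rather than learned ones, and the names (`ProgressiveMaskedLayer`, `claim_submodule`) are invented for this sketch.

```python
import numpy as np

class ProgressiveMaskedLayer:
    """Toy sketch of per-session weight masking (illustrative, not the PFNR code).

    Idea from the abstract: each encoding session claims a sparse sub-module
    of the shared weights; weights claimed by earlier sessions are frozen,
    and the remaining "free" weights adapt to new videos.
    """

    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((in_dim, out_dim)) * 0.1
        self.frozen = np.zeros((in_dim, out_dim), dtype=bool)  # weights claimed so far
        self.session_masks = []  # one binary mask per encoded video

    def claim_submodule(self, sparsity=0.5, seed=None):
        """Claim a random sparse subset of the still-free weights for a new session."""
        rng = np.random.default_rng(seed)
        free = ~self.frozen
        pick = free & (rng.random(self.W.shape) < sparsity)
        self.session_masks.append(pick)
        return pick

    def train_step(self, grad, lr=0.01):
        """Update only the weights of the current (last-claimed) session."""
        mask = self.session_masks[-1]
        self.W -= lr * grad * mask  # frozen weights receive a zero update

    def freeze_current(self):
        """Lock in the current session so later videos cannot overwrite it."""
        self.frozen |= self.session_masks[-1]

    def decode(self, x, session):
        """Decode video `session` using the masks accumulated up to that session."""
        mask = np.zeros(self.W.shape, dtype=bool)
        for m in self.session_masks[:session + 1]:
            mask |= m
        return x @ (self.W * mask)
```

Because each new session only claims weights outside the frozen set, its updates are disjoint from every earlier sub-module, which is what makes decoding of previous videos bit-exact ("lossless" in the abstract's sense) no matter how many later videos are encoded.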
Related papers
- NeRV++: An Enhanced Implicit Neural Video Representation
We introduce NeRV++, an enhanced implicit neural video representation.
NeRV++ is a more straightforward yet effective enhancement over the original NeRV decoder architecture.
We evaluate our method on the UVG, MCL-JCV, and Bunny datasets, achieving competitive results for video compression with INRs.
arXiv Detail & Related papers (2024-02-28T13:00:32Z)
- Boosting Neural Representations for Videos with a Conditional Decoder
Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing.
This paper introduces a universal boosting framework for current implicit video representation approaches.
arXiv Detail & Related papers (2024-02-28T08:32:19Z)
- Towards Scalable Neural Representation for Diverse Videos
Implicit neural representations (INRs) have gained increasing attention for representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long and/or a large number of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- Scalable Neural Video Representations with Learnable Positional Features
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains 2 times faster (in under 5 minutes) but also exceeds their encoding quality, from 34.07 to 34.57 PSNR (dB).
arXiv Detail & Related papers (2022-10-13T08:15:08Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder with hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- Neural Implicit Dictionary via Mixture-of-Expert Training
We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID).
Our NID assembles a group of coordinate-based implicit networks that are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes up to two orders of magnitude faster, with up to 98% less input data.
arXiv Detail & Related papers (2022-07-08T05:07:19Z)
- Variable Bitrate Neural Fields
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
- Neural Residual Flow Fields for Efficient Video Representations
Implicit neural representation (INR) has emerged as a powerful paradigm for representing signals, such as images, videos, 3D shapes, etc.
We propose a novel INR approach to representing and compressing videos by explicitly removing data redundancy.
We show that the proposed method outperforms the baseline methods by a significant margin.
arXiv Detail & Related papers (2022-01-12T06:22:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (or any of the information in it) and is not responsible for any consequences of its use.