Neural Residual Flow Fields for Efficient Video Representations
- URL: http://arxiv.org/abs/2201.04329v1
- Date: Wed, 12 Jan 2022 06:22:09 GMT
- Title: Neural Residual Flow Fields for Efficient Video Representations
- Authors: Daniel Rho, Junwoo Cho, Jong Hwan Ko, Eunbyung Park
- Abstract summary: Implicit neural representation (INR) has emerged as a powerful paradigm for representing signals, such as images, videos, 3D shapes, etc.
We propose a novel INR approach to representing and compressing videos by explicitly removing data redundancy.
We show that the proposed method outperforms the baseline methods by a significant margin.
- Score: 5.904082461511478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Implicit neural representation (INR) has emerged as a powerful paradigm for
representing signals, such as images, videos, 3D shapes, etc. Although it has
shown the ability to represent fine details, its efficiency as a data
representation has not been extensively studied. In INR, the data is stored in
the form of parameters of a neural network and general purpose optimization
algorithms do not generally exploit the spatial and temporal redundancy in
signals. In this paper, we suggest a novel INR approach to representing and
compressing videos by explicitly removing data redundancy. Instead of storing
raw RGB colors, we propose Neural Residual Flow Fields (NRFF), using motion
information across video frames and residuals that are necessary to reconstruct
a video. Maintaining the motion information, which is usually smoother and less
complex than the raw signals, requires far fewer parameters. Furthermore,
reusing redundant pixel values further improves the network parameter
efficiency. Experimental results have shown that the proposed method
outperforms the baseline methods by a significant margin. The code is available
at https://github.com/daniel03c1/eff_video_representation.
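The idea in the abstract, reusing pixels from another frame via motion and storing only a residual correction, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: in NRFF the flow and residual are produced by neural networks, whereas here plain arrays stand in for those outputs. The function names `warp_nearest` and `reconstruct_frame`, the backward-warping convention, the (dy, dx) flow layout, and nearest-neighbour sampling are all assumptions made for illustration.

```python
import numpy as np


def warp_nearest(reference, flow):
    """Backward-warp `reference` (H, W, C) using `flow` (H, W, 2).

    flow[y, x] holds a hypothetical (dy, dx) offset: the output pixel at
    (y, x) is copied from reference[y + dy, x + dx], rounded to the nearest
    pixel and clamped to the image border.
    """
    h, w = reference.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + flow[..., 1]).astype(int), 0, w - 1)
    return reference[src_y, src_x]


def reconstruct_frame(reference, flow, residual):
    """Reconstruct a frame as warped reference pixels plus a residual."""
    return np.clip(warp_nearest(reference, flow) + residual, 0.0, 1.0)


# Toy example: the target frame is the reference shifted right by one pixel
# (with wrap-around, so the leftmost column cannot be explained by motion).
rng = np.random.default_rng(0)
reference = rng.random((4, 5, 3))
target = np.roll(reference, 1, axis=1)

# Every target pixel looks one pixel to the left in the reference frame.
flow = np.zeros((4, 5, 2))
flow[..., 1] = -1.0

# The residual only has to fix what the motion could not reproduce.
residual = target - warp_nearest(reference, flow)
reconstruction = reconstruct_frame(reference, flow, residual)

print("max reconstruction error:", np.abs(reconstruction - target).max())
print("non-zero residual entries:", np.count_nonzero(residual), "of", residual.size)
```

In this toy case the flow field is constant and the residual is zero everywhere except the leftmost column, which hints at why such signals may be cheaper to represent than raw RGB values.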
Related papers
- NERV++: An Enhanced Implicit Neural Video Representation [11.25130799452367]
We introduce NeRV++, an enhanced implicit neural video representation.
NeRV++ is a more straightforward yet effective enhancement over the original NeRV decoder architecture.
We evaluate our method on UVG, MCL JVC, and Bunny datasets, achieving competitive results for video compression with INRs.
arXiv Detail & Related papers (2024-02-28T13:00:32Z) - FFEINR: Flow Feature-Enhanced Implicit Neural Representation for
Spatio-temporal Super-Resolution [4.577685231084759]
This paper proposes a Feature-Enhanced Neural Implicit Representation (FFEINR) for super-resolution of flow field data.
It can take full advantage of the implicit neural representation in terms of model structure and sampling resolution.
The training process of FFEINR is facilitated by introducing feature enhancements for the input layer.
arXiv Detail & Related papers (2023-08-24T02:28:18Z) - Rapid-INR: Storage Efficient CPU-free DNN Training Using Implicit Neural Representation [7.539498729072623]
Implicit Neural Representation (INR) is an innovative approach for representing complex shapes or objects without explicitly defining their geometry or surface structure.
Previous research has demonstrated the effectiveness of using neural networks as INR for image compression, showcasing comparable performance to traditional methods such as JPEG.
This paper introduces Rapid-INR, a novel approach that utilizes INR for encoding and compressing images, thereby accelerating neural network training in computer vision tasks.
arXiv Detail & Related papers (2023-06-29T05:49:07Z) - Progressive Fourier Neural Representation for Sequential Video
Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z) - Modality-Agnostic Variational Compression of Implicit Neural
Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR).
Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism.
After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z) - Versatile Neural Processes for Learning Implicit Neural Representations [57.090658265140384]
We propose Versatile Neural Processes (VNP), which substantially increase the capability of approximating functions.
Specifically, we introduce a bottleneck encoder that produces fewer but informative context tokens, relieving the high computational cost.
We demonstrate the effectiveness of the proposed VNP on a variety of tasks involving 1D, 2D and 3D signals.
arXiv Detail & Related papers (2023-01-21T04:08:46Z) - Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds its encoding quality, 34.07 $\rightarrow$ 34.57 (measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z) - E-NeRV: Expedite Neural Video Representation with Disentangled
Spatial-Temporal Context [14.549945320069892]
We propose E-NeRV, which dramatically expedites NeRV by decomposing the image-wise implicit neural representation into separate spatial and temporal context.
We experimentally find that our method can improve the performance to a large extent with fewer parameters, resulting in more than $8\times$ faster convergence.
arXiv Detail & Related papers (2022-07-17T10:16:47Z) - Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z) - Meta-Learning Sparse Implicit Neural Representations [69.15490627853629]
Implicit neural representations are a promising new avenue of representing general signals.
The current approach is difficult to scale to a large number of signals or a data set.
We show that meta-learned sparse neural representations achieve a much smaller loss than dense meta-learned models.
arXiv Detail & Related papers (2021-10-27T18:02:53Z)