FFNeRV: Flow-Guided Frame-Wise Neural Representations for Videos
- URL: http://arxiv.org/abs/2212.12294v2
- Date: Mon, 7 Aug 2023 01:21:19 GMT
- Title: FFNeRV: Flow-Guided Frame-Wise Neural Representations for Videos
- Authors: Joo Chan Lee, Daniel Rho, Jong Hwan Ko, Eunbyung Park
- Abstract summary: We propose FFNeRV, a novel method for incorporating flow information into frame-wise representations to exploit the temporal redundancy across the frames in videos.
With model compression techniques, FFNeRV outperforms widely-used standard video codecs (H.264 and HEVC) and performs on par with state-of-the-art video compression algorithms.
- Score: 5.958701846880935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural fields, also known as coordinate-based or implicit neural
representations, have shown a remarkable capability of representing,
generating, and manipulating various forms of signals. For video
representations, however, mapping pixel-wise coordinates to RGB colors has
shown relatively low compression performance and slow convergence and inference
speed. Frame-wise video representation, which maps a temporal coordinate to its
entire frame, has recently emerged as an alternative method to represent
videos, improving compression rates and encoding speed. While promising, it has
still failed to reach the performance of state-of-the-art video compression
algorithms. In this work, we propose FFNeRV, a novel method for incorporating
flow information into frame-wise representations to exploit the temporal
redundancy across the frames in videos, inspired by standard video codecs.
Furthermore, we introduce a fully convolutional architecture, enabled by
one-dimensional temporal grids, improving the continuity of spatial features.
Experimental results show that FFNeRV yields the best performance for video
compression and frame interpolation among the methods using frame-wise
representations or neural fields. To reduce the model size even further, we
devise a more compact convolutional architecture using the group and pointwise
convolutions. With model compression techniques, including quantization-aware
training and entropy coding, FFNeRV outperforms widely-used standard video
codecs (H.264 and HEVC) and performs on par with state-of-the-art video
compression algorithms.
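The flow-guided idea described above can be sketched roughly as follows: a frame at time t is predicted by warping neighboring frames with per-pixel flows and blending them with an independently generated frame. This is a minimal illustrative sketch, not the authors' implementation; the names `warp` and `aggregate`, the nearest-neighbor sampling, and the fixed weights are all assumptions (FFNeRV learns its flows and aggregation weights, and warping is typically done with bilinear sampling).

```python
import numpy as np

def warp(frame, flow):
    """Backward-warp `frame` (H, W, 3) by a per-pixel `flow` (H, W, 2),
    using nearest-neighbor sampling clamped at the borders.
    flow[..., 0] is the vertical offset, flow[..., 1] the horizontal one."""
    H, W, _ = frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.round(ys + flow[..., 0]).astype(int), 0, H - 1)
    src_x = np.clip(np.round(xs + flow[..., 1]).astype(int), 0, W - 1)
    return frame[src_y, src_x]

def aggregate(independent, neighbors, flows, weights):
    """Blend an independently generated frame with flow-warped neighboring
    frames, using per-source weights that sum to 1. This exploits temporal
    redundancy: most of the frame is borrowed from its neighbors."""
    out = weights[0] * independent
    for w, nb, fl in zip(weights[1:], neighbors, flows):
        out = out + w * warp(nb, fl)
    return out

# Demo: with zero flow and equal weights, identical frames are reproduced.
f = np.full((4, 4, 3), 0.25)
blended = aggregate(f, [f], [np.zeros((4, 4, 2))], [0.5, 0.5])
print(np.allclose(blended, f))  # True
```

In the paper the flows, the aggregation weights, and the independently generated frame are all outputs of the frame-wise network, so the blend is learned end to end rather than hand-set as here.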
Related papers
- High-Efficiency Neural Video Compression via Hierarchical Predictive Learning [27.41398149573729]
Enhanced Deep Hierarchical Video Compression (DHVC 2.0) delivers superior compression performance and impressive complexity efficiency.
Uses hierarchical predictive coding to transform each video frame into multiscale representations.
Supports transmission-friendly progressive decoding, making it particularly advantageous for networked video applications in the presence of packet loss.
arXiv Detail & Related papers (2024-10-03T15:40:58Z) - Boosting Neural Representations for Videos with a Conditional Decoder [28.073607937396552]
Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing.
This paper introduces a universal boosting framework for current implicit video representation approaches.
arXiv Detail & Related papers (2024-02-28T08:32:19Z) - Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems [1.5390526524075634]
We propose a novel system which conveys temporal redundancy within a sparse decompressed representation.
We leverage a video representation framework called ADDER to transcode framed videos to sparse, asynchronous intensity samples.
Our work paves the way for upcoming neuromorphic sensors and is amenable to future applications with spiking neural networks.
arXiv Detail & Related papers (2023-12-13T15:30:29Z) - VNVC: A Versatile Neural Video Coding Framework for Efficient Human-Machine Vision [59.632286735304156]
It is more efficient to enhance/analyze the coded representations directly without decoding them into pixels.
We propose a versatile neural video coding (VNVC) framework, which targets learning compact representations to support both reconstruction and direct enhancement/analysis.
arXiv Detail & Related papers (2023-06-19T03:04:57Z) - HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation [14.088444622391501]
Implicit Neural Representations (INRs) have previously been used to represent and compress image and video content.
Existing INR-based methods have failed to deliver rate-quality performance comparable with the state of the art in video compression.
We propose HiNeRV, an INR that combines lightweight layers with hierarchical positional encodings.
arXiv Detail & Related papers (2023-06-16T12:59:52Z) - HNeRV: A Hybrid Neural Representation for Videos [56.492309149698606]
Implicit neural representations store videos as neural networks.
We propose a Hybrid Neural Representation for Videos (HNeRV)
With content-adaptive embeddings and re-designed architecture, HNeRV outperforms implicit methods in video regression tasks.
arXiv Detail & Related papers (2023-04-05T17:55:04Z) - Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains twice as fast (in under 5 minutes) but also exceeds their encoding quality, improving PSNR from 34.07 dB to 34.57 dB.
arXiv Detail & Related papers (2022-10-13T08:15:08Z) - Neighbor Correspondence Matching for Flow-based Video Frame Synthesis [90.14161060260012]
We introduce a neighbor correspondence matching (NCM) algorithm for flow-based frame synthesis.
NCM is performed in a current-frame-agnostic fashion to establish multi-scale correspondences in the spatial-temporal neighborhoods of each pixel.
The coarse-scale module is designed to leverage neighbor correspondences to capture large motion, while the fine-scale module is more efficient, speeding up the estimation process.
arXiv Detail & Related papers (2022-07-14T09:17:00Z) - Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z) - Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides (including all of the above) and is not responsible for any consequences arising from its use.