Bias for Action: Video Implicit Neural Representations with Bias Modulation
- URL: http://arxiv.org/abs/2501.09277v2
- Date: Fri, 06 Jun 2025 22:46:55 GMT
- Title: Bias for Action: Video Implicit Neural Representations with Bias Modulation
- Authors: Alper Kayabasi, Anil Kumar Vadathya, Guha Balakrishnan, Vishwanath Saragadam
- Abstract summary: We propose a new continuous video modeling framework based on implicit neural representations (INRs) called ActINR. By training the video INR and this bias INR together, we demonstrate unique capabilities, including $10\times$ video slow motion, $4\times$ spatial super resolution along with $2\times$ slow motion, denoising, and video inpainting.
- Score: 8.940264163876968
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new continuous video modeling framework based on implicit neural representations (INRs) called ActINR. At the core of our approach is the observation that INRs can be considered as a learnable dictionary, with the shapes of the basis functions governed by the weights of the INR, and their locations governed by the biases. Given compact non-linear activation functions, we hypothesize that an INR's biases are suitable to capture motion across images, and facilitate compact representations for video sequences. Using these observations, we design ActINR to share INR weights across frames of a video sequence, while using unique biases for each frame. We further model the biases as the output of a separate INR conditioned on time index to promote smoothness. By training the video INR and this bias INR together, we demonstrate unique capabilities, including $10\times$ video slow motion, $4\times$ spatial super resolution along with $2\times$ slow motion, denoising, and video inpainting. ActINR performs remarkably well across numerous video processing tasks (often achieving more than 6 dB improvement), setting a new standard for continuous modeling of videos.
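The weight-sharing scheme the abstract describes can be illustrated with a minimal sketch: a sine-activated (SIREN-style) layer whose weights are shared across all frames, while a small time-conditioned "bias INR" produces per-frame bias offsets that shift the locations of the basis functions. All function names, shapes, and the tanh bias network below are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

def siren_layer(x, W, b, omega=30.0):
    # Sine-activated layer: with a compact nonlinearity, the bias b shifts
    # where each basis function sits, which is the paper's motion hypothesis.
    return np.sin(omega * (x @ W + b))

def bias_inr(t, Wt, bt):
    # Hypothetical bias INR: maps a time index t to per-frame bias offsets.
    return np.tanh(np.array([t]) @ Wt + bt).ravel()

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 16)) / 2           # layer weights, shared across frames
b_static = rng.normal(size=16)             # base bias, modulated per frame
Wt = rng.normal(size=(1, 16))              # bias-INR parameters
bt = rng.normal(size=16)

coords = rng.uniform(-1, 1, size=(4, 2))   # a few (x, y) pixel coordinates

# Two frames: identical weights W, but frame-specific bias modulation.
frames = {}
for t in (0.0, 0.5):
    frames[t] = siren_layer(coords, W, b_static + bias_inr(t, Wt, bt))
```

Because `t` is continuous, evaluating `bias_inr` at intermediate times (e.g. `t = 0.25`) yields interpolated frames, which is the mechanism behind the slow-motion capability claimed above.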
Related papers
- SR-NeRV: Improving Embedding Efficiency of Neural Video Representation via Super-Resolution [0.0]
Implicit Neural Representations (INRs) have garnered significant attention for their ability to model complex signals in various domains.
We propose an INR-based video representation framework that integrates a general-purpose super-resolution (SR) network.
By offloading the reconstruction of fine details to a dedicated SR network pre-trained on natural images, the proposed method improves visual fidelity.
arXiv Detail & Related papers (2025-04-30T03:31:40Z) - CANeRV: Content Adaptive Neural Representation for Video Compression [89.35616046528624]
We propose Content Adaptive Neural Representation for Video Compression (CANeRV).
CANeRV is an innovative INR-based video compression network that adaptively conducts structure optimisation based on the specific content of each video sequence.
We show that CANeRV can outperform both H.266/VVC and state-of-the-art INR-based video compression techniques across diverse video datasets.
arXiv Detail & Related papers (2025-02-10T06:21:16Z) - CoordFlow: Coordinate Flow for Pixel-wise Neural Video Representation [11.364753833652182]
Implicit Neural Representation (INR) is a promising alternative to traditional transform-based methodologies.
We introduce CoordFlow, a novel pixel-wise INR for video compression.
It yields state-of-the-art results compared to other pixel-wise INRs and on-par performance compared to leading frame-wise techniques.
arXiv Detail & Related papers (2025-01-01T22:58:06Z) - PNeRV: A Polynomial Neural Representation for Videos [28.302862266270093]
Extracting Implicit Neural Representations on video poses unique challenges due to the additional temporal dimension.
We introduce Polynomial Neural Representation for Videos (PNeRV)
PNeRV mitigates challenges posed by video data in the realm of INRs but opens new avenues for advanced video processing and analysis.
arXiv Detail & Related papers (2024-06-27T16:15:22Z) - NERV++: An Enhanced Implicit Neural Video Representation [11.25130799452367]
We introduce NeRV++, an enhanced implicit neural representation for videos.
NeRV++ is a more straightforward yet effective enhancement of the original NeRV decoder architecture.
We evaluate our method on the UVG, MCL-JCV, and Bunny datasets, achieving competitive results for video compression with INRs.
arXiv Detail & Related papers (2024-02-28T13:00:32Z) - Progressive Fourier Neural Representation for Sequential Video Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z) - DNeRV: Modeling Inherent Dynamics via Difference Neural Representation for Videos [53.077189668346705]
We propose Difference Neural Representation for Videos (DNeRV).
We analyze this from the perspective of the limitations of function fitting and the importance of frame differences.
DNeRV achieves competitive results against the state-of-the-art neural compression approaches.
arXiv Detail & Related papers (2023-04-13T13:53:49Z) - Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long and/or a large number of videos with diverse visual content.
arXiv Detail & Related papers (2023-03-24T16:32:19Z) - INR-V: A Continuous Representation Space for Video-based Generative Tasks [43.245717657048296]
We propose INR-V, a video representation network that learns a continuous space for video-based generative tasks.
The representation space learned by INR-V is more expressive than an image space, showcasing many interesting properties not possible with existing works.
arXiv Detail & Related papers (2022-10-29T11:54:58Z) - Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior arts, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds their encoding quality, $34.07 \rightarrow 34.57$ (measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z) - VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performances with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z) - Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks [68.93429034530077]
We propose dynamics-aware implicit generative adversarial network (DIGAN) for video generation.
We show that DIGAN can be trained on 128-frame videos of 128x128 resolution, 80 frames longer than the 48 frames of the previous state-of-the-art method.
arXiv Detail & Related papers (2022-02-21T23:24:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.