NIO: Lightweight neural operator-based architecture for video frame interpolation
- URL: http://arxiv.org/abs/2211.10791v1
- Date: Sat, 19 Nov 2022 20:30:47 GMT
- Title: NIO: Lightweight neural operator-based architecture for video frame interpolation
- Authors: Hrishikesh Viswanath, Md Ashiqur Rahman, Rashmi Bhaskara, Aniket Bera
- Abstract summary: NIO is a lightweight, efficient neural operator-based architecture to perform video frame interpolation.
We show that NIO can produce visually-smooth and accurate results and converges in fewer epochs than state-of-the-art approaches.
- Score: 15.875579519177487
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We present NIO (Neural Interpolation Operator), a lightweight, efficient
neural operator-based architecture to perform video frame interpolation.
Current deep learning based methods rely on local convolutions for feature
learning and require a large amount of training on comprehensive datasets.
Furthermore, transformer-based architectures are large and need dedicated GPUs
for training. In contrast, NIO, our neural operator-based approach, learns
the features in the frames by translating the image matrix into the Fourier
space by using Fast Fourier Transform (FFT). The model performs global
convolution, making it discretization invariant. We show that NIO can produce
visually-smooth and accurate results and converges in fewer epochs than
state-of-the-art approaches. To evaluate the visual quality of our interpolated
frames, we calculate the structural similarity index (SSIM) and Peak Signal to
Noise Ratio (PSNR) between the generated frame and the ground truth frame. We
provide the quantitative performance of our model on the Vimeo-90K, DAVIS,
UCF101, and DISFA+ datasets.
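The paper's exact layer definitions are not reproduced here, but the core mechanism the abstract describes — global convolution performed as a pointwise multiplication of Fourier modes — can be sketched in a few lines of NumPy. The function name `spectral_conv2d`, the single-channel setup, and the sizes below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_conv2d(x, weights, modes):
    """Global convolution as pointwise multiplication in Fourier space.

    x:       (H, W) real-valued feature map
    weights: (modes, modes) complex weights for the retained
             low-frequency modes (a real model would learn these)
    """
    x_ft = np.fft.rfft2(x)                       # image -> Fourier space
    out_ft = np.zeros_like(x_ft)
    out_ft[:modes, :modes] = x_ft[:modes, :modes] * weights
    return np.fft.irfft2(out_ft, s=x.shape)      # back to pixel space

w = rng.standard_normal((16, 16)) + 1j * rng.standard_normal((16, 16))

# The same 16x16 mode weights apply at any input resolution; this is
# the discretization invariance the abstract refers to.
lo = spectral_conv2d(rng.standard_normal((64, 64)), w, modes=16)
hi = spectral_conv2d(rng.standard_normal((128, 128)), w, modes=16)
print(lo.shape, hi.shape)  # (64, 64) (128, 128)
```

Because the learned weights act on a fixed set of low-frequency modes rather than on pixel neighborhoods, the kernel's effective receptive field is the whole frame, and the layer transfers unchanged across resolutions.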
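Of the two evaluation metrics named above, PSNR has a closed form that is easy to state; the snippet below is a generic sketch of it (not code from the paper), assuming 8-bit images with a peak value of 255. SSIM is typically computed with a library such as scikit-image's `structural_similarity` rather than by hand.

```python
import numpy as np

def psnr(pred, target, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images (higher is better)."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")        # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

a = np.full((8, 8), 100.0)
b = np.full((8, 8), 110.0)         # constant error of 10 -> MSE = 100
print(round(psnr(a, b), 2))        # 10 * log10(255**2 / 100) ~= 28.13
```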
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence [50.417261057533786]
fVDB is a novel framework for deep learning on large-scale 3D data.
Our framework is fully integrated with PyTorch enabling interoperability with existing pipelines.
arXiv Detail & Related papers (2024-07-01T20:20:33Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
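The summary above describes turning a 1D signal into a 2D time-frequency tensor via the Continuous Wavelet Transform. The TCCT-Net paper's own pipeline is not shown here; the following self-contained sketch (Morlet wavelet, hand-rolled rather than a library call, all parameter values illustrative) shows how a CWT produces such a 2D tensor:

```python
import numpy as np

def morlet_cwt(signal, scales, w0=6.0):
    """Continuous Wavelet Transform with a Morlet mother wavelet.

    Returns a (num_scales, num_samples) magnitude array: a 2D
    time-frequency tensor that downstream layers can treat as an image.
    """
    n = len(signal)
    out = np.empty((len(scales), n))
    t = np.arange(n) - n // 2
    for i, s in enumerate(scales):
        u = t / s
        # Complex Morlet wavelet at scale s, normalized by 1/sqrt(s)
        wavelet = np.exp(1j * w0 * u) * np.exp(-0.5 * u**2) / np.sqrt(s)
        out[i] = np.abs(np.convolve(signal, np.conj(wavelet), mode="same"))
    return out

fs = 128                                  # sampling rate (Hz), illustrative
t = np.arange(2 * fs) / fs                # 2 seconds of signal
sig = np.sin(2 * np.pi * 5 * t)           # a 5 Hz tone
tensor = morlet_cwt(sig, scales=np.arange(1, 33))
print(tensor.shape)                       # (32, 256): scales x time
```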
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Cross-Attention Transformer for Video Interpolation [3.5317804902980527]
TAIN (Transformers and Attention for video INterpolation) aims to interpolate an intermediate frame given two consecutive image frames around it.
We first present a novel visual transformer module, named Cross-Similarity (CS), to globally aggregate input image features with similar appearance as those of the predicted frame.
To account for occlusions in the CS features, we propose an Image Attention (IA) module to allow the network to focus on CS features from one frame over those of the other.
arXiv Detail & Related papers (2022-07-08T21:38:54Z)
- SUMD: Super U-shaped Matrix Decomposition Convolutional neural network for Image denoising [0.0]
We introduce a matrix decomposition module (MD) in the network to establish global context features.
Inspired by the multi-stage progressive restoration design of U-shaped architectures, we further integrate the MD module into multiple branches.
Our model (SUMD) produces visual quality and accuracy comparable to Transformer-based methods.
arXiv Detail & Related papers (2022-04-11T04:38:34Z)
- An Efficient Pattern Mining Convolution Neural Network (CNN) algorithm with Grey Wolf Optimization (GWO) [0.0]
This paper proposes a novel feature analysis method for CNNs based on Convoluted Pattern of Wavelet Transform (CPWT) feature vectors.
The performance of the proposed method is validated by comparison with traditional state-of-the-art methods.
arXiv Detail & Related papers (2022-04-10T15:18:42Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.