Video Frame Interpolation via Structure-Motion based Iterative Fusion
- URL: http://arxiv.org/abs/2105.05353v1
- Date: Tue, 11 May 2021 22:11:17 GMT
- Title: Video Frame Interpolation via Structure-Motion based Iterative Fusion
- Authors: Xi Li, Meng Cao, Yingying Tang, Scott Johnston, Zhendong Hong, Huimin
Ma, Jiulong Shan
- Abstract summary: We propose a structure-motion based iterative fusion method for video frame interpolation.
Inspired by the observation that audiences have different visual preferences for foreground and background objects, we are the first to propose using saliency masks in the evaluation process of video frame interpolation.
- Score: 19.499969588931414
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Video Frame Interpolation synthesizes non-existent images between adjacent
frames, with the aim of providing a smooth and consistent visual experience.
Two approaches for solving this challenging task are optical flow based and
kernel-based methods. In existing works, optical flow based methods provide
accurate point-to-point motion descriptions but lack constraints on object
structure. In contrast, kernel-based methods focus on structural alignment,
which relies on semantic and appearance features, but tend to produce blurry
results. Based on these observations, we propose a structure-motion based
iterative fusion method. The framework is an end-to-end learnable structure
with two stages. First, interpolated frames are synthesized by structure-based
and motion-based learning branches respectively, then, an iterative refinement
module is established via spatial and temporal feature integration. Inspired by
the observation that audiences have different visual preferences for foreground
and background objects, we are the first to propose using saliency masks in
the evaluation process of video frame interpolation. Experimental
results on three typical benchmarks show that the proposed method achieves
superior performance on all evaluation metrics over the state-of-the-art
methods, even when our models are trained with only one-tenth of the data other
methods use.
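The saliency-masked evaluation described in the abstract can be sketched as a masked PSNR computation. This is a minimal illustration, not the paper's actual protocol: the exact masking scheme is an assumption, and the toy mask below is hypothetical.

```python
# Sketch of saliency-masked PSNR: compute PSNR only over pixels a saliency
# mask marks as foreground (or background). The weighting scheme here is an
# assumption for illustration; the paper's summary does not give a formula.
import numpy as np

def psnr(pred, target, mask=None, max_val=1.0):
    """PSNR over all pixels, or restricted to a boolean saliency mask."""
    pred = pred.astype(np.float64)
    target = target.astype(np.float64)
    err = (pred - target) ** 2
    mse = err[mask.astype(bool)].mean() if mask is not None else err.mean()
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a prediction that is exact on the salient foreground but noisy
# on the background scores perfectly under the foreground mask, showing why a
# masked metric can rank results differently from a full-frame one.
rng = np.random.default_rng(0)
target = rng.random((64, 64))
pred = target.copy()
fg = np.zeros((64, 64), dtype=bool)
fg[16:48, 16:48] = True      # hypothetical saliency mask (foreground region)
pred[~fg] += 0.1             # corrupt only the background
print(psnr(pred, target, mask=fg))   # foreground: zero error -> inf
print(psnr(pred, target, mask=~fg))  # background: finite PSNR (about 20 dB)
```

Separating foreground and background scores this way lets an evaluation weight the regions audiences attend to, which is the motivation the abstract gives for saliency masks.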
Related papers
- Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
- Tsanet: Temporal and Scale Alignment for Unsupervised Video Object Segmentation [21.19216164433897]
Unsupervised Video Object Segmentation (UVOS) refers to the challenging task of segmenting the prominent object in videos without manual guidance.
We propose a novel framework for UVOS that can address the aforementioned limitations of the two approaches.
We present experimental results on public benchmark datasets, DAVIS 2016 and FBMS, which demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-03-08T04:59:43Z)
- Meta-Interpolation: Time-Arbitrary Frame Interpolation via Dual Meta-Learning [65.85319901760478]
We consider processing different time-steps with adaptively generated convolutional kernels in a unified way with the help of meta-learning.
We develop a dual meta-learned frame framework to synthesize intermediate frames with the guidance of context information and optical flow.
arXiv Detail & Related papers (2022-07-27T17:36:23Z)
- TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches.
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z)
- EA-Net: Edge-Aware Network for Flow-based Video Frame Interpolation [101.75999290175412]
We propose to reduce the image blur and get the clear shape of objects by preserving the edges in the interpolated frames.
The proposed Edge-Aware Network (EANet) integrates edge information into the frame interpolation task.
Three edge-aware mechanisms are developed to emphasize the frame edges in estimating flow maps.
arXiv Detail & Related papers (2021-05-17T08:44:34Z)
- Video Frame Interpolation via Generalized Deformable Convolution [18.357839820102683]
Video frame interpolation aims at synthesizing intermediate frames from nearby source frames while maintaining spatial and temporal consistency.
Existing deep-learning-based video frame interpolation methods can be divided into two categories: flow-based methods and kernel-based methods.
A novel mechanism named generalized deformable convolution is proposed, which can effectively learn motion in a data-driven manner and freely select sampling points in space-time.
arXiv Detail & Related papers (2020-08-24T20:00:39Z)
- Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
- Efficient Semantic Video Segmentation with Per-frame Inference [117.97423110566963]
In this work, we perform efficient semantic video segmentation in a per-frame fashion during inference.
We employ compact models for real-time execution. To narrow the performance gap between compact models and large models, new knowledge distillation methods are designed.
arXiv Detail & Related papers (2020-02-26T12:24:32Z)
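As background for the flow-based interpolation methods surveyed above, the core operation they share is warping a source frame by a (scaled) optical flow field. The sketch below is a minimal illustration of that idea under a linear-motion assumption; the function name and setup are illustrative, not taken from any specific paper.

```python
# Minimal sketch of flow-based frame interpolation: backward-warp a source
# frame toward time t by scaling the optical flow, with bilinear sampling.
import numpy as np

def warp(frame, flow):
    """Backward-warp an (H, W) frame by an (H, W, 2) flow field (dx, dy)."""
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x = np.clip(xs + flow[..., 0], 0, w - 1)   # sample coordinates
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = frame[y0, x0] * (1 - wx) + frame[y0, x1] * wx
    bot = frame[y1, x0] * (1 - wx) + frame[y1, x1] * wx
    return top * (1 - wy) + bot * wy

# Interpolate at t = 0.5 by halving the frame-to-frame flow (the common
# linear-motion assumption behind many flow-based methods).
frame0 = np.arange(16, dtype=np.float64).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0                 # uniform 1-px rightward motion
mid = warp(frame0, 0.5 * flow)     # approximate the t = 0.5 frame
print(mid[0])                      # values shifted half a pixel
```

Real methods add occlusion reasoning, learned refinement, and per-pixel flow estimation on top of this primitive; the abstract's structure-based branch exists precisely because this warping alone does not constrain object structure.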
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.