Many-to-many Splatting for Efficient Video Frame Interpolation
- URL: http://arxiv.org/abs/2204.03513v1
- Date: Thu, 7 Apr 2022 15:29:42 GMT
- Title: Many-to-many Splatting for Efficient Video Frame Interpolation
- Authors: Ping Hu, Simon Niklaus, Stan Sclaroff, Kate Saenko
- Abstract summary: Motion-based video frame interpolation relies on optical flow to warp pixels from the inputs to the desired instant.
We propose a Many-to-Many (M2M) splatting framework to interpolate frames efficiently.
M2M incurs a minuscule computational overhead when interpolating an arbitrary number of in-between frames.
- Score: 80.10804399840927
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motion-based video frame interpolation commonly relies on optical flow to
warp pixels from the inputs to the desired interpolation instant. Yet due to
the inherent challenges of motion estimation (e.g. occlusions and
discontinuities), most state-of-the-art interpolation approaches require
subsequent refinement of the warped result to generate satisfying outputs,
which drastically decreases the efficiency for multi-frame interpolation. In
this work, we propose a fully differentiable Many-to-Many (M2M) splatting
framework to interpolate frames efficiently. Specifically, given a frame pair,
we estimate multiple bidirectional flows to directly forward warp the pixels to
the desired time step, and then fuse any overlapping pixels. In doing so, each
source pixel renders multiple target pixels and each target pixel can be
synthesized from a larger area of visual context. This establishes a
many-to-many splatting scheme with robustness to artifacts like holes.
Moreover, for each input frame pair, M2M only performs motion estimation once
and has a minuscule computational overhead when interpolating an arbitrary
number of in-between frames, hence achieving fast multi-frame interpolation. We
conducted extensive experiments to analyze M2M, and found that it significantly
improves efficiency while maintaining high effectiveness.
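To make the warp-and-fuse step concrete, below is a minimal sketch of forward splatting with overlap fusion, assuming PyTorch. It rounds target coordinates to the nearest pixel and averages any overlapping contributions; the paper's actual operator is fully differentiable (bilinear splatting over multiple estimated flows with learned fusion), and `forward_splat` is an illustrative name, not the authors' released API.

```python
# Hypothetical sketch of forward warping ("splatting") with overlap fusion.
# Nearest-neighbor rounding and plain averaging stand in for the paper's
# differentiable bilinear splatting and learned fusion of overlapping pixels.
import torch

def forward_splat(frame, flow, t):
    """Warp `frame` (B,C,H,W) to time t in [0,1] along `flow` (B,2,H,W)."""
    b, c, h, w = frame.shape
    gy, gx = torch.meshgrid(
        torch.arange(h, device=frame.device),
        torch.arange(w, device=frame.device),
        indexing="ij",
    )  # source pixel grid
    # Scale the inter-frame flow to the intermediate time step t.
    tx = (gx + t * flow[:, 0]).round().long().clamp(0, w - 1)
    ty = (gy + t * flow[:, 1]).round().long().clamp(0, h - 1)
    idx = (ty * w + tx).view(b, 1, -1).expand(-1, c, -1)  # flat target indices
    out = torch.zeros_like(frame).view(b, c, -1)
    cnt = torch.zeros(b, 1, h * w, device=frame.device)
    # Accumulate colors and hit counts at each target pixel, then average
    # wherever several source pixels splat onto the same location.
    out.scatter_add_(2, idx, frame.reshape(b, c, -1))
    cnt.scatter_add_(2, idx[:, :1], torch.ones_like(cnt))
    return (out / cnt.clamp(min=1)).view(b, c, h, w)
```

With bidirectional flows between a frame pair, the intermediate frame at time t would then be a fusion of forward_splat(frame0, flow_0to1, t) and forward_splat(frame1, flow_1to0, 1 - t); because the flows are estimated only once per pair, each additional time step costs just one more splatting pass.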
Related papers
- ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler [53.98558445900626]
Current image-to-video diffusion models, while powerful in generating videos from a single frame, need adaptation for two-frame conditioned generation.
We introduce a novel, bidirectional sampling strategy to address these off-manifold issues without requiring extensive re-noising or fine-tuning.
Our method employs sequential sampling along both forward and backward paths, conditioned on the start and end frames, respectively, ensuring more coherent and on-manifold generation of intermediate frames.
arXiv Detail & Related papers (2024-10-08T03:01:54Z)
- Video Frame Interpolation with Many-to-many Splatting and Spatial Selective Refinement [83.60486465697318]
We propose a fully differentiable Many-to-Many (M2M) splatting framework to interpolate frames efficiently.
For each input frame pair, M2M has a minuscule computational overhead when interpolating an arbitrary number of in-between frames.
We then extend M2M into an M2M++ framework by introducing a flexible Spatial Selective Refinement component, which allows trading computational efficiency for quality and vice versa.
arXiv Detail & Related papers (2023-10-29T09:09:32Z)
- Dynamic Frame Interpolation in Wavelet Domain [57.25341639095404]
Video frame interpolation is an important low-level computer vision task that can increase the frame rate for a more fluent visual experience.
Existing methods have achieved great success by employing advanced motion models and synthesis networks.
WaveletVFI reduces computation by up to 40% while maintaining similar accuracy, making it more efficient than other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-07T06:41:15Z)
- Efficient Video Deblurring Guided by Motion Magnitude [37.25713728458234]
We propose a novel framework that utilizes the motion magnitude prior (MMP) as guidance for efficient deep video deblurring.
The MMP consists of both spatial and temporal blur level information, which can be further integrated into an efficient recurrent neural network (RNN) for video deblurring.
arXiv Detail & Related papers (2022-07-27T08:57:48Z)
- FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation.
We demonstrate that FLAVR can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
- All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling [52.425236515695914]
State-of-the-art methods are iterative solutions that interpolate one frame at a time.
This work introduces a true multi-frame interpolator.
It utilizes a pyramidal style network in the temporal domain to complete the multi-frame task in one-shot.
arXiv Detail & Related papers (2020-07-23T02:34:39Z)