Dynamic Frame Interpolation in Wavelet Domain
- URL: http://arxiv.org/abs/2309.03508v2
- Date: Thu, 21 Sep 2023 02:15:08 GMT
- Title: Dynamic Frame Interpolation in Wavelet Domain
- Authors: Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Ying Tai,
Chengjie Wang, Jie Yang
- Abstract summary: Video frame interpolation is an important low-level vision task, which can increase frame rate for a more fluent visual experience.
Existing methods have achieved great success by employing advanced motion models and synthesis networks.
WaveletVFI can reduce computation by up to 40% while maintaining similar accuracy, making it more efficient than other state-of-the-art methods.
- Score: 57.25341639095404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video frame interpolation is an important low-level vision task, which can
increase frame rate for more fluent visual experience. Existing methods have
achieved great success by employing advanced motion models and synthesis
networks. However, the spatial redundancy when synthesizing the target frame
has not been fully explored, which can result in substantial wasted
computation. On the other hand, the achievable degree of computation
compression in frame interpolation depends heavily on both texture
distribution and scene motion, which demands understanding the
spatial-temporal information of each input frame pair to select a better
compression degree. In this work, we propose a novel two-stage frame
interpolation framework, termed WaveletVFI, to address the above problems. It
first estimates intermediate optical flow with a
lightweight motion perception network, and then a wavelet synthesis network
uses flow aligned context features to predict multi-scale wavelet coefficients
with sparse convolution for efficient target frame reconstruction, where the
sparse valid masks that control computation in each scale are determined by a
crucial threshold ratio. Instead of setting a fixed value like previous
methods, we find that embedding a classifier in the motion perception network
to learn a dynamic threshold for each sample can achieve more computation
reduction with almost no loss of accuracy. On the common high-resolution and
animation frame interpolation benchmarks, the proposed WaveletVFI can reduce
computation by up to 40% while maintaining similar accuracy, making it more
efficient than other state-of-the-art methods. Code is available at
https://github.com/ltkong218/WaveletVFI.
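The sparsity idea in the abstract can be made concrete with a toy numpy sketch. This is not the paper's implementation (WaveletVFI predicts coefficients with a sparse-convolution network and learns the threshold per sample via a classifier); the names `haar2d`, `ihaar2d`, `sparse_mask`, and the `ratio=0.3` value are illustrative assumptions. It only shows how a threshold ratio on wavelet coefficient magnitudes yields a valid mask that controls where computation can be skipped.

```python
import numpy as np

def haar2d(x):
    """One-level 2D Haar transform: returns (LL, LH, HL, HH) sub-bands."""
    lo = (x[0::2] + x[1::2]) / 2          # row low-pass
    hi = (x[0::2] - x[1::2]) / 2          # row high-pass
    LL = (lo[:, 0::2] + lo[:, 1::2]) / 2
    LH = (lo[:, 0::2] - lo[:, 1::2]) / 2
    HL = (hi[:, 0::2] + hi[:, 1::2]) / 2
    HH = (hi[:, 0::2] - hi[:, 1::2]) / 2
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Inverse of haar2d (perfect reconstruction when no mask is applied)."""
    H, W = LL.shape
    lo = np.empty((H, 2 * W)); hi = np.empty((H, 2 * W))
    lo[:, 0::2], lo[:, 1::2] = LL + LH, LL - LH
    hi[:, 0::2], hi[:, 1::2] = HL + HH, HL - HH
    x = np.empty((2 * H, 2 * W))
    x[0::2], x[1::2] = lo + hi, lo - hi
    return x

def sparse_mask(bands, ratio):
    """Valid mask: keep positions whose high-frequency magnitude exceeds
    `ratio` times the maximum magnitude; elsewhere computation is skipped."""
    mag = np.max(np.abs(np.stack(bands)), axis=0)
    return mag > ratio * mag.max()

frame = np.outer(np.linspace(0, 1, 8), np.linspace(0, 1, 8))  # toy target frame
LL, LH, HL, HH = haar2d(frame)
mask = sparse_mask([LH, HL, HH], ratio=0.3)   # larger ratio -> sparser compute
approx = ihaar2d(LL, LH * mask, HL * mask, HH * mask)
```

In smooth regions the high-frequency bands are near zero, so the mask is mostly off there; texture-rich regions keep their coefficients, which is the texture-dependent compression the abstract describes.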
Related papers
- Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
- Video Frame Interpolation with Many-to-many Splatting and Spatial Selective Refinement [83.60486465697318]
We propose a fully differentiable Many-to-Many (M2M) splatting framework to interpolate frames efficiently.
For each input frame pair, M2M has a minuscule computational overhead when interpolating an arbitrary number of in-between frames.
We extend M2M into an M2M++ framework by introducing a flexible Spatial Selective Refinement component, which allows trading computational efficiency for quality and vice versa.
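As a rough illustration of what forward splatting means here, the toy sketch below scatters each source pixel along its scaled flow vector into the target grid, averaging overlaps. The name `splat_forward` and the nearest-neighbor rounding are assumptions for brevity; M2M itself splats many sub-pixel flow vectors per pixel with soft weights.

```python
import numpy as np

def splat_forward(frame, flow, t):
    """Toy forward splatting: move each source pixel along t * flow and
    scatter it into the target grid; overlapping contributions are averaged."""
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    tx = np.clip(np.rint(xs + t * flow[..., 0]).astype(int), 0, W - 1)
    ty = np.clip(np.rint(ys + t * flow[..., 1]).astype(int), 0, H - 1)
    acc = np.zeros((H, W))
    cnt = np.zeros((H, W))
    np.add.at(acc, (ty, tx), frame)   # accumulate splatted intensities
    np.add.at(cnt, (ty, tx), 1.0)     # count contributions per target pixel
    return np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)

frame = np.eye(4)                               # toy 4x4 frame
flow = np.zeros((4, 4, 2)); flow[..., 0] = 2.0  # every pixel moves 2 px right
half = splat_forward(frame, flow, 0.5)          # intermediate frame at t=0.5
```

Because the flow is only scaled by `t`, interpolating any number of in-between frames reuses the same motion estimate, which is the source of M2M's small per-frame overhead.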
arXiv Detail & Related papers (2023-10-29T09:09:32Z)
- IDO-VFI: Identifying Dynamics via Optical Flow Guidance for Video Frame Interpolation with Events [14.098949778274733]
Event cameras are ideal for capturing inter-frame dynamics with their extremely high temporal resolution.
We propose an event-and-frame-based video frame interpolation method named IDO-VFI that assigns varying amounts of computation to different sub-regions.
Our proposed method maintains high-quality performance while reducing computation time and computational effort by 10% and 17% respectively on the Vimeo90K dataset.
arXiv Detail & Related papers (2023-05-17T13:22:21Z)
- Progressive Motion Context Refine Network for Efficient Video Frame Interpolation [10.369068266836154]
Flow-based frame interpolation methods have achieved great success by first modeling optical flow between target and input frames, and then building a synthesis network for target frame generation.
We propose a novel Progressive Motion Context Refine Network (PMCRNet) to predict motion fields and image context jointly for higher efficiency.
Experiments on multiple benchmarks show that the proposed approach not only achieves favorable qualitative and quantitative results but also significantly reduces model size and running time.
arXiv Detail & Related papers (2022-11-11T06:29:03Z)
- Enhanced Bi-directional Motion Estimation for Video Frame Interpolation [0.05541644538483946]
We present a novel yet effective algorithm for motion-based video frame interpolation.
Our method achieves excellent performance on a broad range of video frame interpolation benchmarks.
arXiv Detail & Related papers (2022-06-17T06:08:43Z)
- Long-term Video Frame Interpolation via Feature Propagation [95.18170372022703]
Video frame interpolation (VFI) methods generally predict intermediate frame(s) by first estimating the motion between inputs and then warping the inputs to the target time with the estimated motion.
This approach is not optimal when the temporal distance between the input frames increases.
We propose a propagation network (PNet) by extending the classic feature-level forecasting with a novel motion-to-feature approach.
arXiv Detail & Related papers (2022-03-29T10:47:06Z)
- FILM: Frame Interpolation for Large Motion [20.04001872133824]
We present a frame interpolation algorithm that synthesizes multiple intermediate frames from two input images with large in-between motion.
Our approach outperforms state-of-the-art methods on the Xiph large motion benchmark.
arXiv Detail & Related papers (2022-02-10T08:48:18Z)
- FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation.
We demonstrate that FLAVR can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
- All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling [52.425236515695914]
State-of-the-art methods are iterative solutions that interpolate one frame at a time.
This work introduces a true multi-frame interpolator.
It utilizes a pyramidal style network in the temporal domain to complete the multi-frame task in one shot.
arXiv Detail & Related papers (2020-07-23T02:34:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.