Dynamic Frame Interpolation in Wavelet Domain
- URL: http://arxiv.org/abs/2309.03508v2
- Date: Thu, 21 Sep 2023 02:15:08 GMT
- Title: Dynamic Frame Interpolation in Wavelet Domain
- Authors: Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Ying Tai,
Chengjie Wang, Jie Yang
- Abstract summary: Video frame interpolation is an important low-level vision task, which can increase frame rate for a more fluent visual experience.
Existing methods have achieved great success by employing advanced motion models and synthesis networks.
WaveletVFI can reduce computation by up to 40% while maintaining similar accuracy, making it more efficient than other state-of-the-art methods.
- Score: 57.25341639095404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video frame interpolation is an important low-level vision task, which can
increase frame rate for more fluent visual experience. Existing methods have
achieved great success by employing advanced motion models and synthesis
networks. However, the spatial redundancy when synthesizing the target frame
has not been fully explored, which can result in substantial wasted
computation. On the other hand, the achievable degree of computation
compression in frame interpolation depends heavily on both texture
distribution and scene motion, which demands understanding the
spatial-temporal information of each input frame pair to select a better
compression degree. In this work, we propose a novel two-stage frame
interpolation framework, termed WaveletVFI, to address the above problems. It
first estimates intermediate optical flow with a
lightweight motion perception network, and then a wavelet synthesis network
uses flow aligned context features to predict multi-scale wavelet coefficients
with sparse convolution for efficient target frame reconstruction, where the
sparse valid masks that control computation in each scale are determined by a
crucial threshold ratio. Instead of setting a fixed value like previous
methods, we find that embedding a classifier in the motion perception network
to learn a dynamic threshold for each sample can achieve more computation
reduction with almost no loss of accuracy. On the common high-resolution and
animation frame interpolation benchmarks, the proposed WaveletVFI can reduce
computation by up to 40% while maintaining similar accuracy, making it more
efficient than other state-of-the-art methods. Code is available at
https://github.com/ltkong218/WaveletVFI.
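The sparsity idea in the abstract can be made concrete with a toy numpy sketch. This is not the paper's implementation (WaveletVFI predicts coefficients with a sparse-convolution network and learns the threshold per sample via a classifier); the names `haar2d`, `ihaar2d`, `sparse_mask`, and the `ratio=0.3` value are illustrative assumptions. It only shows how a threshold ratio on wavelet coefficient magnitudes yields a valid mask that controls where computation can be skipped.

```python
import numpy as np

def haar2d(x):
    """One-level 2D Haar transform: returns (LL, LH, HL, HH) sub-bands."""
    lo = (x[0::2] + x[1::2]) / 2          # row low-pass
    hi = (x[0::2] - x[1::2]) / 2          # row high-pass
    LL = (lo[:, 0::2] + lo[:, 1::2]) / 2
    LH = (lo[:, 0::2] - lo[:, 1::2]) / 2
    HL = (hi[:, 0::2] + hi[:, 1::2]) / 2
    HH = (hi[:, 0::2] - hi[:, 1::2]) / 2
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Inverse of haar2d (perfect reconstruction when no mask is applied)."""
    H, W = LL.shape
    lo = np.empty((H, 2 * W)); hi = np.empty((H, 2 * W))
    lo[:, 0::2], lo[:, 1::2] = LL + LH, LL - LH
    hi[:, 0::2], hi[:, 1::2] = HL + HH, HL - HH
    x = np.empty((2 * H, 2 * W))
    x[0::2], x[1::2] = lo + hi, lo - hi
    return x

def sparse_mask(bands, ratio):
    """Valid mask: keep positions whose high-frequency magnitude exceeds
    `ratio` times the maximum magnitude; elsewhere computation is skipped."""
    mag = np.max(np.abs(np.stack(bands)), axis=0)
    return mag > ratio * mag.max()

frame = np.outer(np.linspace(0, 1, 8), np.linspace(0, 1, 8))  # toy target frame
LL, LH, HL, HH = haar2d(frame)
mask = sparse_mask([LH, HL, HH], ratio=0.3)   # larger ratio -> sparser compute
approx = ihaar2d(LL, LH * mask, HL * mask, HH * mask)
```

In smooth regions the high-frequency bands are near zero, so the mask is mostly off there; texture-rich regions keep their coefficients, which is the texture-dependent compression the abstract describes.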
Related papers
- Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
- Video Frame Interpolation with Many-to-many Splatting and Spatial Selective Refinement [83.60486465697318]
We propose a fully differentiable Many-to-Many (M2M) splatting framework to interpolate frames efficiently.
For each input frame pair, M2M has a minuscule computational overhead when interpolating an arbitrary number of in-between frames.
We extend M2M into an M2M++ framework by introducing a flexible Spatial Selective Refinement component, which allows trading computational efficiency for quality and vice versa.
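As a rough illustration of what forward splatting means here, the toy sketch below scatters each source pixel along its scaled flow vector into the target grid, averaging overlaps. The name `splat_forward` and the nearest-neighbor rounding are assumptions for brevity; M2M itself splats many sub-pixel flow vectors per pixel with soft weights.

```python
import numpy as np

def splat_forward(frame, flow, t):
    """Toy forward splatting: move each source pixel along t * flow and
    scatter it into the target grid; overlapping contributions are averaged."""
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    tx = np.clip(np.rint(xs + t * flow[..., 0]).astype(int), 0, W - 1)
    ty = np.clip(np.rint(ys + t * flow[..., 1]).astype(int), 0, H - 1)
    acc = np.zeros((H, W))
    cnt = np.zeros((H, W))
    np.add.at(acc, (ty, tx), frame)   # accumulate splatted intensities
    np.add.at(cnt, (ty, tx), 1.0)     # count contributions per target pixel
    return np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)

frame = np.eye(4)                               # toy 4x4 frame
flow = np.zeros((4, 4, 2)); flow[..., 0] = 2.0  # every pixel moves 2 px right
half = splat_forward(frame, flow, 0.5)          # intermediate frame at t=0.5
```

Because the flow is only scaled by `t`, interpolating any number of in-between frames reuses the same motion estimate, which is the source of M2M's small per-frame overhead.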
arXiv Detail & Related papers (2023-10-29T09:09:32Z)
- IDO-VFI: Identifying Dynamics via Optical Flow Guidance for Video Frame Interpolation with Events [14.098949778274733]
Event cameras are ideal for capturing inter-frame dynamics with their extremely high temporal resolution.
We propose an event-and-frame-based video frame interpolation method named IDO-VFI that assigns varying amounts of computation to different sub-regions.
Our proposed method maintains high-quality performance while reducing computation time and computational effort by 10% and 17% respectively on the Vimeo90K dataset.
arXiv Detail & Related papers (2023-05-17T13:22:21Z)
- Progressive Motion Context Refine Network for Efficient Video Frame Interpolation [10.369068266836154]
Flow-based frame interpolation methods have achieved great success by first modeling optical flow between target and input frames, and then building a synthesis network for target frame generation.
We propose a novel Progressive Motion Context Refine Network (PMCRNet) to predict motion fields and image context jointly for higher efficiency.
Experiments on multiple benchmarks show that the proposed approach not only achieves favorable qualitative and quantitative results but also significantly reduces model size and running time.
arXiv Detail & Related papers (2022-11-11T06:29:03Z)
- Enhanced Bi-directional Motion Estimation for Video Frame Interpolation [0.05541644538483946]
We present a novel yet effective algorithm for motion-based video frame interpolation.
Our method achieves excellent performance on a broad range of video frame interpolation benchmarks.
arXiv Detail & Related papers (2022-06-17T06:08:43Z)
- Long-term Video Frame Interpolation via Feature Propagation [95.18170372022703]
Video frame interpolation (VFI) methods generally predict intermediate frame(s) by first estimating the motion between inputs and then warping the inputs to the target time with the estimated motion.
This approach is not optimal when the temporal distance between the input frames increases.
We propose a propagation network (PNet) by extending the classic feature-level forecasting with a novel motion-to-feature approach.
arXiv Detail & Related papers (2022-03-29T10:47:06Z)
- FILM: Frame Interpolation for Large Motion [20.04001872133824]
We present a frame interpolation algorithm that synthesizes multiple intermediate frames from two input images with large in-between motion.
Our approach outperforms state-of-the-art methods on the Xiph large motion benchmark.
arXiv Detail & Related papers (2022-02-10T08:48:18Z)
- FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation.
We demonstrate that FLAVR can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
- All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling [52.425236515695914]
State-of-the-art methods are iterative solutions that interpolate one frame at a time.
This work introduces a true multi-frame interpolator.
It utilizes a pyramidal style network in the temporal domain to complete the multi-frame task in one shot.
arXiv Detail & Related papers (2020-07-23T02:34:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.