A Multi-In-Single-Out Network for Video Frame Interpolation without
Optical Flow
- URL: http://arxiv.org/abs/2311.11602v3
- Date: Tue, 5 Dec 2023 00:07:12 GMT
- Title: A Multi-In-Single-Out Network for Video Frame Interpolation without
Optical Flow
- Authors: Jaemin Lee, Minseok Seo, Sangwoo Lee, Hyobin Park, Dong-Geol Choi
- Abstract summary: deep learning-based video frame (VFI) methods have predominantly focused on estimating motion between two input frames.
We propose a multi-in-single-out (MISO) based VFI method that does not rely on motion vector estimation.
We introduce a novel motion perceptual loss that enables MISO-VFI to better capture the vectors-temporal within the video frames.
- Score: 14.877766449009119
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In general, deep learning-based video frame interpolation (VFI) methods have
predominantly focused on estimating motion vectors between two input frames and
warping them to the target time. While this approach has shown impressive
performance for linear motion between two input frames, it exhibits limitations
when dealing with occlusions and nonlinear movements. Recently, generative
models have been applied to VFI to address these issues. However, as VFI is not
a task focused on generating plausible images, but rather on predicting
accurate intermediate frames between two given frames, performance limitations
still persist. In this paper, we propose a multi-in-single-out (MISO) based VFI
method that does not rely on motion vector estimation, allowing it to
effectively model occlusions and nonlinear motion. Additionally, we introduce a
novel motion perceptual loss that enables MISO-VFI to better capture the
spatio-temporal correlations within the video frames. Our MISO-VFI method
achieves state-of-the-art results on VFI benchmarks Vimeo90K, Middlebury, and
UCF101, with a significant performance gap compared to existing approaches.
Related papers
- Generalizable Implicit Motion Modeling for Video Frame Interpolation [51.966062283735596]
Motion is critical in flow-based Video Frame Interpolation (VFI)
General Implicit Motion Modeling (IMM) is a novel and effective approach to motion modeling VFI.
Our GIMM can be smoothly integrated with existing flow-based VFI works without further modifications.
arXiv Detail & Related papers (2024-07-11T17:13:15Z) - Motion-aware Latent Diffusion Models for Video Frame Interpolation [51.78737270917301]
Motion estimation between neighboring frames plays a crucial role in avoiding motion ambiguity.
We propose a novel diffusion framework, motion-aware latent diffusion models (MADiff)
Our method achieves state-of-the-art performance significantly outperforming existing approaches.
arXiv Detail & Related papers (2024-04-21T05:09:56Z) - Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z) - Boost Video Frame Interpolation via Motion Adaptation [73.42573856943923]
Video frame (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.
Existing learning-based VFI methods have achieved great success, but they still suffer from limited generalization ability.
We propose a novel optimization-based VFI method that can adapt to unseen motions at test time.
arXiv Detail & Related papers (2023-06-24T10:44:02Z) - IDO-VFI: Identifying Dynamics via Optical Flow Guidance for Video Frame
Interpolation with Events [14.098949778274733]
Event cameras are ideal for capturing inter-frame dynamics with their extremely high temporal resolution.
We propose an event-and-frame-based video frame method named IDO-VFI that assigns varying amounts of computation for different sub-regions.
Our proposed method maintains high-quality performance while reducing computation time and computational effort by 10% and 17% respectively on Vimeo90K datasets.
arXiv Detail & Related papers (2023-05-17T13:22:21Z) - Long-term Video Frame Interpolation via Feature Propagation [95.18170372022703]
Video frame (VFI) works generally predict intermediate frame(s) by first estimating the motion between inputs and then warping the inputs to the target time with the estimated motion.
This approach is not optimal when the temporal distance between the input sequence increases.
We propose a propagation network (PNet) by extending the classic feature-level forecasting with a novel motion-to-feature approach.
arXiv Detail & Related papers (2022-03-29T10:47:06Z) - FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video framesupervised.
We demonstrate that FLAVR can serve as a useful self- pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.