Event-Based Video Frame Interpolation With Cross-Modal Asymmetric Bidirectional Motion Fields
- URL: http://arxiv.org/abs/2502.13716v1
- Date: Wed, 19 Feb 2025 13:40:43 GMT
- Title: Event-Based Video Frame Interpolation With Cross-Modal Asymmetric Bidirectional Motion Fields
- Authors: Taewoo Kim, Yujeong Chae, Hyun-Kurl Jang, Kuk-Jin Yoon,
- Abstract summary: Video Frame Interpolation (VFI) aims to generate intermediate video frames between consecutive input frames.
We propose a novel event-based VFI framework with cross-modal asymmetric bidirectional motion field estimation.
Our method shows significant performance improvement over the state-of-the-art VFI methods on various datasets.
- Score: 39.214857326425204
- License:
- Abstract: Video Frame Interpolation (VFI) aims to generate intermediate video frames between consecutive input frames. Since the event cameras are bio-inspired sensors that only encode brightness changes with a micro-second temporal resolution, several works utilized the event camera to enhance the performance of VFI. However, existing methods estimate bidirectional inter-frame motion fields with only events or approximations, which can not consider the complex motion in real-world scenarios. In this paper, we propose a novel event-based VFI framework with cross-modal asymmetric bidirectional motion field estimation. In detail, our EIF-BiOFNet utilizes each valuable characteristic of the events and images for direct estimation of inter-frame motion fields without any approximation methods. Moreover, we develop an interactive attention-based frame synthesis network to efficiently leverage the complementary warping-based and synthesis-based features. Finally, we build a large-scale event-based VFI dataset, ERF-X170FPS, with a high frame rate, extreme motion, and dynamic textures to overcome the limitations of previous event-based VFI datasets. Extensive experimental results validate that our method shows significant performance improvement over the state-of-the-art VFI methods on various datasets. Our project pages are available at: https://github.com/intelpro/CBMNet
Related papers
- Event-based Video Frame Interpolation with Edge Guided Motion Refinement [28.331148083668857]
We introduce an end-to-end E-VFI learning method to efficiently utilize edge features from event signals for motion flow and warping enhancement.
Our method incorporates an Edge Guided Attentive (EGA) module, which rectifies estimated video motion through attentive aggregation.
Experiments on both synthetic and real datasets show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-04-28T12:13:34Z) - Motion-aware Latent Diffusion Models for Video Frame Interpolation [51.78737270917301]
Motion estimation between neighboring frames plays a crucial role in avoiding motion ambiguity.
We propose a novel diffusion framework, motion-aware latent diffusion models (MADiff)
Our method achieves state-of-the-art performance significantly outperforming existing approaches.
arXiv Detail & Related papers (2024-04-21T05:09:56Z) - Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z) - A Multi-In-Single-Out Network for Video Frame Interpolation without
Optical Flow [14.877766449009119]
deep learning-based video frame (VFI) methods have predominantly focused on estimating motion between two input frames.
We propose a multi-in-single-out (MISO) based VFI method that does not rely on motion vector estimation.
We introduce a novel motion perceptual loss that enables MISO-VFI to better capture the vectors-temporal within the video frames.
arXiv Detail & Related papers (2023-11-20T08:29:55Z) - Boost Video Frame Interpolation via Motion Adaptation [73.42573856943923]
Video frame (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.
Existing learning-based VFI methods have achieved great success, but they still suffer from limited generalization ability.
We propose a novel optimization-based VFI method that can adapt to unseen motions at test time.
arXiv Detail & Related papers (2023-06-24T10:44:02Z) - IDO-VFI: Identifying Dynamics via Optical Flow Guidance for Video Frame
Interpolation with Events [14.098949778274733]
Event cameras are ideal for capturing inter-frame dynamics with their extremely high temporal resolution.
We propose an event-and-frame-based video frame method named IDO-VFI that assigns varying amounts of computation for different sub-regions.
Our proposed method maintains high-quality performance while reducing computation time and computational effort by 10% and 17% respectively on Vimeo90K datasets.
arXiv Detail & Related papers (2023-05-17T13:22:21Z) - A Unified Framework for Event-based Frame Interpolation with Ad-hoc Deblurring in the Wild [72.0226493284814]
We propose a unified framework for event-based frame that performs deblurring ad-hoc.
Our network consistently outperforms previous state-of-the-art methods on frame, single image deblurring, and the joint task of both.
arXiv Detail & Related papers (2023-01-12T18:19:00Z) - E-VFIA : Event-Based Video Frame Interpolation with Attention [8.93294761619288]
We propose an event-based video frame with attention (E-VFIA) as a lightweight kernel-based method.
E-VFIA fuses event information with standard video frames by deformable convolutions to generate high quality interpolated frames.
The proposed method represents events with high temporal resolution and uses a multi-head self-attention mechanism to better encode event-based information.
arXiv Detail & Related papers (2022-09-19T21:40:32Z) - Video Frame Interpolation with Transformer [55.12620857638253]
We introduce a novel framework, which takes advantage of Transformer to model long-range pixel correlation among video frames.
Our network is equipped with a novel cross-scale window-based attention mechanism, where cross-scale windows interact with each other.
arXiv Detail & Related papers (2022-05-15T09:30:28Z) - FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video framesupervised.
We demonstrate that FLAVR can serve as a useful self- pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.