A Unified Framework for Event-based Frame Interpolation with Ad-hoc Deblurring in the Wild
- URL: http://arxiv.org/abs/2301.05191v2
- Date: Tue, 18 Feb 2025 17:08:41 GMT
- Title: A Unified Framework for Event-based Frame Interpolation with Ad-hoc Deblurring in the Wild
- Authors: Lei Sun, Daniel Gehrig, Christos Sakaridis, Mathias Gehrig, Jingyun Liang, Peng Sun, Zhijie Xu, Kaiwei Wang, Luc Van Gool, Davide Scaramuzza,
- Abstract summary: We propose a unified framework for event-based frame interpolation that performs deblurring ad-hoc.
Our network consistently outperforms previous state-of-the-art methods on frame interpolation, single image deblurring, and the joint task of both.
- Score: 72.0226493284814
- License:
- Abstract: Effective video frame interpolation hinges on the adept handling of motion in the input scene. Prior work acknowledges asynchronous event information for this, but often overlooks whether motion induces blur in the video, limiting its scope to sharp frame interpolation. We instead propose a unified framework for event-based frame interpolation that performs deblurring ad-hoc and thus works both on sharp and blurry input videos. Our model consists of a bidirectional recurrent network that incorporates the temporal dimension of interpolation and fuses information from the input frames and the events adaptively based on their temporal proximity. To enhance generalization from synthetic data to real event cameras, we integrate a self-supervised framework with the proposed model, improving performance on real-world datasets in the wild. At the dataset level, we introduce a novel real-world high-resolution dataset with events and color videos named HighREV, which provides a challenging evaluation setting for the examined task. Extensive experiments show that our network consistently outperforms previous state-of-the-art methods on frame interpolation, single image deblurring, and the joint task of both. Experiments on domain transfer reveal that self-supervised training effectively mitigates the performance degradation observed when transitioning from synthetic data to real-world data. Code and datasets are available at https://github.com/AHupuJR/REFID.
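The abstract's adaptive, proximity-based fusion of frame and event features can be pictured with a small gating module. The following is a minimal illustrative sketch only, not the released REFID code; the module name EventFrameFusion, the scalar proximity signal, and all hyperparameters are assumptions.

```python
# Minimal sketch (assumptions, not the REFID implementation): fuse frame and event
# features with a gate conditioned on how close the target time is to a key frame.
import torch
import torch.nn as nn

class EventFrameFusion(nn.Module):  # hypothetical name
    def __init__(self, channels: int):
        super().__init__()
        # The gate sees both feature maps plus a scalar temporal-proximity map.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels + 1, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, frame_feat, event_feat, t: float, t_key: float):
        # proximity -> 1 when the target timestamp t coincides with the key frame
        # at t_key, -> 0 when it is far away (timestamps assumed normalized to [0, 1]).
        proximity = 1.0 - abs(t - t_key)
        b, _, h, w = frame_feat.shape
        p_map = frame_feat.new_full((b, 1, h, w), proximity)
        g = self.gate(torch.cat([frame_feat, event_feat, p_map], dim=1))
        # Near a key frame, lean on frame features; far from it, lean on events.
        return g * frame_feat + (1.0 - g) * event_feat

# Example usage with random features:
# fuse = EventFrameFusion(channels=64)
# out = fuse(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32), t=0.4, t_key=0.0)
```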
Related papers
- Event-Based Video Frame Interpolation With Cross-Modal Asymmetric Bidirectional Motion Fields [39.214857326425204]
Video Frame Interpolation (VFI) aims to generate intermediate video frames between consecutive input frames.
We propose a novel event-based VFI framework with cross-modal asymmetric bidirectional motion field estimation.
Our method shows significant performance improvement over the state-of-the-art VFI methods on various datasets.
arXiv Detail & Related papers (2025-02-19T13:40:43Z) - Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation [20.689304579898728]
Event-based Video Frame Interpolation (EVFI) uses sparse, high-temporal-resolution event measurements as motion guidance.
We adapt pre-trained video diffusion models trained on internet-scale datasets to EVFI.
Our method outperforms existing approaches and generalizes across cameras far better.
arXiv Detail & Related papers (2024-12-10T18:55:30Z) - CMTA: Cross-Modal Temporal Alignment for Event-guided Video Deblurring [44.30048301161034]
Video deblurring aims to enhance the quality of restored results in motion-blurred videos by gathering information from adjacent video frames.
We propose two modules: 1) intra-frame feature enhancement, which operates within the exposure time of a single blurred frame, and 2) inter-frame temporal feature alignment, which gathers valuable long-range temporal information for target frames.
We demonstrate that our proposed methods outperform state-of-the-art frame-based and event-based motion deblurring methods through extensive experiments conducted on both synthetic and real-world deblurring datasets.
arXiv Detail & Related papers (2024-08-27T10:09:17Z) - Event-based Video Frame Interpolation with Edge Guided Motion Refinement [28.331148083668857]
We introduce an end-to-end E-VFI learning method to efficiently utilize edge features from event signals for motion flow and warping enhancement.
Our method incorporates an Edge Guided Attentive (EGA) module, which rectifies estimated video motion through attentive aggregation.
Experiments on both synthetic and real datasets show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-04-28T12:13:34Z) - Revisiting Event-based Video Frame Interpolation [49.27404719898305]
Dynamic vision sensors or event cameras provide rich complementary information for video frame interpolation.
However, estimating optical flow from events is arguably more difficult than from RGB information.
We propose a divide-and-conquer strategy in which event-based intermediate frame synthesis happens incrementally in multiple simplified stages.
arXiv Detail & Related papers (2023-07-24T06:51:07Z) - Blur Interpolation Transformer for Real-World Motion from Blur [52.10523711510876]
We propose a blur interpolation transformer (BiT) to unravel the underlying temporal correlation encoded in blur.
Based on multi-scale residual Swin transformer blocks, we introduce dual-end temporal supervision and temporally symmetric ensembling strategies.
In addition, we design a hybrid camera system to collect the first real-world dataset of one-to-many blur-sharp video pairs.
arXiv Detail & Related papers (2022-11-21T13:10:10Z) - Exploring Intra- and Inter-Video Relation for Surgical Semantic Scene Segmentation [58.74791043631219]
We propose a novel framework STswinCL that explores the complementary intra- and inter-video relations to boost segmentation performance.
We extensively validate our approach on two public surgical video benchmarks, including EndoVis18 Challenge and CaDIS dataset.
Experimental results demonstrate the promising performance of our method, which consistently exceeds previous state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-29T05:52:23Z) - Unifying Motion Deblurring and Frame Interpolation with Events [11.173687810873433]
Slow shutter speed and long exposure time of frame-based cameras often cause visual blur and loss of inter-frame information, degrading the overall quality of captured videos.
We present a unified framework of event-based motion deblurring and frame interpolation for blurry video enhancement, where the extremely low latency of events is leveraged to alleviate motion blur and facilitate intermediate frame prediction.
By exploring the mutual constraints among blurry frames, latent images, and event streams, we further propose a self-supervised learning framework to enable network training with real-world blurry videos and events.
arXiv Detail & Related papers (2022-03-23T03:43:12Z) - Wide and Narrow: Video Prediction from Context and Motion [54.21624227408727]
We propose a new framework to integrate these complementary attributes to predict complex pixel dynamics through deep networks.
We present global context propagation networks that aggregate the non-local neighboring representations to preserve the contextual information over the past frames.
We also devise local filter memory networks that generate adaptive filter kernels by storing the motion of moving objects in the memory.
arXiv Detail & Related papers (2021-10-22T04:35:58Z) - TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches (a rough illustrative sketch of this combination follows after this list).
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z)
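To give a concrete picture of the synthesis-plus-flow combination mentioned in the Time Lens summary above, here is a rough, hypothetical sketch that blends a synthesis-based candidate frame with a flow-warped candidate frame via a learned per-pixel mask. It is not the Time Lens implementation; the module name and architecture are assumptions.

```python
# Hypothetical sketch: soft per-pixel blending of a synthesis-based candidate and a
# flow-warped candidate, in the spirit of combining the two approaches (not Time Lens code).
import torch
import torch.nn as nn

class BlendCandidates(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        # Small CNN that predicts a blending mask from the two RGB candidates.
        self.mask_net = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=3, padding=1),
        )

    def forward(self, synth_frame, warped_frame):
        # synth_frame:  (B, 3, H, W) candidate from event-guided synthesis
        # warped_frame: (B, 3, H, W) candidate from optical-flow warping of input frames
        mask = torch.sigmoid(self.mask_net(torch.cat([synth_frame, warped_frame], dim=1)))
        # Per-pixel soft selection between the two candidates.
        return mask * synth_frame + (1.0 - mask) * warped_frame
```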
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.