Video Frame Interpolation with Stereo Event and Intensity Camera
- URL: http://arxiv.org/abs/2307.08228v1
- Date: Mon, 17 Jul 2023 04:02:00 GMT
- Title: Video Frame Interpolation with Stereo Event and Intensity Camera
- Authors: Chao Ding, Mingyuan Lin, Haijian Zhang, Jianzhuang Liu, Lei Yu
- Abstract summary: We propose a novel Stereo Event-based VFI network (SEVFI-Net) to generate high-quality intermediate frames.
We exploit the fused features to accomplish accurate optical flow and disparity estimation.
Our proposed SEVFI-Net outperforms state-of-the-art methods by a large margin.
- Score: 40.07341828127157
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The stereo event-intensity camera setup is widely applied to leverage the
advantages of both event cameras with low latency and intensity cameras that
capture accurate brightness and texture information. However, such a setup
commonly encounters cross-modality parallax that is difficult to eliminate
solely with stereo rectification, especially for real-world scenes with complex
motions and varying depths, introducing artifacts and distortion for existing
Event-based Video Frame Interpolation (E-VFI) approaches. To tackle this
problem, we propose a novel Stereo Event-based VFI (SE-VFI) network (SEVFI-Net)
to generate high-quality intermediate frames and corresponding disparities from
misaligned inputs consisting of two consecutive keyframes and event streams
emitted between them. Specifically, we propose a Feature Aggregation Module
(FAM) to alleviate the parallax and achieve spatial alignment in the feature
domain. We then exploit the fused features to accomplish accurate optical flow
and disparity estimation, achieving better interpolation results through both
flow-based and synthesis-based approaches. We also build a stereo visual acquisition
system composed of an event camera and an RGB-D camera to collect a new Stereo
Event-Intensity Dataset (SEID) containing diverse scenes with complex motions
and varying depths. Experiments on public real-world stereo datasets, i.e.,
DSEC and MVSEC, and our SEID dataset demonstrate that our proposed SEVFI-Net
outperforms state-of-the-art methods by a large margin.
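The abstract's combination of flow-based and synthesis-based interpolation can be illustrated with a per-pixel fusion of two candidate frames. This is only a minimal sketch, not the paper's architecture: `fuse_candidates`, the candidate frames, and the blending mask are all hypothetical stand-ins for what a learned network would produce.

```python
import numpy as np

def fuse_candidates(flow_frame, synth_frame, mask):
    """Blend a flow-based warped frame with a synthesis-based frame.

    mask: per-pixel weights in [0, 1]; 1 fully favors the flow-based candidate.
    """
    mask = np.clip(mask, 0.0, 1.0)[..., None]  # add channel axis for broadcasting
    return mask * flow_frame + (1.0 - mask) * synth_frame

# Toy example: two 4x4 RGB candidates blended with a uniform 0.5 mask.
flow_frame = np.zeros((4, 4, 3))
synth_frame = np.ones((4, 4, 3))
mask = np.full((4, 4), 0.5)
fused = fuse_candidates(flow_frame, synth_frame, mask)
print(fused[0, 0])  # -> [0.5 0.5 0.5]
```

In a real system the mask itself would be predicted by the network, letting it fall back on synthesis where optical flow is unreliable (occlusions, fast motion).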
Related papers
- CoSEC: A Coaxial Stereo Event Camera Dataset for Autonomous Driving [15.611896480837316]
Event cameras with high dynamic range have been applied to assist frame cameras in multimodal fusion.
We propose a coaxial stereo event camera (CoSEC) dataset for autonomous driving.
arXiv Detail & Related papers (2024-08-16T02:55:10Z)
- EventAid: Benchmarking Event-aided Image/Video Enhancement Algorithms with Real-captured Hybrid Dataset [55.12137324648253]
Event cameras are an emerging imaging technology that offers advantages over conventional frame-based imaging sensors in dynamic range and sensing speed.
This paper focuses on five event-aided image and video enhancement tasks.
arXiv Detail & Related papers (2023-12-13T15:42:04Z)
- Learning Parallax for Stereo Event-based Motion Deblurring [8.201943408103995]
Existing approaches rely on the perfect pixel-wise alignment between intensity images and events, which is not always fulfilled in the real world.
We propose a novel coarse-to-fine framework, named NETwork of Event-based motion Deblurring with STereo event and intensity cameras (St-EDNet).
We build a new dataset with STereo Event and Intensity Cameras (StEIC), containing real-world events, intensity images, and dense disparity maps.
arXiv Detail & Related papers (2023-09-18T06:51:41Z)
- Revisiting Event-based Video Frame Interpolation [49.27404719898305]
Dynamic vision sensors or event cameras provide rich complementary information for video frame interpolation.
However, estimating optical flow from events is arguably more difficult than from RGB information.
We propose a divide-and-conquer strategy in which event-based intermediate frame synthesis happens incrementally in multiple simplified stages.
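The divide-and-conquer idea of synthesizing intermediate frames incrementally in simplified stages can be sketched as recursive midpoint interpolation. This is a hypothetical illustration, not the cited paper's method: `midpoint_fn` stands in for a learned, event-guided synthesis model, and scalars stand in for frames.

```python
def interpolate_recursive(frame_a, frame_b, depth, midpoint_fn):
    """Divide-and-conquer interpolation: synthesize the midpoint frame,
    then recurse on each half for `depth` levels.

    Returns the full sequence including both endpoint frames.
    """
    if depth == 0:
        return [frame_a, frame_b]
    mid = midpoint_fn(frame_a, frame_b)
    left = interpolate_recursive(frame_a, mid, depth - 1, midpoint_fn)
    right = interpolate_recursive(mid, frame_b, depth - 1, midpoint_fn)
    return left[:-1] + right  # drop duplicate of the shared midpoint

# Stand-in midpoint: a simple average (a real model would condition on events).
avg = lambda a, b: (a + b) / 2
print(interpolate_recursive(0.0, 1.0, 2, avg))
# -> [0.0, 0.25, 0.5, 0.75, 1.0]
```

Each recursion level only has to solve the simpler problem of interpolating between temporally closer frames, which is the appeal of staging the synthesis.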
arXiv Detail & Related papers (2023-07-24T06:51:07Z)
- Alignment-free HDR Deghosting with Semantics Consistent Transformer [76.91669741684173]
High dynamic range imaging aims to retrieve information from multiple low-dynamic range inputs to generate realistic output.
Existing methods often focus on the spatial misalignment across input frames caused by the foreground and/or camera motion.
We propose a novel alignment-free network with a Semantics Consistent Transformer (SCTNet) with both spatial and channel attention modules.
arXiv Detail & Related papers (2023-05-29T15:03:23Z)
- Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time [101.91824315554682]
In this work, we aim ambitiously for a more realistic and challenging task: joint video multi-frame interpolation and deblurring under unknown exposure time.
We first adopt a variant of supervised contrastive learning to construct an exposure-aware representation from input blurred frames.
We then build our video reconstruction network upon the exposure and motion representation by progressive exposure-adaptive convolution and motion refinement.
arXiv Detail & Related papers (2023-03-27T09:43:42Z)
- Event-Based Frame Interpolation with Ad-hoc Deblurring [68.97825675372354]
We propose a general method for event-based frame interpolation that performs ad-hoc deblurring on input videos.
Our network consistently outperforms state-of-the-art methods on frame interpolation, single-image deblurring, and the joint task of interpolation with deblurring.
Our code and dataset will be made publicly available.
arXiv Detail & Related papers (2023-01-12T18:19:00Z)
- Self-Supervised Intensity-Event Stereo Matching [24.851819610561517]
Event cameras are novel bio-inspired vision sensors that output pixel-level intensity changes with microsecond accuracy.
However, event cameras cannot be directly applied to computational imaging tasks due to the inability to obtain high-quality intensity images and events simultaneously.
This paper aims to connect a standalone event camera and a modern intensity camera so that applications can take advantage of both sensors.
arXiv Detail & Related papers (2022-11-01T14:52:25Z)
- Stereo Hybrid Event-Frame (SHEF) Cameras for 3D Perception [17.585862399941544]
Event cameras address these limitations, as they report brightness changes of each pixel independently with fine temporal resolution.
Integrated hybrid event-frame sensors (e.g., DAVIS) are available, but their data quality is compromised by pixel-level coupling in the circuit fabrication of such cameras.
This paper proposes a stereo hybrid event-frame (SHEF) camera system that offers a sensor modality with separate high-quality pure event and pure frame cameras.
arXiv Detail & Related papers (2021-10-11T04:03:36Z)
- Event-based Stereo Visual Odometry [42.77238738150496]
We present a solution to the problem of visual odometry from the data acquired by a stereo event-based camera rig.
We seek to maximize the temporal consistency of stereo event-based data while using a simple and efficient representation.
arXiv Detail & Related papers (2020-07-30T15:53:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.