Frame-Event Alignment and Fusion Network for High Frame Rate Tracking
- URL: http://arxiv.org/abs/2305.15688v1
- Date: Thu, 25 May 2023 03:34:24 GMT
- Title: Frame-Event Alignment and Fusion Network for High Frame Rate Tracking
- Authors: Jiqing Zhang, Yuanchen Wang, Wenxi Liu, Meng Li, Jinpeng Bai, Baocai
Yin, Xin Yang
- Abstract summary: Most existing RGB-based trackers target low frame rate benchmarks of around 30 frames per second.
We propose an end-to-end network consisting of multi-modality alignment and fusion modules.
With the FE240hz dataset, our approach achieves high frame rate tracking up to 240Hz.
- Score: 37.35823883499189
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing RGB-based trackers target low frame rate benchmarks of around
30 frames per second. This setting restricts the tracker's functionality in the
real world, especially for fast motion. Event-based cameras as bioinspired
sensors provide considerable potential for high frame rate tracking due to
their high temporal resolution. However, event-based cameras cannot offer
fine-grained texture information like conventional cameras. This unique
complementarity motivates us to combine conventional frames and events for high
frame rate object tracking under various challenging conditions. In this paper,
we propose an end-to-end network consisting of multi-modality alignment and
fusion modules to effectively combine meaningful information from both
modalities at different measurement rates. The alignment module is responsible
for cross-style and cross-frame-rate alignment between frame and event
modalities under the guidance of the motion cues furnished by events, while
the fusion module emphasizes valuable features and suppresses noise through
the mutual complementarity of the two modalities.
Extensive experiments show that the proposed approach outperforms
state-of-the-art trackers by a significant margin in high frame rate tracking.
With the FE240hz dataset, our approach achieves high frame rate tracking up to
240Hz.
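The abstract does not spell out how the fusion module is implemented, so the sketch below is only an assumed illustration of one common way such an event-guided fusion step could be built: event features attend over frame features, and a learned gate suppresses noisy responses. It assumes PyTorch; the class and tensor names (FrameEventFusion, frame_feat, event_feat) are hypothetical and do not come from the paper's code.

```python
# Minimal sketch of a cross-modality fusion block (illustrative, not the authors' code).
import torch
import torch.nn as nn

class FrameEventFusion(nn.Module):
    """Fuse frame and event features with cross-attention and a learned gate."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        # Event features act as queries so motion cues from events decide
        # which frame features to emphasize.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, frame_feat: torch.Tensor, event_feat: torch.Tensor) -> torch.Tensor:
        # frame_feat, event_feat: (batch, tokens, dim) flattened feature maps
        attended, _ = self.cross_attn(query=event_feat, key=frame_feat, value=frame_feat)
        attended = self.norm(attended + event_feat)
        # The gate weighs the two modalities per channel, suppressing noisy responses.
        g = self.gate(torch.cat([attended, frame_feat], dim=-1))
        return g * attended + (1.0 - g) * frame_feat

# Usage: fuse 16x16 feature maps from the two modalities.
fusion = FrameEventFusion(dim=256)
frame_feat = torch.randn(1, 16 * 16, 256)
event_feat = torch.randn(1, 16 * 16, 256)
fused = fusion(frame_feat, event_feat)  # shape: (1, 256, 256)
print(fused.shape)
```

Using event features as the attention queries mirrors the abstract's idea of letting motion cues furnished by events guide which frame features are emphasized.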
Related papers
- BlinkTrack: Feature Tracking over 100 FPS via Events and Images [50.98675227695814]
We propose a novel framework, BlinkTrack, which integrates event data with RGB images for high-frequency feature tracking.
Our method extends the traditional Kalman filter into a learning-based framework, utilizing differentiable Kalman filters in both event and image branches.
Experimental results indicate that BlinkTrack significantly outperforms existing event-based methods.
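The BlinkTrack summary above mentions differentiable Kalman filters without further detail; purely as an assumed illustration, the snippet below shows a single differentiable Kalman predict-and-update step in PyTorch. Because every operation stays in the autograd graph, noise covariances such as R could be predicted by a network and trained end to end. The function name kalman_step and all tensor names are hypothetical.

```python
# Illustrative differentiable Kalman filter step (not BlinkTrack's code).
# State [x, y, vx, vy] follows a constant-velocity model and is corrected
# with a measured 2D feature location.
import torch

def kalman_step(x, P, z, F, H, Q, R):
    """One predict + update step; all arguments are torch tensors."""
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update
    y = z - H @ x_pred                      # innovation
    S = H @ P_pred @ H.T + R                # innovation covariance
    K = P_pred @ H.T @ torch.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ y
    P_new = (torch.eye(x.shape[0]) - K @ H) @ P_pred
    return x_new, P_new

# Constant-velocity transition and 2D position measurement model (dt = 1 frame).
F = torch.tensor([[1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.],
                  [0., 0., 0., 1.]])
H = torch.tensor([[1., 0., 0., 0.],
                  [0., 1., 0., 0.]])
Q = 1e-3 * torch.eye(4)
R = 1e-2 * torch.eye(2)                     # could be produced by a network
x, P = torch.zeros(4), torch.eye(4)
z = torch.tensor([0.5, 0.2])                # measurement from the event or image branch
x, P = kalman_step(x, P, z, F, H, Q, R)
print(x)
```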
arXiv Detail & Related papers (2024-09-26T15:54:18Z)
- Tracking Any Point with Frame-Event Fusion Network at High Frame Rate [16.749590397918574]
We propose an image-event fusion point tracker, FE-TAP.
It combines the contextual information from image frames with the high temporal resolution of events.
FE-TAP achieves high frame rate and robust point tracking under various challenging conditions.
arXiv Detail & Related papers (2024-09-18T13:07:19Z)
- CMTA: Cross-Modal Temporal Alignment for Event-guided Video Deblurring [44.30048301161034]
Video deblurring aims to enhance the quality of restored results in motion-blurred videos by gathering information from adjacent video frames.
We propose two modules: 1) Intra-frame feature enhancement operates within the exposure time of a single blurred frame, and 2) Inter-frame temporal feature alignment gathers valuable long-range temporal information to target frames.
We demonstrate that our proposed methods outperform state-of-the-art frame-based and event-based motion deblurring methods through extensive experiments conducted on both synthetic and real-world deblurring datasets.
arXiv Detail & Related papers (2024-08-27T10:09:17Z)
- CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras [43.699819213559515]
Existing datasets for RGB-DVS tracking are collected with the DVS346 camera, whose resolution ($346 \times 260$) is low for practical applications.
We build the first unaligned frame-event dataset CRSOT collected with a specially built data acquisition system.
We propose a novel unaligned object tracking framework that can realize robust tracking even using the loosely aligned RGB-Event data.
arXiv Detail & Related papers (2024-01-05T14:20:22Z)
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames [6.62974666987451]
This paper makes the first attempt to tackle the challenging task of recovering arbitrary frame rate latent global shutter (GS) frames from two consecutive rolling shutter (RS) frames.
We propose a novel self-supervised framework that leverages events to guide RS frame correction and video frame interpolation (VFI) in a unified framework.
arXiv Detail & Related papers (2023-06-27T14:30:25Z)
- Towards Frame Rate Agnostic Multi-Object Tracking [76.82407173177138]
We propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time.
Specifically, we propose a Frame Rate Agnostic Association Module (FAAM) that infers and encodes the frame rate information.
FAPS reflects all post-processing steps in training via tracking pattern matching and fusion.
arXiv Detail & Related papers (2022-09-23T04:25:19Z)
- VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows [93.54888104118822]
We propose a large-scale Visible-Event benchmark (termed VisEvent) due to the lack of a realistic and scaled dataset for this task.
Our dataset consists of 820 video pairs captured under low illumination, high speed, and background clutter scenarios.
Based on VisEvent, we transform the event flows into event images and construct more than 30 baseline methods.
arXiv Detail & Related papers (2021-08-11T03:55:12Z)
- TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches.
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.