SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection
- URL: http://arxiv.org/abs/2504.00139v1
- Date: Mon, 31 Mar 2025 18:46:02 GMT
- Title: SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection
- Authors: Yannick Burkhardt, Simon Schaefer, Stefan Leutenegger
- Abstract summary: SuperEvent is a data-driven approach to predict stable keypoints with expressive descriptors. We demonstrate the usefulness of SuperEvent by its integration into a modern sparse keypoint and descriptor-based SLAM framework.
- Score: 13.42760841894735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event-based keypoint detection and matching holds significant potential, enabling the integration of event sensors into highly optimized Visual SLAM systems developed for frame cameras over decades of research. Unfortunately, existing approaches struggle with the motion-dependent appearance of keypoints and the complex noise prevalent in event streams, resulting in severely limited feature matching capabilities and poor performance on downstream tasks. To mitigate this problem, we propose SuperEvent, a data-driven approach to predict stable keypoints with expressive descriptors. Due to the absence of event datasets with ground truth keypoint labels, we leverage existing frame-based keypoint detectors on readily available event-aligned and synchronized gray-scale frames for self-supervision: we generate temporally sparse keypoint pseudo-labels considering that events are a product of both scene appearance and camera motion. Combined with our novel, information-rich event representation, we enable SuperEvent to effectively learn robust keypoint detection and description in event streams. Finally, we demonstrate the usefulness of SuperEvent by its integration into a modern sparse keypoint and descriptor-based SLAM framework originally developed for traditional cameras, surpassing the state-of-the-art in event-based SLAM by a wide margin. Source code and multimedia material are available at smartroboticslab.github.io/SuperEvent.
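The abstract mentions a novel, information-rich event representation as one key ingredient. As a minimal illustration of what such a representation can look like (not SuperEvent's actual design), the sketch below stacks per-polarity event counts with per-polarity normalized most-recent timestamps; events are assumed to be `(x, y, t, polarity)` tuples and the function name is hypothetical:

```python
import numpy as np

def event_stack(events, height, width):
    """Illustrative multi-channel event representation:
    channels 0-1 hold per-polarity event counts (neg, pos),
    channels 2-3 hold per-polarity normalized latest timestamps."""
    counts = np.zeros((2, height, width), dtype=np.float32)
    latest = np.zeros((2, height, width), dtype=np.float32)
    t0 = min(e[2] for e in events)
    t1 = max(e[2] for e in events)
    span = max(t1 - t0, 1e-9)  # avoid division by zero
    for x, y, t, p in events:
        c = 1 if p > 0 else 0  # polarity channel index
        counts[c, y, x] += 1.0
        latest[c, y, x] = (t - t0) / span
    return np.concatenate([counts, latest], axis=0)  # shape (4, H, W)
```

A dense tensor like this can be fed to a standard convolutional keypoint network, which is what makes reusing frame-based architectures on event data attractive.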
Related papers
- SuperEIO: Self-Supervised Event Feature Learning for Event Inertial Odometry [6.552812892993662]
Event cameras asynchronously output low-latency event streams, promising for state estimation in high-speed motion and challenging lighting conditions. We propose SuperEIO, a novel framework that leverages learning-based event-only detection and IMU measurements to achieve event-inertial odometry. We evaluate our method extensively on multiple public datasets, demonstrating its superior accuracy and robustness compared to other state-of-the-art event-based methods.
arXiv Detail & Related papers (2025-03-29T03:58:15Z)
- EventCrab: Harnessing Frame and Point Synergy for Event-based Action Recognition and Beyond [61.10181853363728]
Event-based Action Recognition (EAR) possesses the advantages of high temporal resolution and privacy preservation compared with traditional action recognition. We present EventCrab, a framework that adeptly integrates "lighter" frame-specific networks for dense event frames with "heavier" point-specific networks for sparse event points. Experiments on four datasets demonstrate the significant performance of our proposed EventCrab.
arXiv Detail & Related papers (2024-11-27T13:28:57Z)
- Dynamic Subframe Splitting and Spatio-Temporal Motion Entangled Sparse Attention for RGB-E Tracking [32.86991031493605]
Event-based bionic cameras capture dynamic scenes with high temporal resolution and high dynamic range.
We propose a dynamic event subframe splitting strategy to split the event stream into more fine-grained event clusters.
Based on this, we design an event-based sparse attention mechanism to enhance the interaction of event features in temporal and spatial dimensions.
arXiv Detail & Related papers (2024-09-26T06:12:08Z)
- Event-to-Video Conversion for Overhead Object Detection [7.744259147081667]
Event cameras complicate downstream image processing, especially for complex tasks such as object detection.
We show that there is a significant gap in performance between dense event representations and corresponding RGB frames.
We apply event-to-video conversion models that convert event streams into gray-scale video to close this gap.
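The core idea of event-to-video conversion can be illustrated with a naive baseline: since each event signals a fixed log-brightness change, integrating polarities over time yields approximate gray-scale frames. The sketch below is a deliberately simple stand-in for the learned conversion models the paper actually uses; events are assumed to be `(x, y, t, polarity)` tuples and the function name is hypothetical:

```python
import numpy as np

def events_to_frames(events, height, width, n_frames, contrast=0.2):
    """Naively integrate event polarities in the log domain and
    exponentiate to produce gray-scale frames. Illustrative only;
    learned conversion networks are far more capable."""
    events = sorted(events, key=lambda e: e[2])  # sort by timestamp
    t0, t1 = events[0][2], events[-1][2]
    edges = np.linspace(t0, t1, n_frames + 1)  # frame time boundaries
    log_img = np.zeros((height, width), dtype=np.float32)
    frames, idx = [], 0
    for k in range(n_frames):
        # consume all events up to the end of frame k
        while idx < len(events) and events[idx][2] <= edges[k + 1]:
            x, y, t, p = events[idx]
            log_img[y, x] += contrast * (1.0 if p > 0 else -1.0)
            idx += 1
        frames.append(np.exp(log_img))  # log intensity -> linear
    return np.stack(frames)  # shape (n_frames, H, W)
```

The appeal of this direction is that any off-the-shelf frame-based detector can then run unchanged on the reconstructed video.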
arXiv Detail & Related papers (2024-02-09T22:07:39Z)
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- Event-based Simultaneous Localization and Mapping: A Comprehensive Survey [52.73728442921428]
A review of event-based vSLAM algorithms that exploit the benefits of asynchronous and irregular event streams for localization and mapping tasks. The paper categorizes event-based vSLAM methods into four main categories: feature-based, direct, motion-compensation, and deep learning methods.
arXiv Detail & Related papers (2023-04-19T16:21:14Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
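The pillar representation described above can be sketched concretely: events are binned into an x-y-t grid per polarity, yielding a dense tensor. This is an illustrative construction under the assumption of `(x, y, t, polarity)` event tuples, not the paper's exact pipeline, and the function name is hypothetical:

```python
import numpy as np

def event_pillars(events, height, width, t_bins):
    """Bin events into an x-y-t grid for each polarity, producing a
    (2, t_bins, H, W) tensor of event-count 'pillars'."""
    vox = np.zeros((2, t_bins, height, width), dtype=np.float32)
    t0 = min(e[2] for e in events)
    t1 = max(e[2] for e in events)
    span = max(t1 - t0, 1e-9)  # avoid division by zero
    for x, y, t, p in events:
        b = min(int((t - t0) / span * t_bins), t_bins - 1)  # time bin
        vox[1 if p > 0 else 0, b, y, x] += 1.0
    return vox
```

A recurrent module (such as the convLSTM mentioned above) can then consume these tensors sequentially to accumulate long-term memory across time windows.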
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- Avoiding Post-Processing with Event-Based Detection in Biomedical Signals [69.34035527763916]
We propose an event-based modeling framework that directly works with events as learning targets.
We show that event-based modeling (without post-processing) performs on par with or better than epoch-based modeling with extensive post-processing.
arXiv Detail & Related papers (2022-09-22T13:44:13Z)
- Event Transformer [43.193463048148374]
An event camera's low power consumption and ability to capture brightness changes at microsecond resolution make it attractive for various computer vision tasks. Existing event representation methods typically convert events into frames, voxel grids, or spikes for deep neural networks (DNNs).
This work introduces a novel token-based event representation, where each event is considered a fundamental processing unit termed an event-token.
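The token-based idea can be illustrated simply: each raw event becomes one token, i.e. a small feature vector that a transformer-style network could embed and attend over, with no intermediate frame or voxel grid. The sketch below assumes `(x, y, t, polarity)` event tuples and a hypothetical function name; the paper's actual tokenization and embedding differ:

```python
import numpy as np

def events_to_tokens(events, height, width):
    """Turn each raw event into a 4-D token: normalized x, y, t
    plus polarity. Returns an (N, 4) array, one token per event."""
    t0 = min(e[2] for e in events)
    t1 = max(e[2] for e in events)
    span = max(t1 - t0, 1e-9)  # avoid division by zero
    return np.array(
        [[x / (width - 1), y / (height - 1), (t - t0) / span, float(p)]
         for x, y, t, p in events],
        dtype=np.float32,
    )
```

Because the token count scales with the number of events rather than the sensor resolution, sparse scenes stay cheap to process, which is the main draw of this representation.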
arXiv Detail & Related papers (2022-04-11T15:05:06Z)
- Learning Constraints and Descriptive Segmentation for Subevent Detection [74.48201657623218]
We propose an approach to learning and enforcing constraints that capture dependencies between subevent detection and EventSeg prediction.
We adopt Rectifier Networks for constraint learning and then convert the learned constraints to a regularization term in the loss function of the neural model.
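The conversion of learned constraints into a loss regularizer can be sketched in one step: constraint violations (non-negative scores output by the learned constraint model) are averaged and added to the task loss with a weighting factor. This is a generic hedged sketch of the pattern, not the paper's Rectifier-Network formulation, and the names are hypothetical:

```python
import numpy as np

def constrained_loss(task_loss, constraint_scores, weight=0.1):
    """Total loss = task loss + weight * mean constraint violation.
    Scores > 0 are treated as violations (ReLU clips satisfied
    constraints, whose scores are <= 0, to zero)."""
    violation = np.maximum(constraint_scores, 0.0).mean()
    return task_loss + weight * violation
```

During training the regularizer pushes the model toward predictions that jointly satisfy the subevent-detection and segmentation constraints, without requiring hard constraint enforcement at inference time.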
arXiv Detail & Related papers (2021-09-13T20:50:37Z)
- VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows [93.54888104118822]
We propose a large-scale Visible-Event benchmark (termed VisEvent) due to the lack of a realistic and scaled dataset for this task.
Our dataset consists of 820 video pairs captured under low illumination, high speed, and background clutter scenarios.
Based on VisEvent, we transform the event flows into event images and construct more than 30 baseline methods.
arXiv Detail & Related papers (2021-08-11T03:55:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.