Event-to-Video Conversion for Overhead Object Detection
- URL: http://arxiv.org/abs/2402.06805v1
- Date: Fri, 9 Feb 2024 22:07:39 GMT
- Title: Event-to-Video Conversion for Overhead Object Detection
- Authors: Darryl Hannan, Ragib Arnab, Gavin Parpart, Garrett T. Kenyon, Edward Kim, and Yijing Watkins
- Abstract summary: Event cameras complicate downstream image processing, especially for complex tasks such as object detection.
We show that there is a significant gap in performance between dense event representations and corresponding RGB frames.
We apply event-to-video conversion models that convert event streams into gray-scale video to close this gap.
- Score: 7.744259147081667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collecting overhead imagery using an event camera is desirable due to the
energy efficiency of the image sensor compared to standard cameras. However,
event cameras complicate downstream image processing, especially for complex
tasks such as object detection. In this paper, we investigate the viability of
event streams for overhead object detection. We demonstrate that across a
number of standard modeling approaches, there is a significant gap in
performance between dense event representations and corresponding RGB frames.
We establish that this gap is, in part, due to a lack of overlap between the
event representations and the pre-training data used to initialize the weights
of the object detectors. Then, we apply event-to-video conversion models that
convert event streams into gray-scale video to close this gap. We demonstrate
that this approach results in a large performance increase, outperforming even
event-specific object detection techniques on our overhead target task. These
results suggest that better alignment between event representations and
existing large pre-trained models may result in greater short-term performance
gains compared to end-to-end event-specific architectural improvements.
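Although the paper does not include code, the pipeline it describes is straightforward to prototype. Below is a minimal sketch, assuming events arrive as (x, y, t, polarity) tuples; the `reconstruct_frame` function is a hypothetical stand-in for a pretrained event-to-video model (e.g., an E2VID-style network), and detection uses a COCO-pretrained Faster R-CNN from torchvision. The exact models and preprocessing in the paper differ.

```python
# Minimal sketch of the event-to-video detection pipeline described above.
# Assumptions (not from the paper): (N, 4) event tensors of (x, y, t, polarity),
# a stand-in reconstruction model, and a COCO-pretrained torchvision detector.
import torch
import torchvision

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate an (N, 4) event tensor into a num_bins x H x W voxel grid,
    a common dense input representation for event-to-video models."""
    voxels = torch.zeros(num_bins, height, width)
    t0, t1 = events[:, 2].min(), events[:, 2].max()
    # Normalize timestamps into integer temporal bins [0, num_bins - 1].
    bins = ((events[:, 2] - t0) / (t1 - t0 + 1e-9) * (num_bins - 1)).long()
    x, y = events[:, 0].long(), events[:, 1].long()
    signs = events[:, 3] * 2.0 - 1.0  # map polarity {0, 1} -> {-1, +1}
    voxels.index_put_((bins, y, x), signs, accumulate=True)
    return voxels

def reconstruct_frame(voxels):
    """Hypothetical stand-in for an event-to-video model such as E2VID: here it
    just averages the temporal bins into one gray-scale frame in [0, 1]."""
    return voxels.mean(dim=0, keepdim=True).clamp(0.0, 1.0)

detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
detector.eval()

# Toy event stream: 10k random events on a 640 x 480 sensor.
events = torch.rand(10_000, 4)
events[:, 0] *= 639  # x
events[:, 1] *= 479  # y
events[:, 3] = events[:, 3].round()  # binarize toy polarity to {0, 1}
frame = reconstruct_frame(events_to_voxel_grid(events, num_bins=5, height=480, width=640))
with torch.no_grad():
    # Replicate the gray-scale frame to 3 channels for the RGB-pretrained detector.
    detections = detector([frame.repeat(3, 1, 1)])
print(detections[0]["boxes"].shape, detections[0]["scores"].shape)
```

The point of the sketch is the division of labor the paper argues for: the event-specific work is confined to the reconstruction step, while detection reuses a standard model pretrained on RGB imagery.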
Related papers
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- EventTransAct: A video transformer-based framework for Event-camera based action recognition [52.537021302246664]
For action recognition, event cameras offer new opportunities compared to standard RGB videos.
In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame.
In order to better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
arXiv Detail & Related papers (2023-08-25T23:51:07Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness changes at every pixel in an asynchronous manner.
Event streams are divided into grids along the x-y-t coordinates, separately for positive and negative polarity, producing a set of pillars as a 3D tensor representation (a voxelization sketch follows this entry).
Long memory is encoded in the hidden state of adaptive convLSTMs, while short memory is modeled by computing the spatial-temporal correlation between event pillars.
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
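As a rough illustration of the pillar representation described in this entry (a sketch, not the authors' code; the function name and shapes below are made up for the example), events can be binned into an x-y-t grid separately for each polarity:

```python
# Rough sketch (not the authors' code) of binning an event stream into
# x-y-t grids per polarity, yielding a 2 x T x H x W tensor of "pillars".
import numpy as np

def events_to_pillars(x, y, t, p, num_bins, height, width):
    """x, y: pixel coordinates; t: timestamps; p: polarity in {0, 1}."""
    grid = np.zeros((2, num_bins, height, width), dtype=np.float32)
    # Normalize timestamps into integer temporal bins [0, num_bins - 1].
    bins = ((t - t.min()) / (t.max() - t.min() + 1e-9) * (num_bins - 1)).astype(int)
    # Count events per (polarity, time bin, y, x) cell.
    np.add.at(grid, (p.astype(int), bins, y, x), 1.0)
    return grid

# Toy usage: 1000 random events on a 128 x 128 sensor, 8 temporal bins.
rng = np.random.default_rng(0)
pillars = events_to_pillars(
    x=rng.integers(0, 128, 1000), y=rng.integers(0, 128, 1000),
    t=rng.random(1000), p=rng.integers(0, 2, 1000),
    num_bins=8, height=128, width=128,
)
print(pillars.shape)  # (2, 8, 128, 128)
```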
- Motion Robust High-Speed Light-Weighted Object Detection With Event Camera [24.192961837270172]
We propose a motion robust and high-speed detection pipeline which better leverages the event data.
Experiments on two typical real-scene event camera object detection datasets show that our method is competitive in terms of accuracy, efficiency, and the number of parameters.
arXiv Detail & Related papers (2022-08-24T15:15:24Z)
- Moving Object Detection for Event-based vision using Graph Spectral Clustering [6.354824287948164]
Moving object detection has been a central topic of discussion in computer vision for its wide range of applications.
We present an unsupervised Graph Spectral Clustering technique for Moving Object Detection in Event-based data.
We additionally show how the optimum number of moving objects can be automatically determined.
arXiv Detail & Related papers (2021-09-30T10:19:22Z)
- Bridging the Gap between Events and Frames through Unsupervised Domain Adaptation [57.22705137545853]
We propose a task transfer method that allows models to be trained directly with labeled images and unlabeled event data.
We leverage the generative event model to split event features into content and motion features.
Our approach unlocks the vast amount of existing image datasets for the training of event-based neural networks.
arXiv Detail & Related papers (2021-09-06T17:31:37Z)
- VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows [93.54888104118822]
We propose a large-scale Visible-Event benchmark (termed VisEvent) due to the lack of a realistic and scaled dataset for this task.
Our dataset consists of 820 video pairs captured under low illumination, high speed, and background clutter scenarios.
Based on VisEvent, we transform the event flows into event images and construct more than 30 baseline methods.
arXiv Detail & Related papers (2021-08-11T03:55:12Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras produce brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
- Learning to Detect Objects with a 1 Megapixel Event Camera [14.949946376335305]
Event cameras encode visual information with high temporal precision, low data-rate, and high-dynamic range.
Due to the novelty of the field, the performance of event-based systems on many vision tasks is still lower compared to conventional frame-based solutions.
arXiv Detail & Related papers (2020-09-28T16:03:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.