Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset
- URL: http://arxiv.org/abs/2412.06647v1
- Date: Mon, 09 Dec 2024 16:40:34 GMT
- Title: Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset
- Authors: Xiao Wang, Yu Jin, Wentao Wu, Wei Zhang, Lin Zhu, Bo Jiang, Yonghong Tian
- Abstract summary: This paper introduces a novel MoE (Mixture of Experts) heat conduction-based object detection algorithm. We also introduce EvDET200K, a novel benchmark dataset for event-based object detection.
- Score: 33.760645122753786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection in event streams has emerged as a cutting-edge research area, demonstrating superior performance in low-light conditions, scenarios with motion blur, and rapid movements. Current detectors leverage spiking neural networks, Transformers, or convolutional neural networks as their core architectures, each with its own set of limitations including restricted performance, high computational overhead, or limited local receptive fields. This paper introduces a novel MoE (Mixture of Experts) heat conduction-based object detection algorithm that strikingly balances accuracy and computational efficiency. Initially, we employ a stem network for event data embedding, followed by processing through our innovative MoE-HCO blocks. Each block integrates various expert modules to mimic heat conduction within event streams. Subsequently, an IoU-based query selection module is utilized for efficient token extraction, which is then channeled into a detection head for the final object detection process. Furthermore, we are pleased to introduce EvDET200K, a novel benchmark dataset for event-based object detection. Captured with a high-definition Prophesee EVK4-HD event camera, this dataset encompasses 10 distinct categories, 200,000 bounding boxes, and 10,054 samples, each spanning 2 to 5 seconds. We also provide comprehensive results from over 15 state-of-the-art detectors, offering a solid foundation for future research and comparison. The source code of this paper will be released on: https://github.com/Event-AHU/OpenEvDET
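The pipeline described in the abstract routes each event token through several expert modules inside an MoE-HCO block and blends the results. As a rough illustration of the mixture-of-experts step only — not the authors' implementation; `moe_mix`, the linear experts, and the gating network are all hypothetical stand-ins — a soft gate can weight per-token expert outputs like this:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def moe_mix(tokens, expert_weights, gate_weights):
    """Route every token through all experts and blend by gate scores.

    tokens: (N, D) event-token embeddings
    expert_weights: list of E matrices of shape (D, D), one per expert
    gate_weights: (D, E) gating projection
    """
    gates = softmax(tokens @ gate_weights)                        # (N, E) routing scores
    expert_out = np.stack([tokens @ W for W in expert_weights],
                          axis=1)                                 # (N, E, D)
    return (gates[..., None] * expert_out).sum(axis=1)            # (N, D) blended output

rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 16))                    # 8 toy event tokens
experts = [rng.normal(size=(16, 16)) for _ in range(4)]
gate_w = rng.normal(size=(16, 4))
out = moe_mix(tokens, experts, gate_w)               # shape (8, 16)
```

With a single expert the softmax gate collapses to 1, so the block reduces to an ordinary linear layer — a quick sanity check on the routing math.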
Related papers
- Sparse Convolutional Recurrent Learning for Efficient Event-based Neuromorphic Object Detection [4.362139927929203]
We propose the Sparse Event-based Efficient Detector (SEED) for efficient event-based object detection on neuromorphic processors. We introduce sparse convolutional recurrent learning, which achieves over 92% activation sparsity in recurrent processing, vastly reducing the cost of reasoning on sparse event data.
arXiv Detail & Related papers (2025-06-16T12:54:27Z) - Dynamic Graph Induced Contour-aware Heat Conduction Network for Event-based Object Detection [42.021851148914145]
Event-based Vision Sensors (EVS) have demonstrated significant advantages over traditional RGB frame-based cameras in low-light conditions. This paper proposes a novel dynamic graph induced contour-aware heat conduction network for event stream based object detection.
arXiv Detail & Related papers (2025-05-19T09:44:01Z) - Asynchronous Multi-Object Tracking with an Event Camera [4.001017064909953]
We present the Asynchronous Event Multi-Object Tracking (AEMOT) algorithm for detecting and tracking multiple objects by processing individual raw events asynchronously. We evaluate AEMOT on a new Bee Swarm dataset, where it tracks dozens of small bees with precision and recall performance exceeding that of alternative event-based detection and tracking algorithms by over 37%.
arXiv Detail & Related papers (2025-05-12T23:53:08Z) - SuperEIO: Self-Supervised Event Feature Learning for Event Inertial Odometry [6.552812892993662]
Event cameras asynchronously output low-latency event streams, promising for state estimation in high-speed motion and challenging lighting conditions.
We propose SuperEIO, a novel framework that leverages learning-based event-only detection and IMU measurements to achieve event-inertial odometry.
We evaluate our method extensively on multiple public datasets, demonstrating its superior accuracy and robustness compared to other state-of-the-art event-based methods.
arXiv Detail & Related papers (2025-03-29T03:58:15Z) - Practical Video Object Detection via Feature Selection and Aggregation [18.15061460125668]
Video object detection (VOD) must contend with high across-frame variation in object appearance and diverse deterioration in some frames.
Most contemporary aggregation methods are tailored for two-stage detectors and suffer from high computational costs.
This study introduces a simple yet potent strategy of feature selection and aggregation, gaining significant accuracy at marginal computational expense.
arXiv Detail & Related papers (2024-07-29T02:12:11Z) - DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition [51.96660522869841]
DailyDVS-200 is a benchmark dataset tailored for the event-based action recognition community.
It covers 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences.
DailyDVS-200 is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions.
arXiv Detail & Related papers (2024-07-06T15:25:10Z) - SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z) - EventTransAct: A video transformer-based framework for Event-camera based action recognition [52.537021302246664]
Event cameras offer new opportunities compared to standard action recognition in RGB videos.
In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame.
In order to better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
arXiv Detail & Related papers (2023-08-25T23:51:07Z) - Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
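The pillar representation summarised above bins events into an x-y-t grid per polarity. This can be sketched with plain NumPy — `events_to_pillars`, the normalised coordinates, and the toy grid sizes are illustrative assumptions, not the paper's actual preprocessing:

```python
import numpy as np

def events_to_pillars(events, H=32, W=32, T=4):
    """Bin events (x, y, t, polarity) into a (2, T, H, W) count grid:
    one channel per polarity, T temporal slices over the x-y plane.
    Coordinates and timestamps are assumed normalised to [0, 1)."""
    x, y, t, p = events.T
    xi = np.clip((x * W).astype(int), 0, W - 1)
    yi = np.clip((y * H).astype(int), 0, H - 1)
    ti = np.clip((t * T).astype(int), 0, T - 1)
    pi = (p > 0).astype(int)            # polarity channel: 0 = negative, 1 = positive
    grid = np.zeros((2, T, H, W))
    np.add.at(grid, (pi, ti, yi, xi), 1.0)   # unbuffered scatter-add of event counts
    return grid

rng = np.random.default_rng(1)
ev = np.column_stack([rng.random(100), rng.random(100),
                      rng.random(100), rng.choice([-1, 1], 100)])
pillars = events_to_pillars(ev)          # shape (2, 4, 32, 32)
```

`np.add.at` is used instead of plain fancy-index assignment so that multiple events falling in the same cell all get counted.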
arXiv Detail & Related papers (2023-03-17T12:12:41Z) - Motion Robust High-Speed Light-Weighted Object Detection With Event Camera [24.192961837270172]
We propose a motion robust and high-speed detection pipeline which better leverages the event data.
Experiments on two typical real-scene event camera object detection datasets show that our method is competitive in terms of accuracy, efficiency, and the number of parameters.
arXiv Detail & Related papers (2022-08-24T15:15:24Z) - EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel reusing technique by sharing the weights of candidate operations on one edge and consolidating them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
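The kernel reusing technique, as summarised, consolidates the candidate operations on one edge into a single convolution. A minimal sketch under the assumption that the candidates are a zero-padded 3x3 kernel and a 5x5 kernel blended by softmaxed architecture weights (`consolidate_kernels` is a hypothetical name, not EAutoDet's API):

```python
import numpy as np

def consolidate_kernels(k3, k5, alpha):
    """Merge a 3x3 and a 5x5 candidate kernel into one 5x5 convolution
    by zero-padding the smaller kernel and blending with softmaxed
    architecture weights alpha (shape (2,))."""
    w = np.exp(alpha) / np.exp(alpha).sum()   # softmax over candidate ops
    k3_padded = np.pad(k3, 1)                 # 3x3 -> 5x5, zeros at the border
    return w[0] * k3_padded + w[1] * k5

k3 = np.ones((3, 3))
k5 = np.ones((5, 5))
merged = consolidate_kernels(k3, k5, np.array([0.0, 0.0]))  # equal weights
```

Because convolution is linear, applying the merged kernel equals the weighted sum of applying each candidate separately, which is what lets the search run one convolution per edge instead of one per candidate.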
arXiv Detail & Related papers (2022-03-21T05:56:12Z) - Motion Vector Extrapolation for Video Object Detection [0.0]
MOVEX enables low latency video object detection on common CPU based systems.
We show that our approach significantly reduces the baseline latency of any given object detector.
Further latency reduction, up to 25x lower than the original latency, can be achieved with minimal accuracy loss.
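The latency saving comes from running the heavy detector only occasionally and extrapolating its boxes along motion vectors in between. A toy version of that box update, assuming a constant per-box motion vector in pixels per frame (`extrapolate_box` is hypothetical, not the MOVEX API):

```python
def extrapolate_box(box, motion_vec, dt=1.0):
    """Shift a detection box (x1, y1, x2, y2) by a motion vector (dx, dy)
    scaled by the number of frames dt since the last real detection."""
    x1, y1, x2, y2 = box
    dx, dy = motion_vec[0] * dt, motion_vec[1] * dt
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)

# Box detected 3 frames ago, moving 2 px/frame right and 1 px/frame up:
box = extrapolate_box((10.0, 20.0, 50.0, 60.0), (2.0, -1.0), dt=3.0)
# box == (16.0, 17.0, 56.0, 57.0)
```

The detector's accuracy loss then depends only on how well the constant-motion assumption holds between detector runs.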
arXiv Detail & Related papers (2021-04-18T17:26:37Z) - EBBINNOT: A Hardware Efficient Hybrid Event-Frame Tracker for Stationary Dynamic Vision Sensors [5.674895233111088]
This paper presents a hybrid event-frame approach for detecting and tracking objects recorded by a stationary neuromorphic sensor.
To exploit the background removal property of a static DVS, we propose an event-based binary image creation that signals presence or absence of events in a frame duration.
This is the first time a stationary DVS based traffic monitoring solution is extensively compared to simultaneously recorded RGB frame-based methods.
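The event-based binary image described above simply records the presence or absence of events per pixel over one frame duration; since a static DVS emits few background events, moving objects stand out directly in this map. A minimal sketch at toy resolution (`event_binary_image` is a hypothetical name):

```python
import numpy as np

def event_binary_image(xs, ys, H=8, W=8):
    """Mark each pixel that fired at least one event during the frame
    window, regardless of how many events or their polarity."""
    img = np.zeros((H, W), dtype=bool)
    img[ys, xs] = True          # duplicate events collapse to a single True
    return img

# Two events at pixel (x=1, y=2) and one at (x=6, y=3):
img = event_binary_image(np.array([1, 1, 6]), np.array([2, 2, 3]))
```

Thresholding on presence rather than event count is what makes the representation cheap enough for the hardware-efficient pipeline.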
arXiv Detail & Related papers (2020-05-31T03:01:35Z) - Dynamic Refinement Network for Oriented and Densely Packed Object Detection [75.29088991850958]
We present a dynamic refinement network that consists of two novel components, i.e., a feature selection module (FSM) and a dynamic refinement head (DRH).
Our FSM enables neurons to adjust receptive fields in accordance with the shapes and orientations of target objects, whereas the DRH empowers our model to refine the prediction dynamically in an object-aware manner.
We perform quantitative evaluations on several publicly available benchmarks including DOTA, HRSC2016, SKU110K, and our own SKU110K-R dataset.
arXiv Detail & Related papers (2020-05-20T11:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.