Dynamic Graph Induced Contour-aware Heat Conduction Network for Event-based Object Detection
- URL: http://arxiv.org/abs/2505.12908v1
- Date: Mon, 19 May 2025 09:44:01 GMT
- Authors: Xiao Wang, Yu Jin, Lan Chen, Bo Jiang, Lin Zhu, Yonghong Tian, Jin Tang, Bin Luo
- Abstract summary: Event-based Vision Sensors (EVS) have demonstrated significant advantages over traditional RGB frame-based cameras in low-light conditions. This paper proposes a novel dynamic graph induced contour-aware heat conduction network for event stream based object detection.
- Score: 42.021851148914145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event-based Vision Sensors (EVS) have demonstrated significant advantages over traditional RGB frame-based cameras in low-light conditions, high-speed motion capture, and low latency. Consequently, object detection based on EVS has attracted increasing attention from researchers. Current event stream object detection algorithms are typically built upon Convolutional Neural Networks (CNNs) or Transformers, which either capture limited local features using convolutional filters or incur high computational costs due to the utilization of self-attention. Recently proposed vision heat conduction backbone networks have shown a good balance between efficiency and accuracy; however, these models are not specifically designed for event stream data. They exhibit weak capability in modeling object contour information and fail to exploit the benefits of multi-scale features. To address these issues, this paper proposes a novel dynamic graph induced contour-aware heat conduction network for event stream based object detection, termed CvHeat-DET. The proposed model effectively leverages the clear contour information inherent in event streams to predict the thermal diffusivity coefficients within the heat conduction model, and integrates hierarchical structural graph features to enhance feature learning across multiple scales. Extensive experiments on three benchmark datasets for event stream-based object detection fully validated the effectiveness of the proposed model. The source code of this paper will be released on https://github.com/Event-AHU/OpenEvDET.
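The abstract does not spell out the heat conduction formulation, so the following is only an illustrative sketch and not the CvHeat-DET implementation. It assumes the standard heat equation du/dt = k * laplacian(u) on a 2D grid, solved with one explicit finite-difference step. The function `heat_step`, the grid size, the step size `dt`, and the hand-written diffusivity maps are all hypothetical; in the paper the diffusivity coefficients are predicted from contour information rather than fixed by hand.

```python
def heat_step(u, k, dt=0.2):
    """One explicit finite-difference step of du/dt = k * laplacian(u).

    u: 2D list of feature values (the "temperature" field).
    k: 2D list of per-cell diffusivities (hypothetical stand-in for
       coefficients a network would predict from contour cues).
    Border values are replicated, so no heat flows out of the grid.
    """
    h, w = len(u), len(u[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            up = u[max(i - 1, 0)][j]
            down = u[min(i + 1, h - 1)][j]
            left = u[i][max(j - 1, 0)]
            right = u[i][min(j + 1, w - 1)]
            # 5-point discrete Laplacian
            lap = up + down + left + right - 4.0 * u[i][j]
            out[i][j] = u[i][j] + dt * k[i][j] * lap
    return out


# A single hot cell in the middle of a 5x5 grid.
u0 = [[0.0] * 5 for _ in range(5)]
u0[2][2] = 1.0

# Uniform diffusivity versus a map that is zero along column 3,
# so cells on that assumed "contour" do not take up any heat.
k_uniform = [[1.0] * 5 for _ in range(5)]
k_contour = [[0.0 if j == 3 else 1.0 for j in range(5)] for _ in range(5)]

u_uniform = heat_step(u0, k_uniform)
u_contour = heat_step(u0, k_contour)
```

With uniform diffusivity the heat spreads symmetrically to all four neighbours, while the zero-diffusivity column stays cold in the second case. This is the intuition behind letting a spatially varying coefficient shape how features propagate. The explicit scheme is only stable when `dt * k` stays at or below 0.25 on a unit grid.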
Related papers
- Event-based Graph Representation with Spatial and Motion Vectors for Asynchronous Object Detection [20.537672896807063]
Event-based sensors offer high temporal resolution and irregular latency. Converting this data into dense tensors for use in standard neural networks diminishes these inherent advantages. We propose a novel multitemporal representation to better capture spatial structure and temporal changes.
arXiv Detail & Related papers (2025-07-20T23:02:23Z)
- Hybrid Spiking Vision Transformer for Object Detection with Event Cameras [19.967565219584056]
Spiking Neural Networks (SNNs) have emerged as a promising approach, offering low energy consumption and rich dynamics. This study proposes a novel hybrid Transformer (HsVT) model to enhance the performance of event-based object detection. Experimental results demonstrate that HsVT achieves significant performance improvements in event detection with fewer parameters.
arXiv Detail & Related papers (2025-05-12T16:19:20Z)
- GazeSCRNN: Event-based Near-eye Gaze Tracking using a Spiking Neural Network [0.0]
This work introduces GazeSCRNN, a novel convolutional recurrent neural network designed for event-based near-eye gaze tracking. The model processes event streams from DVS cameras using Adaptive Leaky-Integrate-and-Fire (ALIF) neurons and a hybrid architecture for spatio-temporal data. The most accurate model achieved a Mean Angle Error (MAE) of 6.034° and a Mean Pupil Error (MPE) of 2.094 mm.
arXiv Detail & Related papers (2025-03-20T10:32:15Z)
- UCF-Crime-DVS: A Novel Event-Based Dataset for Video Anomaly Detection with Spiking Neural Networks [7.079697386550486]
Dynamic vision sensors (DVS) capture visual information as discrete events with a very high dynamic range and temporal resolution. To introduce this rich dynamic information into the surveillance field, we created the first DVS video anomaly detection benchmark, UCF-Crime-DVS. To fully utilize this new data modality, a multi-scale spiking fusion network (MSF) is designed based on spiking neural networks (SNNs).
arXiv Detail & Related papers (2025-03-17T08:11:26Z)
- Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset [33.760645122753786]
This paper introduces a novel MoE (Mixture of Experts) heat conduction-based object detection algorithm. We also introduce EvDET200K, a novel benchmark dataset for event-based object detection.
arXiv Detail & Related papers (2024-12-09T16:40:34Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Object-centric Cross-modal Feature Distillation for Event-based Object Detection [87.50272918262361]
RGB detectors still outperform event-based detectors due to sparsity of the event data and missing visual details.
We develop a novel knowledge distillation approach to shrink the performance gap between these two modalities.
We show that object-centric distillation significantly improves the performance of the event-based student object detector.
arXiv Detail & Related papers (2023-11-09T16:33:08Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in x-y-t coordinates for both positive and negative polarity, producing a set of pillars as a 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- ChiNet: Deep Recurrent Convolutional Learning for Multimodal Spacecraft Pose Estimation [3.964047152162558]
This paper presents an innovative deep learning pipeline which estimates the relative pose of a spacecraft by incorporating the temporal information from a rendezvous sequence.
It leverages the performance of long short-term memory (LSTM) units in modelling sequences of data for the processing of features extracted by a convolutional neural network (CNN) backbone.
Three distinct training strategies, which follow a coarse-to-fine funnelled approach, are combined to facilitate feature learning and improve end-to-end pose estimation by regression.
arXiv Detail & Related papers (2021-08-23T16:48:58Z)
- MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking [72.65494220685525]
We propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data.
We generate dynamic modality-aware filters with two independent networks. The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps respectively.
To address issues caused by heavy occlusion, fast motion, and out-of-view, we propose to conduct a joint local and global search by exploiting a new direction-aware target-driven attention mechanism.
arXiv Detail & Related papers (2021-07-22T03:10:51Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras produce brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.