Event-based YOLO Object Detection: Proof of Concept for Forward
Perception System
- URL: http://arxiv.org/abs/2212.07181v1
- Date: Wed, 14 Dec 2022 12:12:29 GMT
- Title: Event-based YOLO Object Detection: Proof of Concept for Forward
Perception System
- Authors: Waseem Shariff, Muhammad Ali Farooq, Joe Lemley and Peter Corcoran
- Abstract summary: This study focuses on leveraging neuromorphic event data for roadside object detection.
In this article, the event-simulated A2D2 dataset is manually annotated and used to train two different YOLOv5 networks.
- Score: 0.3058685580689604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neuromorphic vision, or event vision, is an advanced vision technology:
in contrast to a visible-light camera that outputs pixel frames, an event camera
generates neuromorphic events whenever a brightness change in the field of view
(FOV) exceeds a specific threshold. This study focuses on leveraging neuromorphic
event data for roadside object detection, as a proof of concept towards building
artificial intelligence (AI) based pipelines for forward perception systems in
advanced vehicular applications. The focus is on building efficient,
state-of-the-art object detection networks with better inference results for
fast-moving forward perception using an event camera. In this article, the
event-simulated A2D2 dataset is manually annotated and used to train two
different YOLOv5 networks (small and large variants). To further assess
robustness, both single-model and ensemble-model testing are carried out.
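Because the study relies on an event-simulated A2D2 dataset, it may help to sketch the standard event-generation model the abstract describes: a pixel emits an event whenever its log-intensity changes by more than a contrast threshold since the last event at that pixel. The Python sketch below is a minimal, hypothetical frame-to-event converter (real pipelines typically use dedicated simulators such as ESIM or v2e); the function name and threshold value are illustrative, not from the paper.

```python
import numpy as np

def frames_to_events(frames, timestamps, threshold=0.2, eps=1e-6):
    """Emit (x, y, t, polarity) events wherever per-pixel log-intensity
    changes by more than `threshold` relative to the last event.

    frames:     list of HxW grayscale arrays with values in [0, 1]
    timestamps: one timestamp per frame
    Illustrative model only -- not the simulator used by the authors.
    """
    ref = np.log(frames[0] + eps)  # per-pixel reference log-intensity
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_i = np.log(frame + eps)
        diff = log_i - ref
        # ON events where brightness rose past the threshold, OFF events
        # where it fell past it. All events emitted for a frame share its
        # timestamp (a simplification of real asynchronous sensors).
        for polarity, mask in ((1, diff >= threshold), (-1, diff <= -threshold)):
            ys, xs = np.nonzero(mask)
            events.extend((x, y, t, polarity) for x, y in zip(xs, ys))
            ref[mask] = log_i[mask]  # reset reference where events fired
    return events
```

Simulated events of this kind are then typically accumulated into frame-like tensors so that a frame-based detector such as YOLOv5 can consume them.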
Related papers
- Distractor-aware Event-based Tracking [45.07711356111249]
We propose a distractor-aware event-based tracker, named DANet, that introduces transformer modules into a Siamese network architecture.
Our model is mainly composed of a motion-aware network and a target-aware network, which simultaneously exploit both motion cues and object contours from event data.
Our DANet can be trained in an end-to-end manner without any post-processing and can run at over 80 FPS on a single V100.
arXiv Detail & Related papers (2023-10-22T05:50:20Z)
- Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
- Recurrent Vision Transformers for Object Detection with Event Cameras [62.27246562304705]
We present Recurrent Vision Transformers (RVTs), a novel backbone for object detection with event cameras.
RVTs can be trained from scratch to reach state-of-the-art performance on event-based object detection.
Our study brings new insights into effective design choices that can be fruitful for research beyond event-based vision.
arXiv Detail & Related papers (2022-12-11T20:28:59Z)
- Recent Trends in 2D Object Detection and Applications in Video Event Recognition [0.76146285961466]
We discuss the pioneering works in object detection, followed by the recent breakthroughs that employ deep learning.
We highlight recent datasets for 2D object detection both in images and videos, and present a comparative performance summary of various state-of-the-art object detection techniques.
arXiv Detail & Related papers (2022-02-07T14:15:11Z)
- One-Shot Object Affordance Detection in the Wild [76.46484684007706]
Affordance detection refers to identifying the potential action possibilities of objects in an image.
We devise a One-Shot Affordance Detection Network (OSAD-Net) that estimates the human action purpose and then transfers it to help detect the common affordance from all candidate images.
With complex scenes and rich annotations, our PADv2 dataset can be used as a test bed to benchmark affordance detection methods.
arXiv Detail & Related papers (2021-08-08T14:53:10Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras output brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data for tasks such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
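As a generic illustration of why a recurrent model suits the depth task in the entry above: events are sparse and asynchronous, so a stateful network can integrate scene evidence across successive event tensors instead of predicting from a single slice. The PyTorch sketch below is a toy ConvLSTM-style depth head; it is an assumption-laden illustration of the general idea, not the architecture from that paper.

```python
import torch
import torch.nn as nn

class RecurrentDepthHead(nn.Module):
    """Toy recurrent head: integrates sparse event tensors over time and
    predicts a dense depth map. Illustrative only; not the paper's network."""

    def __init__(self, in_ch=5, hid_ch=32):
        super().__init__()
        # All four ConvLSTM gates computed jointly from input and hidden state.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel_size=3, padding=1)
        self.depth = nn.Conv2d(hid_ch, 1, kernel_size=1)

    def forward(self, event_slices):
        # event_slices: (T, B, C, H, W) sequence of voxelized event tensors
        T, B, _, H, W = event_slices.shape
        h = event_slices.new_zeros(B, self.depth.in_channels, H, W)
        c = torch.zeros_like(h)
        for t in range(T):
            i, f, o, g = self.gates(torch.cat([event_slices[t], h], dim=1)).chunk(4, dim=1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
        return torch.sigmoid(self.depth(h))  # normalized dense depth map

# Usage sketch: 10 event slices, batch of 2, 5 channels, 64x64 resolution.
depth = RecurrentDepthHead()(torch.randn(10, 2, 5, 64, 64))  # -> (2, 1, 64, 64)
```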
- Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor and Event-Stream Dataset [8.030163836902299]
Compared to traditional frame-based computer vision, neuromorphic vision is a small and young research community.
We construct a robotic grasping dataset named Event-Stream dataset with 91 objects.
As the LEDs blink at high frequency, the Event-Stream dataset is annotated at a high frequency of 1 kHz.
We develop a deep neural network for grasping detection which treats the angle learning problem as classification instead of regression (see the sketch below).
arXiv Detail & Related papers (2020-04-28T16:55:19Z)
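To make the classification-instead-of-regression idea from the entry above concrete: discretize the grasp angle into orientation bins and train with cross-entropy rather than an L2 regression loss. In the minimal sketch below, the bin count, the [0, pi) symmetry assumption, and all names are illustrative choices, not that paper's settings.

```python
import torch
import torch.nn.functional as F

NUM_BINS = 18  # e.g. 10-degree bins over [0, 180 degrees); illustrative choice

def angle_to_class(angle_rad: torch.Tensor) -> torch.Tensor:
    """Map continuous grasp angles (radians) to discrete orientation bins.
    Grasp angles are symmetric modulo pi, hence the [0, pi) range."""
    frac = torch.remainder(angle_rad, torch.pi) / torch.pi  # in [0, 1)
    return (frac * NUM_BINS).long().clamp(max=NUM_BINS - 1)

def angle_classification_loss(logits: torch.Tensor,
                              angle_rad: torch.Tensor) -> torch.Tensor:
    """Cross-entropy over angle bins instead of regressing the raw angle."""
    return F.cross_entropy(logits, angle_to_class(angle_rad))

# Usage sketch: logits would come from the network's angle head.
logits = torch.randn(4, NUM_BINS)   # batch of 4 predictions
angles = torch.rand(4) * torch.pi   # ground-truth angles in [0, pi)
loss = angle_classification_loss(logits, angles)
```

Classification sidesteps the wrap-around discontinuity of regressing a periodic angle and lets the network express multi-modal orientation hypotheses.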
- Traffic Signs Detection and Recognition System using Deep Learning [0.0]
This paper describes an approach for efficiently detecting and recognizing traffic signs in real-time.
We tackle the traffic sign detection problem using state-of-the-art multi-object detection systems.
The focus of this paper is on F-RCNN Inception v2 and Tiny YOLO v2, as they achieved the best results.
arXiv Detail & Related papers (2020-03-06T14:54:40Z)
- Training-free Monocular 3D Event Detection System for Traffic Surveillance [93.65240041833319]
Existing event detection systems are mostly learning-based and have achieved convincing performance when a large amount of training data is available.
In real-world scenarios, collecting sufficient labeled training data is expensive and sometimes impossible.
We propose a training-free monocular 3D event detection system for traffic surveillance.
arXiv Detail & Related papers (2020-02-01T04:42:57Z)