Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor
and Event-Stream Dataset
- URL: http://arxiv.org/abs/2004.13652v2
- Date: Fri, 1 May 2020 16:59:14 GMT
- Title: Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor
and Event-Stream Dataset
- Authors: Bin Li, Hu Cao, Zhongnan Qu, Yingbai Hu, Zhenke Wang, and Zichen Liang
- Abstract summary: Compared to traditional frame-based computer vision,
neuromorphic vision is a small and young research community.
We construct a robotic grasping dataset named the Event-Stream Dataset with 91 objects.
As the LEDs blink at high frequency, the Event-Stream Dataset is annotated at a high frequency of 1 kHz.
We develop a deep neural network for grasping detection that treats the angle learning problem as classification instead of regression.
- Score: 8.030163836902299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robotic grasping plays an important role in the field of robotics. The
current state-of-the-art robotic grasping detection systems are usually built
on conventional vision sensors, such as RGB-D cameras. Compared to traditional
frame-based computer vision, neuromorphic vision is a small and young research
community. Currently, there are limited event-based datasets due to the
troublesome annotation of the asynchronous event stream; annotating a
large-scale vision dataset often takes considerable resources, especially for
video-level annotation. In this work, we consider the problem of detecting
robotic grasps in a moving camera view of a scene containing objects. To obtain
more agile robotic perception, a neuromorphic vision sensor (DAVIS) attached to
the robot gripper is introduced to explore its potential for grasping
detection. We construct a robotic grasping dataset named the Event-Stream
Dataset with 91 objects. A spatio-temporal mixed particle filter (SMP Filter)
is proposed to track the LED-based grasp rectangles, which enables video-level
annotation of a single grasp rectangle per object. As the LEDs blink at high
frequency, the Event-Stream Dataset is annotated at a high frequency of 1 kHz.
Based on the Event-Stream Dataset, we develop a deep neural network for
grasping detection that treats the angle learning problem as classification
instead of regression. The method achieves high detection accuracy on our
Event-Stream Dataset, with 93% precision at the object-wise level. This work
provides a large-scale, well-annotated dataset and promotes neuromorphic vision
applications in agile robotics.
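As a concrete illustration of the angle-learning idea described above, the following is a minimal, hedged sketch in PyTorch. It is not the authors' published network: the bin count (NUM_ANGLE_BINS), feature size, and layer names are assumptions made for illustration. The grasp angle is discretized into bins and trained with a cross-entropy loss, while the remaining rectangle parameters are regressed.
```python
# Hedged sketch of grasp-angle classification (not the authors' published
# network). NUM_ANGLE_BINS, the feature size, and the layer names are
# illustrative assumptions.
import torch
import torch.nn as nn

NUM_ANGLE_BINS = 18                      # e.g. 10-degree bins covering [0, 180)
BIN_WIDTH = 180.0 / NUM_ANGLE_BINS

def angle_to_class(angle_deg: torch.Tensor) -> torch.Tensor:
    """Map a continuous grasp angle in degrees to a discrete bin index."""
    return torch.clamp((angle_deg % 180.0) / BIN_WIDTH,
                       max=NUM_ANGLE_BINS - 1).long()

def class_to_angle(cls_idx: torch.Tensor) -> torch.Tensor:
    """Map a bin index back to the bin-centre angle in degrees."""
    return (cls_idx.float() + 0.5) * BIN_WIDTH

class GraspHead(nn.Module):
    """Regresses (x, y, w, h) of the grasp rectangle; classifies its angle."""
    def __init__(self, in_features: int = 512):
        super().__init__()
        self.box = nn.Linear(in_features, 4)                 # regression branch
        self.angle = nn.Linear(in_features, NUM_ANGLE_BINS)  # classification branch

    def forward(self, feats):
        return self.box(feats), self.angle(feats)

# Train the angle branch with cross-entropy instead of an L2 loss on degrees.
feats = torch.randn(8, 512)                 # stand-in for backbone features
gt_angles = torch.rand(8) * 180.0
box_pred, angle_logits = GraspHead()(feats)
angle_loss = nn.CrossEntropyLoss()(angle_logits, angle_to_class(gt_angles))
pred_angles = class_to_angle(angle_logits.argmax(dim=1))   # decode at inference
```
Framing the angle as a class label avoids the wrap-around discontinuity near 0°/180° that a naive regression loss on the raw angle would have to handle.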
Related papers
- Spatio-temporal Transformers for Action Unit Classification with Event Cameras [28.98336123799572]
We present FACEMORPHIC, a temporally synchronized multimodal face dataset composed of RGB videos and event streams.
We show how temporal synchronization can allow effective neuromorphic face analysis without the need to manually annotate videos.
arXiv Detail & Related papers (2024-10-29T11:23:09Z) - EventTransAct: A video transformer-based framework for Event-camera
based action recognition [52.537021302246664]
Event cameras offer new opportunities for action recognition compared to standard RGB videos.
In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame.
To better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
arXiv Detail & Related papers (2023-08-25T23:51:07Z) - HabitatDyn Dataset: Dynamic Object Detection to Kinematics Estimation [16.36110033895749]
We propose HabitatDyn, a dataset that contains synthetic RGB videos, semantic labels, and depth information, as well as kinetics information.
HabitatDyn was created from the perspective of a mobile robot with a moving camera, and contains 30 scenes featuring six different types of moving objects with varying velocities.
arXiv Detail & Related papers (2023-04-21T09:57:35Z) - EV-Catcher: High-Speed Object Catching Using Low-latency Event-based
Neural Networks [107.62975594230687]
We demonstrate an application where event cameras excel: accurately estimating the impact location of fast-moving objects.
We introduce a lightweight event representation called Binary Event History Image (BEHI) to encode event data at low latency (see the sketch after this list).
We show that the system is capable of achieving a success rate of 81% in catching balls targeted at different locations, with a velocity of up to 13 m/s even on compute-constrained embedded platforms.
arXiv Detail & Related papers (2023-04-14T15:23:28Z) - Dual Memory Aggregation Network for Event-Based Object Detection with
Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as a 3D tensor representation (see the sketch after this list).
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
arXiv Detail & Related papers (2023-03-17T12:12:41Z) - Differentiable Frequency-based Disentanglement for Aerial Video Action
Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic
Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z) - Moving Object Detection for Event-based vision using Graph Spectral
Clustering [6.354824287948164]
Moving object detection has been a central topic of discussion in computer vision for its wide range of applications.
We present an unsupervised Graph Spectral Clustering technique for Moving Object Detection in Event-based data.
We additionally show how the optimum number of moving objects can be automatically determined.
arXiv Detail & Related papers (2021-09-30T10:19:22Z) - An Analysis of Deep Object Detectors For Diver Detection [19.14344722263869]
We produce a dataset of approximately 105,000 annotated images of divers sourced from videos.
We train a variety of state-of-the-art deep neural networks for object detection, including SSD with Mobilenet, Faster R-CNN, and YOLO.
Based on our results, we recommend Tiny-YOLOv4 for real-time applications on robots.
arXiv Detail & Related papers (2020-11-25T01:50:32Z) - Fast Motion Understanding with Spatiotemporal Neural Networks and
Dynamic Vision Sensors [99.94079901071163]
This paper presents a Dynamic Vision Sensor (DVS) based system for reasoning about high speed motion.
We consider the case of a robot at rest reacting to a small, fast-approaching object at speeds higher than 15 m/s.
We highlight the results of our system on a toy dart moving at 23.4 m/s, with a 24.73° error in $\theta$, an 18.4 mm average discretized radius prediction error, and a 25.03% median time-to-collision prediction error.
arXiv Detail & Related papers (2020-11-18T17:55:07Z)
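For the EV-Catcher entry above, the Binary Event History Image (BEHI) is only named, not defined. One plausible reading, given here purely as an assumption-based sketch, is a per-pixel binary map that marks whether any event fell on a pixel within a recent time window; the function name, window length, and sensor size are hypothetical.
```python
# Assumption-based sketch of a Binary Event History Image: a per-pixel binary
# map that is 1 wherever at least one event occurred within a recent time
# window. The function name, window length, and sensor size are hypothetical.
import numpy as np

def binary_event_history_image(xs, ys, ts, t_now, history=0.01,
                               sensor_hw=(260, 346)):
    """Mark a pixel 1 if an event hit it within the last `history` seconds."""
    behi = np.zeros(sensor_hw, dtype=np.uint8)
    recent = ts >= (t_now - history)
    behi[ys[recent], xs[recent]] = 1
    return behi

# Example with synthetic events on a DAVIS-sized sensor.
rng = np.random.default_rng(0)
n = 5000
xs, ys = rng.integers(0, 346, n), rng.integers(0, 260, n)
ts = np.sort(rng.random(n))
print(binary_event_history_image(xs, ys, ts, t_now=ts[-1]).sum())
```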
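The Dual Memory Aggregation entry describes binning the event stream into x-y-t grids with separate positive/negative polarity channels. Below is a minimal NumPy sketch of that kind of voxelization, not the paper's implementation; the grid resolution, number of time bins, and function name are assumptions.
```python
# Minimal NumPy sketch of binning an asynchronous event stream into an
# x-y-t grid with separate positive/negative polarity channels. This is an
# illustration of the idea only; grid sizes, the number of time bins, and the
# function name are assumptions, not the paper's implementation.
import numpy as np

def events_to_pillars(x, y, t, p, sensor_hw=(260, 346), num_time_bins=8):
    """x, y: pixel coordinates; t: timestamps; p: polarity in {+1, -1}.
    Returns an event-count tensor of shape (2, num_time_bins, H, W)."""
    H, W = sensor_hw
    grid = np.zeros((2, num_time_bins, H, W), dtype=np.float32)
    # Normalise timestamps into [0, num_time_bins) and clip the final edge.
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9) * num_time_bins
    t_bin = np.clip(t_norm.astype(int), 0, num_time_bins - 1)
    pol = (p > 0).astype(int)          # channel 1 for positive, 0 for negative
    np.add.at(grid, (pol, t_bin, y, x), 1.0)
    return grid

# Example: 1000 synthetic events on a DAVIS-sized (260 x 346) sensor.
rng = np.random.default_rng(0)
n = 1000
pillars = events_to_pillars(x=rng.integers(0, 346, n), y=rng.integers(0, 260, n),
                            t=np.sort(rng.random(n)), p=rng.choice([-1, 1], n))
print(pillars.shape)                   # (2, 8, 260, 346)
```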
This list is automatically generated from the titles and abstracts of the papers on this site.