Streaming Object Detection for 3-D Point Clouds
- URL: http://arxiv.org/abs/2005.01864v1
- Date: Mon, 4 May 2020 21:55:15 GMT
- Title: Streaming Object Detection for 3-D Point Clouds
- Authors: Wei Han, Zhengdong Zhang, Benjamin Caine, Brandon Yang, Christoph
Sprunk, Ouais Alsharif, Jiquan Ngiam, Vijay Vasudevan, Jonathon Shlens,
Zhifeng Chen
- Abstract summary: LiDAR provides a prominent sensory modality that informs many existing perceptual systems.
The latency for perceptual systems based on point cloud data can be dominated by the amount of time for a complete rotational scan.
We show how operating on LiDAR data in its native streaming formulation offers several advantages for self-driving object detection.
- Score: 29.465873948076766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous vehicles operate in a dynamic environment, where the speed with
which a vehicle can perceive and react impacts the safety and efficacy of the
system. LiDAR provides a prominent sensory modality that informs many existing
perceptual systems including object detection, segmentation, motion estimation,
and action recognition. The latency for perceptual systems based on point cloud
data can be dominated by the amount of time for a complete rotational scan
(e.g. 100 ms). This built-in data-capture latency is artificial: it arises from
treating the point cloud as a camera image in order to leverage camera-inspired
architectures. However, unlike camera sensors, most LiDAR point cloud data is
natively a streaming data source in which laser reflections are sequentially
recorded based on the precession of the laser beam. In this work, we explore
how to build an object detector that removes this artificial latency
constraint, and instead operates on native streaming data in order to
significantly reduce latency. This approach has the added benefit of reducing
the peak computational burden on inference hardware by spreading the
computation over the acquisition time for a scan. We demonstrate a family of
streaming detection systems based on sequential modeling through a series of
modifications to the traditional detection meta-architecture. We highlight how
this model may achieve competitive if not superior predictive performance with
state-of-the-art, traditional non-streaming detection systems while achieving
significant latency gains (e.g. 1/15th to 1/3rd of the peak latency). Our results
show that operating on LiDAR data in its native streaming formulation offers
several advantages for self-driving object detection -- advantages that we hope
will be useful for any LiDAR perception system where minimizing latency is
critical for safe and efficient operation.
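The streaming formulation described in the abstract can be illustrated with a minimal sketch: instead of buffering a full 360-degree sweep (roughly 100 ms), a stateful detector consumes each angular slice as it arrives, carrying context forward and emitting detections per slice. All names here (`SliceDetector`, `process_slice`, the intensity-threshold "detector") are illustrative stand-ins, not the paper's actual architecture.

```python
from dataclasses import dataclass, field

@dataclass
class SliceDetector:
    """Toy stateful detector that consumes LiDAR slices sequentially."""
    state: list = field(default_factory=list)  # stand-in for an RNN hidden state

    def process_slice(self, points):
        # Update the recurrent context with this slice's points.
        self.state.append(len(points))
        # Emit "detections" immediately; here, any point with intensity > 0.5.
        return [p for p in points if p[3] > 0.5]

def run_streaming(detector, sweep, n_slices=8):
    """Split a full sweep into n_slices angular slices. Detections are
    emitted per slice, so worst-case capture-to-detection latency is
    roughly 1/n_slices of a full scan, and compute is spread over the
    acquisition time instead of spiking once per sweep."""
    per_slice = max(1, len(sweep) // n_slices)
    detections = []
    for i in range(0, len(sweep), per_slice):
        detections.extend(detector.process_slice(sweep[i:i + per_slice]))
    return detections

# Usage: points are (x, y, z, intensity) tuples.
sweep = [(1.0, 0.0, 0.0, 0.9), (0.0, 1.0, 0.0, 0.2), (0.0, 0.0, 1.0, 0.8)]
dets = run_streaming(SliceDetector(), sweep, n_slices=3)
```

Spreading the computation this way is what reduces the peak burden on inference hardware: each slice's work is a fraction of the full sweep's.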
Related papers
- Real-time Stereo-based 3D Object Detection for Streaming Perception [12.52037626475608]
We introduce StreamDSGN, the first real-time stereo-based 3D object detection framework designed for streaming perception.
StreamDSGN directly predicts the 3D properties of objects in the next moment by leveraging historical information.
Compared with the strong baseline, StreamDSGN significantly improves the streaming average precision by up to 4.33%.
arXiv Detail & Related papers (2024-10-16T09:23:02Z) - Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences [25.74000325019015]
We introduce a novel LiDAR 3D object detection framework, namely LiSTM, to facilitate spatial-temporal feature learning with cross-frame motion forecasting information.
We conduct experiments on the nuScenes dataset and other benchmarks to demonstrate that the proposed framework achieves superior 3D detection performance.
arXiv Detail & Related papers (2024-09-06T16:29:04Z) - TimePillars: Temporally-Recurrent 3D LiDAR Object Detection [8.955064958311517]
TimePillars is a temporally-recurrent object detection pipeline.
It exploits the pillar representation of LiDAR data across time.
We show how basic building blocks are enough to achieve robust and efficient results.
arXiv Detail & Related papers (2023-12-22T10:25:27Z) - EV-Catcher: High-Speed Object Catching Using Low-latency Event-based
Neural Networks [107.62975594230687]
We demonstrate an application where event cameras excel: accurately estimating the impact location of fast-moving objects.
We introduce a lightweight event representation called Binary Event History Image (BEHI) to encode event data at low latency.
We show that the system is capable of achieving a success rate of 81% in catching balls targeted at different locations, with a velocity of up to 13 m/s even on compute-constrained embedded platforms.
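The Binary Event History Image summarized above can be sketched as a binary grid that marks pixels which received at least one event, built incrementally at very low latency. The function signature and event layout below are illustrative assumptions, not the EV-Catcher paper's actual API.

```python
def behi(events, width, height):
    """Build a toy Binary Event History Image.

    events: iterable of (x, y, t, polarity) tuples from an event camera.
    Returns a height x width grid of 0/1 values: 1 where any event
    occurred within the accumulation window, regardless of count or
    polarity -- which is what makes the representation cheap to update.
    """
    img = [[0] * width for _ in range(height)]
    for x, y, _t, _pol in events:
        if 0 <= x < width and 0 <= y < height:
            img[y][x] = 1  # binary presence, not an event count
    return img

# Usage: two events at pixel (1, 0) still yield a single 1 there.
events = [(1, 0, 0.001, 1), (1, 0, 0.002, -1), (3, 2, 0.004, 1)]
img = behi(events, width=4, height=3)
```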
arXiv Detail & Related papers (2023-04-14T15:23:28Z) - AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z) - StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
We endow the models with the capacity of predicting the future, significantly improving the results for streaming perception.
We consider driving scenes with multiple velocities and propose Velocity-aware streaming AP (VsAP) to jointly evaluate accuracy across them.
Our simple method achieves state-of-the-art performance on the Argoverse-HD dataset and improves sAP and VsAP by 4.7% and 8.2%, respectively.
arXiv Detail & Related papers (2022-07-21T12:03:02Z) - Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate latency and accuracy as a single metric for online video perception.
We build a simple and effective framework for streaming perception.
Our method achieves competitive performance on the Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z) - Efficient and Robust LiDAR-Based End-to-End Navigation [132.52661670308606]
We present an efficient and robust LiDAR-based end-to-end navigation framework.
We propose Fast-LiDARNet that is based on sparse convolution kernel optimization and hardware-aware model design.
We then propose Hybrid Evidential Fusion that directly estimates the uncertainty of the prediction from only a single forward pass.
arXiv Detail & Related papers (2021-05-20T17:52:37Z) - Motion Vector Extrapolation for Video Object Detection [0.0]
MOVEX enables low latency video object detection on common CPU based systems.
We show that our approach significantly reduces the baseline latency of any given object detector.
Further latency reduction, up to 25x lower than the original latency, can be achieved with minimal accuracy loss.
arXiv Detail & Related papers (2021-04-18T17:26:37Z) - StrObe: Streaming Object Detection from LiDAR Packets [73.27333924964306]
Rolling-shutter LiDAR data is emitted as a stream of packets, each covering a sector of the 360-degree field of view.
Modern perception algorithms wait for the full sweep to be built before processing the data, which introduces an additional latency.
In this paper we propose StrObe, a novel approach that minimizes latency by ingesting LiDAR packets and emitting a stream of detections without waiting for the full sweep to be built.
arXiv Detail & Related papers (2020-11-12T14:57:44Z)
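The packet-streaming idea StrObe shares with the main paper admits a back-of-the-envelope latency comparison: with full-sweep processing, a point captured early in the scan waits for the rest of the sweep plus all of the compute; with per-packet processing, it waits only for its own packet. The numbers below (a 100 ms sweep split into 10 packets, 2 ms of compute per packet's worth of points) are illustrative, not measurements from either paper.

```python
def worst_case_latency_ms(sweep_ms, n_packets, per_packet_compute_ms):
    """Capture-to-detection latency for the earliest-captured point.

    Full-sweep: wait out the whole sweep, then process every packet.
    Streaming: wait only for the current packet, then process it.
    """
    full_sweep = sweep_ms + n_packets * per_packet_compute_ms
    streaming = sweep_ms / n_packets + per_packet_compute_ms
    return full_sweep, streaming

full, stream = worst_case_latency_ms(sweep_ms=100.0, n_packets=10,
                                     per_packet_compute_ms=2.0)
# full = 120.0 ms, stream = 12.0 ms: a 10x reduction in worst-case latency
```

With these illustrative numbers the streaming worst case is a tenth of the full-sweep worst case, in the same spirit as the 1/15th-to-1/3rd peak-latency gains the main paper reports.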
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.