2nd Place Solution for Waymo Open Dataset Challenge - Real-time 2D
Object Detection
- URL: http://arxiv.org/abs/2106.08713v1
- Date: Wed, 16 Jun 2021 11:32:03 GMT
- Title: 2nd Place Solution for Waymo Open Dataset Challenge - Real-time 2D
Object Detection
- Authors: Yueming Zhang, Xiaolin Song, Bing Bai, Tengfei Xing, Chao Liu, Xin
Gao, Zhihui Wang, Yawei Wen, Haojin Liao, Guoshan Zhang, Pengfei Xu
- Abstract summary: In this report, we introduce a real-time method to detect 2D objects from images.
We leverage TensorRT to optimize the inference time of our detection pipeline.
Our framework achieves a latency of 45.8 ms/frame on an Nvidia Tesla V100 GPU.
- Score: 26.086623067939605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In an autonomous driving system, it is essential to recognize vehicles,
pedestrians and cyclists from images. Besides the high accuracy of the
prediction, the requirement of real-time running brings new challenges for
convolutional network models. In this report, we introduce a real-time method
to detect 2D objects from images. We aggregate several popular one-stage
object detectors and independently train models with a variety of input
strategies, to yield better performance for accurate multi-scale detection
of each category, especially for small objects. For model acceleration, we
leverage TensorRT to optimize the inference time of our detection pipeline. As
shown on the leaderboard, our proposed detection framework ranks 2nd with
75.00% L1 mAP and 69.72% L2 mAP in the real-time 2D detection track of the
Waymo Open Dataset Challenges, while achieving a latency of 45.8 ms/frame on
an Nvidia Tesla V100 GPU.
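The report does not ship code with this abstract, so the following is a minimal sketch of the two ingredients it names: pooling per-class detections from several independently trained one-stage detectors, and compiling a detector with TensorRT for low-latency inference. The function names, class indices, thresholds, and the trtexec flags in the closing comment are illustrative assumptions, not the authors' settings.

```python
# Sketch (not the authors' code): merge detections from several one-stage
# detectors with per-class NMS, then note the typical TensorRT deployment path.
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.5) -> list:
    """Greedy NMS over [x1, y1, x2, y2] boxes; returns the kept indices."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                  (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = order[1:][iou <= iou_thr]
    return keep

def merge_detections(per_model_dets: list, num_classes: int = 3) -> dict:
    """Pool boxes from several detectors and run per-class NMS.

    Each element of per_model_dets is a dict with 'boxes' (N, 4), 'scores' (N,)
    and 'labels' (N,); in this sketch 0 = vehicle, 1 = pedestrian, 2 = cyclist.
    """
    boxes = np.concatenate([d["boxes"] for d in per_model_dets], axis=0)
    scores = np.concatenate([d["scores"] for d in per_model_dets], axis=0)
    labels = np.concatenate([d["labels"] for d in per_model_dets], axis=0)
    merged = {}
    for c in range(num_classes):
        m = labels == c
        if m.any():
            keep = nms(boxes[m], scores[m], iou_thr=0.5)
            merged[c] = (boxes[m][keep], scores[m][keep])
    return merged

# For deployment, a trained detector is typically exported to ONNX and then
# compiled into a TensorRT engine; the paths and flags below are placeholders,
# not the authors' settings:
#   trtexec --onnx=detector.onnx --fp16 --saveEngine=detector.plan
```

In practice a score-weighted box fusion can replace the plain NMS here, and the score thresholds are commonly tuned per category for vehicles, pedestrians, and cyclists.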
Related papers
- Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for
Distracted Driver Action Recognition [8.841708075914353]
Temporal localization of driving actions is important for advanced driver-assistance systems and naturalistic driving studies.
We aim to improve temporal localization and classification accuracy by adapting video action recognition and 2D human pose estimation networks into one model.
The model performs well on the A2 test set of the 2023 NVIDIA AI City Challenge for naturalistic driving action recognition.
arXiv Detail & Related papers (2024-03-11T10:26:38Z) - ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every
Detection Box [81.45219802386444]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects across video frames.
We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes (a minimal sketch of this idea appears after this list).
In 3D scenarios, it is much easier for the tracker to predict object velocities in world coordinates.
arXiv Detail & Related papers (2023-03-27T15:35:21Z) - Optimizing Anchor-based Detectors for Autonomous Driving Scenes [22.946814647030667]
This paper summarizes model improvements and inference-time optimizations for the popular anchor-based detectors in autonomous driving scenes.
Based on the high-performing RCNN-RS and RetinaNet-RS detection frameworks, we study a set of framework improvements to adapt the detectors to better detect small objects in crowd scenes.
arXiv Detail & Related papers (2022-08-11T22:44:59Z) - AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D
Object Detection [17.526914782562528]
We propose AutoAlignV2, a faster and stronger multi-modal 3D detection framework, built on top of AutoAlign.
Our best model reaches 72.4 NDS on the nuScenes test leaderboard, achieving new state-of-the-art results.
arXiv Detail & Related papers (2022-07-21T06:17:23Z) - Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object
Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z) - Workshop on Autonomous Driving at CVPR 2021: Technical Report for
Streaming Perception Challenge [57.647371468876116]
We introduce our real-time 2D object detection system for the realistic autonomous driving scenario.
Our detector is built on a newly designed YOLO model, called YOLOX.
On the Argoverse-HD dataset, our system achieves 41.0 streaming AP, surpassing the second-place entry by 7.8/6.1 on the detection-only/full-stack tracks, respectively.
arXiv Detail & Related papers (2021-07-27T06:36:06Z) - Achieving Real-Time Object Detection on Mobile Devices with Neural
Pruning Search [45.20331644857981]
We propose a compiler-aware neural pruning search framework to achieve high-speed inference on autonomous vehicles for 2D and 3D object detection.
For the first time, the proposed method achieves (close-to) real-time computation, with 55 ms and 99 ms inference times for YOLOv4-based 2D object detection and PointPillars-based 3D detection, respectively.
arXiv Detail & Related papers (2021-06-28T18:59:20Z) - SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous
Driving [94.11868795445798]
We release a large-scale object detection benchmark for autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories.
To improve diversity, the images are collected at one frame every ten seconds across 32 different cities under different weather conditions, periods, and location scenes.
We provide extensive experiments and deep analyses of existing supervised state-of-the-art detection models, popular self-supervised and semi-supervised approaches, and some insights about how to develop future models.
arXiv Detail & Related papers (2021-06-21T13:55:57Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z) - 2nd Place Solution for Waymo Open Dataset Challenge -- 2D Object
Detection [7.807118356899879]
This report introduces a state-of-the-art 2D object detection system for autonomous driving scenarios.
We integrate both a popular two-stage detector and an anchor-free one-stage detector to yield robust detections.
arXiv Detail & Related papers (2020-06-28T04:50:16Z) - DR-SPAAM: A Spatial-Attention and Auto-regressive Model for Person
Detection in 2D Range Data [81.06749792332641]
We propose a person detection network which uses an alternative strategy to combine scans obtained at different times.
DR-SPAAM keeps the intermediate features from the backbone network as a template and recurrently updates the template when a new scan becomes available.
On the DROW dataset, our method outperforms the existing state-of-the-art, while being approximately four times faster.
arXiv Detail & Related papers (2020-04-29T11:01:44Z)
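As referenced in the ByteTrackV2 entry above, its association strategy keeps low-score detection boxes instead of discarding them. The sketch below illustrates only that two-threshold idea: tracks are matched to high-score boxes first, and the still-unmatched tracks then get a second chance against low-score boxes. The thresholds, the IoU cost, and the greedy matching are assumptions for illustration; the actual tracker additionally uses motion prediction (and, in 3D, world-coordinate velocities) and full track lifecycle management.

```python
# Sketch (assumptions, not the ByteTrackV2 implementation) of two-threshold
# data association: confident detections first, low-score detections second.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Track:
    box: List[float]                               # last matched [x1, y1, x2, y2]
    history: List[List[float]] = field(default_factory=list)

def iou(a: List[float], b: List[float]) -> float:
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def greedy_match(tracks: List[Track], dets: List[Dict], iou_thr: float):
    """Greedily pair tracks with detections by descending IoU."""
    pairs, used_t, used_d = [], set(), set()
    cands = sorted(((iou(t.box, d["box"]), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(dets)), reverse=True)
    for score, ti, di in cands:
        if score < iou_thr or ti in used_t or di in used_d:
            continue
        pairs.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    return pairs, used_t

def associate(tracks: List[Track], detections: List[Dict],
              high_thr: float = 0.6, low_thr: float = 0.1,
              iou_thr: float = 0.3) -> List[Track]:
    """Two-threshold association: high-score boxes first, low-score boxes second."""
    high = [d for d in detections if d["score"] >= high_thr]
    low = [d for d in detections if low_thr <= d["score"] < high_thr]

    # Stage 1: match every track against confident detections.
    pairs, used_t = greedy_match(tracks, high, iou_thr)
    for ti, di in pairs:
        tracks[ti].history.append(tracks[ti].box)
        tracks[ti].box = high[di]["box"]

    # Stage 2: unmatched tracks get a second chance against low-score boxes,
    # which often correspond to occluded but still-present objects.
    remaining = [ti for ti in range(len(tracks)) if ti not in used_t]
    pairs, _ = greedy_match([tracks[ti] for ti in remaining], low, iou_thr)
    for ri, di in pairs:
        ti = remaining[ri]
        tracks[ti].history.append(tracks[ti].box)
        tracks[ti].box = low[di]["box"]
    return tracks
```

In a full tracker, Hungarian matching usually replaces the greedy pairing step shown here.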