LIGA-Stereo: Learning LiDAR Geometry Aware Representations for
Stereo-based 3D Detector
- URL: http://arxiv.org/abs/2108.08258v1
- Date: Wed, 18 Aug 2021 17:24:40 GMT
- Title: LIGA-Stereo: Learning LiDAR Geometry Aware Representations for
Stereo-based 3D Detector
- Authors: Xiaoyang Guo, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
- Abstract summary: We propose LIGA-Stereo to learn stereo-based 3D detectors under the guidance of high-level geometry-aware representations of LiDAR-based detection models.
Compared with the state-of-the-art stereo detector, our method improves 3D detection performance on cars, pedestrians, and cyclists by 10.44%, 5.69%, and 5.97% mAP, respectively.
- Score: 80.7563981951707
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stereo-based 3D detection aims at detecting 3D object bounding boxes from
stereo images using intermediate depth maps or implicit 3D geometry
representations, which provides a low-cost solution for 3D perception. However,
its performance is still inferior compared with LiDAR-based detection
algorithms. To detect and localize accurate 3D bounding boxes, LiDAR-based
models can encode accurate object boundaries and surface normal directions from
LiDAR point clouds. However, the detection results of stereo-based detectors
are easily affected by the erroneous depth features due to the limitation of
stereo matching. To solve the problem, we propose LIGA-Stereo (LiDAR Geometry
Aware Stereo Detector) to learn stereo-based 3D detectors under the guidance of
high-level geometry-aware representations of LiDAR-based detection models. In
addition, we found that existing voxel-based stereo detectors fail to learn
semantic features effectively from indirect 3D supervision. We therefore attach
an auxiliary 2D detection head to provide direct 2D semantic supervision.
Experimental results show that these two strategies improve the geometric
and semantic representation capabilities. Compared with the state-of-the-art
stereo detector, our method improves 3D detection performance on cars,
pedestrians, and cyclists by 10.44%, 5.69%, and 5.97% mAP, respectively, on the
official KITTI benchmark. The gap between stereo-based and LiDAR-based 3D
detectors is further narrowed.
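The guidance idea in the abstract (a stereo detector learning from the features of a LiDAR-based detector) can be illustrated with a feature-imitation loss. The sketch below is only an assumption about what such a loss might look like: a masked mean-squared error between two feature maps of the same shape. The function name `imitation_loss`, the `(C, H, W)` layout, and the mask are hypothetical choices for illustration; the abstract does not specify the loss actually used.

```python
import numpy as np


def imitation_loss(stereo_feat, lidar_feat, mask):
    """Masked mean-squared error between two feature maps (illustrative only).

    stereo_feat, lidar_feat: float arrays of shape (C, H, W).
    mask: boolean array of shape (H, W); True where guidance is applied
          (assumed here to mark cells with reliable LiDAR features).
    """
    if stereo_feat.shape != lidar_feat.shape:
        raise ValueError("feature maps must have the same shape")

    # Squared difference, averaged over the channel axis -> shape (H, W).
    per_cell = ((stereo_feat - lidar_feat) ** 2).mean(axis=0)

    n_cells = int(mask.sum())
    if n_cells == 0:
        # No guided cells: contribute nothing to the loss.
        return 0.0

    # Average only over the masked cells.
    return float(per_cell[mask].sum() / n_cells)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stereo = rng.normal(size=(8, 4, 4))
    lidar = rng.normal(size=(8, 4, 4))
    guided = np.zeros((4, 4), dtype=bool)
    guided[1:3, 1:3] = True
    print(imitation_loss(stereo, lidar, guided))
```

Restricting the average to masked cells keeps empty regions, where a LiDAR feature map carries little information, from dominating the loss; whether the paper uses such a mask is not stated in the abstract.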
Related papers
- DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors [60.88824519770208]
Camera-based 3D object detectors are welcome due to their wider deployment and lower price than LiDAR sensors.
We revisit the prior stereo modeling DSGN about the stereo volume constructions for representing both 3D geometry and semantics.
We propose our approach, DSGN++, aiming for improving information flow throughout the 2D-to-3D pipeline.
arXiv Detail & Related papers (2022-04-06T18:43:54Z)
- Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving [14.582107328849473]
The gap in image-to-image generation for stereo views is much smaller than that in image-to-LiDAR generation.
Motivated by this, we propose a Pseudo-Stereo 3D detection framework with three novel virtual view generation methods.
Our framework ranks 1st on car, pedestrian, and cyclist among the monocular 3D detectors with publications on the KITTI-3D benchmark.
arXiv Detail & Related papers (2022-03-04T03:00:34Z)
- MonoDistill: Learning Spatial Features for Monocular 3D Object Detection [80.74622486604886]
We propose a simple and effective scheme to introduce the spatial information from LiDAR signals to the monocular 3D detectors.
We use the resulting data to train a 3D detector with the same architecture as the baseline model.
Experimental results show that the proposed method can significantly boost the performance of the baseline model.
arXiv Detail & Related papers (2022-01-26T09:21:41Z)
- EGFN: Efficient Geometry Feature Network for Fast Stereo 3D Object Detection [51.52496693690059]
Fast stereo based 3D object detectors lag far behind high-precision oriented methods in accuracy.
We argue that the main reason is the missing or poor 3D geometry feature representation in fast stereo based methods.
The proposed EGFN outperforms YOLOStereo3D, an advanced fast method, by 5.16% on mAP$_{3d}$ at a cost of only 12 ms more.
arXiv Detail & Related papers (2021-11-28T05:25:36Z)
- Stereo CenterNet based 3D Object Detection for Autonomous Driving [2.508414661327797]
We propose a 3D object detection method using geometric information in stereo images, called Stereo CenterNet.
Stereo CenterNet predicts the four semantic key points of the object's 3D bounding box and uses the 2D left and right boxes, 3D dimensions, orientation, and key points to recover the object's bounding box in 3D space.
Experiments conducted on the KITTI dataset show that our method achieves the best speed-accuracy trade-off compared with the state-of-the-art methods based on stereo geometry.
arXiv Detail & Related papers (2021-03-20T02:18:49Z)
- YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection [6.5702792909006735]
YOLOStereo3D is trained on one single GPU and runs at more than ten fps.
It demonstrates performance comparable to state-of-the-art stereo 3D detection frameworks without usage of LiDAR data.
arXiv Detail & Related papers (2021-03-17T03:43:54Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first depth estimation is performed, a pseudo LiDAR point cloud representation is computed from the depth estimates, and then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- DSGN: Deep Stereo Geometry Network for 3D Object Detection [79.16397166985706]
There is a large performance gap between image-based and LiDAR-based 3D object detectors.
Our method, called Deep Stereo Geometry Network (DSGN), significantly reduces this gap.
For the first time, we provide a simple and effective one-stage stereo-based 3D detection pipeline.
arXiv Detail & Related papers (2020-01-10T11:44:37Z)
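Several entries above refer to the two-step pipeline in which depth is estimated from stereo images and then converted into a pseudo-LiDAR point cloud. A minimal sketch of that conversion follows, assuming a rectified stereo pair and a pinhole camera model. The function names and the parameters `fx`, `fy`, `cx`, `cy`, and `baseline` are illustrative assumptions, not taken from any of the listed papers.

```python
import numpy as np


def disparity_to_depth(disparity, fx, baseline):
    """Convert a disparity map (pixels) to depth via depth = fx * baseline / disparity.

    Pixels with non-positive disparity are treated as invalid and set to 0.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.zeros_like(disparity)
    valid = disparity > 0
    depth[valid] = fx * baseline / disparity[valid]
    return depth


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into an (N, 3) point cloud in the camera frame.

    Only pixels with positive depth are kept, in row-major order.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u holds column indices, v holds row indices, both of shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


if __name__ == "__main__":
    disp = np.array([[0.0, 2.0], [4.0, 8.0]])
    depth = disparity_to_depth(disp, fx=4.0, baseline=0.5)
    print(depth_to_points(depth, fx=4.0, fy=4.0, cx=0.5, cy=0.5))
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a depth error that grows with distance, which is one reason stereo-derived point clouds tend to be less accurate than LiDAR for far objects.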