DSGN: Deep Stereo Geometry Network for 3D Object Detection
- URL: http://arxiv.org/abs/2001.03398v3
- Date: Wed, 8 Apr 2020 03:30:28 GMT
- Title: DSGN: Deep Stereo Geometry Network for 3D Object Detection
- Authors: Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia
- Abstract summary: There is a large performance gap between image-based and LiDAR-based 3D object detectors.
Our method, called Deep Stereo Geometry Network (DSGN), significantly reduces this gap.
For the first time, we provide a simple and effective one-stage stereo-based 3D detection pipeline.
- Score: 79.16397166985706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most state-of-the-art 3D object detectors heavily rely on LiDAR sensors
because there is a large performance gap between image-based and LiDAR-based
methods. This gap stems from how the representation for prediction is formed in
3D scenarios. Our method, called Deep Stereo Geometry Network (DSGN),
significantly reduces this gap by detecting 3D objects on a differentiable
volumetric representation -- 3D geometric volume, which effectively encodes 3D
geometric structure for 3D regular space. With this representation, we learn
depth information and semantic cues simultaneously. For the first time, we
provide a simple and effective one-stage stereo-based 3D detection pipeline
that jointly estimates the depth and detects 3D objects in an end-to-end
learning manner. Our approach outperforms previous stereo-based 3D detectors
(about 10 higher in terms of AP) and even achieves comparable performance with
several LiDAR-based methods on the KITTI 3D object detection leaderboard. Our
code is publicly available at https://github.com/chenyilun95/DSGN.
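As background for the depth estimation mentioned in the abstract, the following is a minimal sketch of the standard pinhole stereo relations: depth from disparity on a rectified pair, and back-projection of a pixel into camera-frame 3D coordinates. It is not taken from the DSGN code base. The function names and all numeric values (focal length, baseline, principal point, disparity) are illustrative assumptions.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair (standard relation)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def backproject(u, v, depth_m, focal_px, cx, cy):
    """Lift pixel (u, v) with known depth to camera-frame (X, Y, Z)."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)


# Hypothetical camera parameters, chosen only for illustration.
FOCAL_PX = 720.0
BASELINE_M = 0.5

depth = disparity_to_depth(36.0, FOCAL_PX, BASELINE_M)
point = backproject(700.0, 200.0, depth, FOCAL_PX, cx=600.0, cy=180.0)
print(depth, point)
```

Because depth is inversely proportional to disparity, a fixed disparity error translates into a depth error that grows with distance, which is one reason representations built in regular 3D space are of interest for stereo detection.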
Related papers
- 3D Small Object Detection with Dynamic Spatial Pruning [62.72638845817799]
We propose an efficient feature pruning strategy for 3D small object detection.
We present a multi-level 3D detector named DSPDet3D which benefits from high spatial resolution.
It takes less than 2 s to directly process a whole building of more than 4500k points while detecting almost all objects.
arXiv Detail & Related papers (2023-05-05T17:57:04Z)
- DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors [60.88824519770208]
Camera-based 3D object detectors are welcome due to their wider deployment and lower price than LiDAR sensors.
We revisit the prior stereo model DSGN and its stereo volume constructions for representing both 3D geometry and semantics.
We propose DSGN++, which aims to improve information flow throughout the 2D-to-3D pipeline.
arXiv Detail & Related papers (2022-04-06T18:43:54Z)
- EGFN: Efficient Geometry Feature Network for Fast Stereo 3D Object Detection [51.52496693690059]
Fast stereo based 3D object detectors lag far behind high-precision oriented methods in accuracy.
We argue that the main reason is the missing or poor 3D geometry feature representation in fast stereo based methods.
The proposed EGFN outperforms YOLOStereo3D, the advanced fast method, by 5.16% on mAP$_{3d}$ at a cost of merely 12 ms of additional runtime.
arXiv Detail & Related papers (2021-11-28T05:25:36Z)
- Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image [22.037472446683765]
We learn a regular grid of 3D voxel features from the input image which is aligned with 3D scene space via a 3D feature lifting operator.
Based on the 3D voxel features, our novel CenterNet-3D detection head formulates the 3D detection as keypoint detection in the 3D space.
We devise an efficient coarse-to-fine reconstruction module, including coarse-level voxelization and a novel local PCA-SDF shape representation.
arXiv Detail & Related papers (2021-11-04T18:30:37Z)
- LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector [80.7563981951707]
We propose LIGA-Stereo to learn stereo-based 3D detectors under the guidance of high-level geometry-aware representations of LiDAR-based detection models.
Compared with the state-of-the-art stereo detector, our method has improved the 3D detection performance of cars, pedestrians, cyclists by 10.44%, 5.69%, 5.97% mAP respectively.
arXiv Detail & Related papers (2021-08-18T17:24:40Z)
- YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection [6.5702792909006735]
YOLOStereo3D is trained on one single GPU and runs at more than ten fps.
It demonstrates performance comparable to state-of-the-art stereo 3D detection frameworks without usage of LiDAR data.
arXiv Detail & Related papers (2021-03-17T03:43:54Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first depth estimation is performed, a pseudo LiDAR point cloud representation is computed from the depth estimates, and then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving [3.222802562733787]
We propose an efficient and accurate 3D object detection method from stereo images, named RTS3D.
Experiments on the KITTI benchmark show that RTS3D is the first true real-time system for stereo image 3D detection.
arXiv Detail & Related papers (2020-12-30T07:56:37Z)