Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection
- URL: http://arxiv.org/abs/2106.10013v3
- Date: Tue, 22 Jun 2021 03:35:10 GMT
- Title: Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection
- Authors: Aqi Gao, Jiale Cao, Yanwei Pang
- Abstract summary: The recently introduced RTS3D builds an efficient 4D Feature-Consistency Embedding space for the intermediate representation of objects without depth supervision.
We propose a shape prior non-uniform sampling strategy that performs dense sampling in the outer region and sparse sampling in the inner region.
Our proposed method achieves a 2.57% improvement in AP3d with almost no extra network parameters.
- Score: 59.765645791588454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pseudo-LiDAR based 3D object detectors have gained popularity due to their
high accuracy. However, these methods need dense depth supervision and suffer
from inferior speed. To solve these two issues, the recently introduced RTS3D
builds an efficient 4D Feature-Consistency Embedding (FCE) space for the
intermediate representation of objects without depth supervision. The FCE space
splits the entire object region into a 3D uniform grid latent space for feature
sampling point generation, which ignores the varying importance of different
object regions. We argue that, compared with the inner region, the outer region
plays a more important role in accurate 3D detection. To encode more
information from the outer region, we propose a shape prior non-uniform
sampling strategy that performs dense sampling in the outer region and sparse
sampling in the inner region. As a result, more points are sampled from the
outer region and more useful features are extracted for 3D detection. Further,
to enhance the feature discrimination of each sampling point, we propose a
high-level semantic enhanced FCE module to exploit more contextual information
and better suppress noise. Experiments on the KITTI dataset show the
effectiveness of the proposed method. Compared with the baseline RTS3D, our
method achieves a 2.57% improvement in AP3d with almost no extra network
parameters. Moreover, our method outperforms the state-of-the-art methods
without extra supervision at real-time speed.
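The dense-outer/sparse-inner idea can be sketched with a simple axis-warping function. This is a minimal illustration, not the actual RTS3D formulation: the function names, the `gamma` parameter, and the power-law warp are assumptions chosen to make the density contrast concrete.

```python
import numpy as np

def non_uniform_axis(n, gamma=0.5):
    """Warp n uniform coordinates in [-1, 1] toward the boundary.

    Applying |u|**gamma with gamma < 1 pushes points outward, so the
    spacing shrinks near +/-1 (dense outer region) and grows near 0
    (sparse inner region). gamma = 1 recovers the uniform grid.
    """
    u = np.linspace(-1.0, 1.0, n)
    return np.sign(u) * np.abs(u) ** gamma

def sampling_grid(n=8, gamma=0.5):
    """Build an (n, n, n, 3) grid of 3D sampling points in the unit cube."""
    x = non_uniform_axis(n, gamma)
    return np.stack(np.meshgrid(x, x, x, indexing="ij"), axis=-1)

grid = sampling_grid()
# Gaps between consecutive samples along one axis: widest at the center,
# narrowest at the boundary, i.e. more points land in the outer region.
gaps = np.diff(non_uniform_axis(8))
```

In a real pipeline, points like these would be placed inside each candidate object's 3D box before querying the feature-consistency embedding; the warp exponent controls how strongly sampling is biased toward the object surface.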
Related papers
- AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding [16.03214439663472]
This paper presents an advanced sampler that achieves both high accuracy and efficiency.
We propose a Voxel Adaptation Module that adaptively adjusts voxel sizes with the reference of point-based downsampling ratio.
Compared to existing state-of-the-art methods, our approach achieves better accuracy on outdoor and indoor large-scale datasets.
arXiv Detail & Related papers (2024-02-27T14:05:05Z)
- 3D Small Object Detection with Dynamic Spatial Pruning [62.72638845817799]
We propose an efficient feature pruning strategy for 3D small object detection.
We present a multi-level 3D detector named DSPDet3D which benefits from high spatial resolution.
It takes less than 2 s to directly process a whole building consisting of more than 4500k points while detecting almost all objects.
arXiv Detail & Related papers (2023-05-05T17:57:04Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- EGFN: Efficient Geometry Feature Network for Fast Stereo 3D Object Detection [51.52496693690059]
Fast stereo-based 3D object detectors lag far behind high-precision-oriented methods in accuracy.
We argue that the main reason is the missing or poor 3D geometry feature representation in fast stereo-based methods.
The proposed EGFN outperforms YOLOStereo3D, the advanced fast method, by 5.16% on mAP$_{3d}$ at the cost of merely an additional 12 ms.
arXiv Detail & Related papers (2021-11-28T05:25:36Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then, object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving [3.222802562733787]
We propose an efficient and accurate 3D object detection method from stereo images, named RTS3D.
Experiments on the KITTI benchmark show that RTS3D is the first true real-time system for stereo image 3D detection.
arXiv Detail & Related papers (2020-12-30T07:56:37Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, changing only one 3D parameter in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection has become an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- Boundary-Aware Dense Feature Indicator for Single-Stage 3D Object Detection from Point Clouds [32.916690488130506]
We propose a universal module that helps 3D detectors focus on the densest region of the point clouds in a boundary-aware manner.
Experiments on the KITTI dataset show that the proposed module, DENFI, improves the performance of the baseline single-stage detector remarkably.
arXiv Detail & Related papers (2020-04-01T01:21:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.