Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection
- URL: http://arxiv.org/abs/2005.09927v3
- Date: Fri, 22 Jan 2021 14:52:57 GMT
- Title: Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection
- Authors: Alex Bewley, Pei Sun, Thomas Mensink, Dragomir Anguelov, Cristian
Sminchisescu
- Abstract summary: This paper presents a novel 3D object detection framework that processes LiDAR data directly on its native representation: range images.
Benefiting from the compactness of range images, 2D convolutions can efficiently process dense LiDAR data of a scene.
- Score: 41.59388513615775
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents a novel 3D object detection framework that processes
LiDAR data directly on its native representation: range images. Benefiting from
the compactness of range images, 2D convolutions can efficiently process dense
LiDAR data of a scene. To overcome scale sensitivity in this perspective view,
a novel range-conditioned dilation (RCD) layer is proposed to dynamically
adjust a continuous dilation rate as a function of the measured range.
Furthermore, localized soft range gating combined with a 3D box-refinement
stage improves robustness in occluded areas, and produces overall more accurate
bounding box predictions. On the public large-scale Waymo Open Dataset, our
method sets a new baseline for range-based 3D detection, outperforming
multiview and voxel-based methods over all ranges, with unparalleled performance in
long-range detection.
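To make the range-conditioned dilation idea above concrete, here is a minimal sketch that computes a per-pixel dilation rate growing with the measured range and gathers a dilated neighborhood from a range image. The linear parameterization, the anchor_range/min_rate/max_rate defaults, and the nearest-integer gather are assumptions for this toy example only; the RCD layer described in the paper learns a continuous dilation rate inside a differentiable layer.

```python
import numpy as np

def range_conditioned_dilation_rate(range_img, anchor_range=20.0,
                                    min_rate=1.0, max_rate=8.0):
    """Per-pixel dilation rate that grows with measured range.

    Hypothetical parameterization: the rate scales linearly with range
    relative to a reference "anchor" range, so far-away (small-looking)
    objects are covered by a wider neighborhood.
    """
    rate = range_img / anchor_range
    return np.clip(rate, min_rate, max_rate)

def dilated_neighborhood(feature_map, v, u, rate, kernel=3):
    """Gather a kernel x kernel neighborhood around pixel (v, u) using the
    rounded dilation rate, clamping indices at the image border."""
    h, w = feature_map.shape
    step = int(round(rate))
    offsets = np.arange(-(kernel // 2), kernel // 2 + 1) * step
    vs = np.clip(v + offsets, 0, h - 1)
    us = np.clip(u + offsets, 0, w - 1)
    return feature_map[np.ix_(vs, us)]

# Toy usage: a 64 x 2650 range image with ranges between 2 m and 75 m.
rng = np.random.default_rng(0)
range_img = rng.uniform(2.0, 75.0, size=(64, 2650))
rates = range_conditioned_dilation_rate(range_img)
patch = dilated_neighborhood(range_img, v=32, u=1000, rate=rates[32, 1000])
print(rates.min(), rates.max(), patch.shape)  # rate grows with range; patch is 3 x 3
```

The intuition behind scaling the rate with range is that an object twice as far away appears roughly half as large in the perspective range image, so widening the sampling neighborhood proportionally keeps the effective receptive field approximately constant in 3D.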
Related papers
- What Matters in Range View 3D Object Detection [15.147558647138629]
Lidar-based perception pipelines rely on 3D object detection models to interpret complex scenes.
We achieve state-of-the-art performance among range-view 3D object detection models without using multiple techniques proposed in past range-view literature.
arXiv Detail & Related papers (2024-07-23T18:42:37Z)
- Uplifting Range-View-based 3D Semantic Segmentation in Real-Time with Multi-Sensor Fusion [18.431017678057348]
Range-View (RV)-based 3D point cloud segmentation is widely adopted due to its compact data form.
However, RV-based methods fall short in providing robust segmentation for the occluded points.
We propose a new LiDAR and camera range-view-based 3D point cloud semantic segmentation method (LaCRange).
In addition to being real-time, the proposed method achieves state-of-the-art results on the nuScenes benchmark.
arXiv Detail & Related papers (2024-07-12T21:41:57Z)
- Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene [22.297964850282177]
We propose LiDAR-2D Self-paced Learning (LiSe) for unsupervised 3D detection.
RGB images serve as a valuable complement to LiDAR data, offering precise 2D localization cues.
Our framework devises a self-paced learning pipeline that incorporates adaptive sampling and weak model aggregation strategies.
arXiv Detail & Related papers (2024-07-11T14:58:49Z)
- DiffuBox: Refining 3D Object Detection with Point Diffusion [74.01759893280774]
We introduce a novel diffusion-based box refinement approach to ensure robust 3D object detection and localization.
We evaluate this approach under various domain adaptation settings, and our results reveal significant improvements across different datasets.
arXiv Detail & Related papers (2024-05-25T03:14:55Z)
- Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments [67.83787474506073]
We tackle the limitations of current LiDAR-based 3D object detection systems.
We introduce a universal Find n' Propagate approach for 3D open-vocabulary (OV) tasks.
We achieve up to a 3.97-fold increase in Average Precision (AP) for novel object classes.
arXiv Detail & Related papers (2024-03-20T12:51:30Z)
- 3DifFusionDet: Diffusion Model for 3D Object Detection with Robust LiDAR-Camera Fusion [6.914463996768285]
3DifFusionDet structures 3D object detection as a denoising diffusion process from noisy 3D boxes to target boxes.
Under the feature alignment strategy, the progressive refinement method can make a significant contribution to robust LiDAR-camera fusion.
Experiments on KITTI, a benchmark for real-world traffic object identification, show that 3DifFusionDet performs favorably compared to earlier, well-regarded detectors.
arXiv Detail & Related papers (2023-11-07T05:53:09Z)
- Improving LiDAR 3D Object Detection via Range-based Point Cloud Density Optimization [13.727464375608765]
Existing 3D object detectors tend to perform better on point cloud regions close to the LiDAR sensor than on regions farther away.
We observe a learning bias in detection models towards the dense objects near the sensor, and show that detection performance can be improved by simply manipulating the input point cloud density at different distance ranges (a toy sketch of such range-based rebalancing follows after this list).
arXiv Detail & Related papers (2023-06-09T04:11:43Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
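Referring back to the range-based point cloud density optimization entry above, the sketch below rebalances point density by distance: it drops a fraction of near-range points and duplicates far-range ones. The thresholds, ratios, and the duplication strategy are hypothetical defaults for illustration, not the method from that paper.

```python
import numpy as np

def rebalance_by_range(points, near=20.0, far=40.0,
                       keep_near=0.5, repeat_far=2, seed=0):
    """Crude range-based density manipulation: randomly drop a fraction of
    near points and duplicate far points so density is less skewed toward
    the sensor. All thresholds and ratios here are made-up defaults.

    points: (N, 3+) array whose first three columns are x, y, z.
    """
    rng = np.random.default_rng(seed)
    r = np.linalg.norm(points[:, :3], axis=1)

    near_pts = points[r < near]
    mid_pts = points[(r >= near) & (r < far)]
    far_pts = points[r >= far]

    # Subsample the dense near-range region.
    keep = rng.random(len(near_pts)) < keep_near
    near_pts = near_pts[keep]

    # Naively duplicate sparse far-range points (a real method might
    # jitter or interpolate them instead).
    far_pts = np.repeat(far_pts, repeat_far, axis=0)

    return np.concatenate([near_pts, mid_pts, far_pts], axis=0)

# Toy usage with a random cloud of 10k points inside an 80 m cube.
cloud = np.random.default_rng(1).uniform(-80, 80, size=(10000, 3))
print(rebalance_by_range(cloud).shape)
```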
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.