3D-FFS: Faster 3D object detection with Focused Frustum Search in sensor
fusion based networks
- URL: http://arxiv.org/abs/2103.08294v1
- Date: Mon, 15 Mar 2021 11:32:21 GMT
- Authors: Aniruddha Ganguly, Tasin Ishmam, Khandker Aftarul Islam, Md Zahidur
Rahman and Md. Shamsuzzoha Bayzid
- Abstract summary: We propose 3D-FFS, a novel approach to make sensor fusion based 3D object detection networks significantly faster.
3D-FFS can substantially constrain the 3D search space and thereby significantly reduce training time, inference time and memory consumption.
Compared to F-ConvNet, we achieve improvements in training and inference times by up to 62.84% and 56.46%, respectively, while reducing the memory usage by up to 58.53%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work we propose 3D-FFS, a novel approach to make sensor fusion based
3D object detection networks significantly faster using a class of
computationally inexpensive heuristics. Existing sensor fusion based networks
generate 3D region proposals by leveraging inferences from 2D object detectors.
However, as images have no depth information, these networks rely on extracting
semantic features of points from the entire scene to locate the object. By
leveraging aggregated intrinsic properties (e.g. point density) of the 3D point
cloud data, 3D-FFS can substantially constrain the 3D search space and thereby
significantly reduce training time, inference time and memory consumption
without sacrificing accuracy. To demonstrate the efficacy of 3D-FFS, we have
integrated it with Frustum ConvNet (F-ConvNet), a prominent sensor fusion based
3D object detection model. We assess the performance of 3D-FFS on the KITTI
dataset. Compared to F-ConvNet, we achieve improvements in training and
inference times by up to 62.84% and 56.46%, respectively, while reducing the
memory usage by up to 58.53%. Additionally, we achieve 0.59%, 2.03% and 3.34%
improvements in accuracy for the Car, Pedestrian and Cyclist classes,
respectively. 3D-FFS holds considerable promise for compute-constrained domains
such as autonomous vehicles, drones and robotics, where LiDAR-Camera based
sensor fusion perception systems are widely used.
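To make the heuristic concrete, the sketch below crops a frustum's point cloud to its densest contiguous depth band before any network sees it. The function name, its parameters (bin_size, keep_ratio) and the windowing scheme are illustrative assumptions, not the exact criteria 3D-FFS uses.

```python
import numpy as np

def focus_frustum(points, bin_size=0.5, keep_ratio=0.9):
    """Crop a frustum point cloud to its densest contiguous depth band.

    points:     (N, 3) LiDAR points inside one 2D-detection frustum, in the
                camera frame (z = depth along the viewing direction), N > 0.
    bin_size:   depth histogram resolution in metres (assumed value).
    keep_ratio: fraction of the frustum's points the band must retain.

    Illustrative heuristic only; the actual 3D-FFS criteria may differ.
    """
    depths = points[:, 2]
    n_bins = max(int(np.ptp(depths) / bin_size), 1)
    counts, edges = np.histogram(depths, bins=n_bins)

    # Start at the densest bin and grow the window toward the denser
    # neighbour until it holds keep_ratio of all points in the frustum.
    lo = hi = int(np.argmax(counts))
    while counts[lo:hi + 1].sum() < keep_ratio * counts.sum():
        if lo > 0 and (hi == len(counts) - 1 or counts[lo - 1] >= counts[hi + 1]):
            lo -= 1
        else:
            hi += 1

    keep = (depths >= edges[lo]) & (depths <= edges[hi + 1])
    return points[keep]
```

Because every downstream layer then operates on a fraction of the original frustum, training time, inference time and memory all shrink, which is the effect the abstract quantifies.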
Related papers
- HEDNet: A Hierarchical Encoder-Decoder Network for 3D Object Detection in Point Clouds (arXiv, 2023-10-31)
3D object detection in point clouds is important for autonomous driving systems.
A primary challenge in 3D object detection stems from the sparse distribution of points within the 3D scene.
We propose HEDNet, a hierarchical encoder-decoder network for 3D object detection.
- Multi-Sem Fusion: Multimodal Semantic Fusion for 3D Object Detection (arXiv, 2022-12-10)
LiDAR and camera fusion techniques are promising for achieving 3D object detection in autonomous driving.
Most multi-modal 3D object detection frameworks integrate semantic knowledge from 2D images into 3D LiDAR point clouds.
We propose Multi-Sem Fusion (MSF), a general multi-modal fusion framework that fuses semantic information from both 2D image and 3D point cloud scene parsing results.
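As a rough, hypothetical sketch of this style of fusion (MSF's actual fusion module is learned; the simple per-point concatenation below is only for illustration):

```python
import numpy as np

def paint_points(points, img_scores, pc_scores, P):
    """Decorate each LiDAR point with 2D and 3D semantic parsing scores.

    points:     (N, 3) points in the camera frame (z > 0 assumed).
    img_scores: (H, W, C) per-pixel class scores from a 2D scene parser.
    pc_scores:  (N, C) per-point class scores from a 3D scene parser.
    P:          (3, 4) camera projection matrix.
    Returns (N, 3 + 2C): xyz plus both score vectors; points projecting
    outside the image keep zero image scores. Illustrative only.
    """
    h, w, c = img_scores.shape
    homo = np.hstack([points, np.ones((len(points), 1))])
    uvw = homo @ P.T
    uv = np.round(uvw[:, :2] / uvw[:, 2:3]).astype(int)
    inside = (0 <= uv[:, 0]) & (uv[:, 0] < w) & (0 <= uv[:, 1]) & (uv[:, 1] < h)
    painted = np.zeros((len(points), c))
    painted[inside] = img_scores[uv[inside, 1], uv[inside, 0]]
    return np.hstack([points, painted, pc_scores])
```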
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network (arXiv, 2022-08-24)
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
- Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data (arXiv, 2022-03-30)
We propose a self-supervised pre-training method for 3D perception models tailored to autonomous driving data.
We leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups.
Our method requires neither point cloud nor image annotations.
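A minimal sketch of the idea, assuming points have already been projected into the image; the paper's actual objective is contrastive and more elaborate, so treat the helper below as illustrative only.

```python
import numpy as np

def pixel_to_point_distill_loss(point_feats, pixel_feats, uv):
    """Pull each 3D point feature toward the 2D feature at its projection.

    point_feats: (N, D) features from the 3D backbone being pre-trained.
    pixel_feats: (H, W, D) features from a frozen, pre-trained 2D backbone.
    uv:          (N, 2) integer pixel coordinates of each point's projection.
    Returns the mean (1 - cosine similarity) over all point/pixel pairs.
    Simplified per-point variant, not the paper's exact loss.
    """
    tgt = pixel_feats[uv[:, 1], uv[:, 0]]                 # (N, D) sampled 2D features
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    t = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(p * t, axis=1)))
```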
- VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion (arXiv, 2021-11-29)
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at 'virtual' points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st since May 21st, 2021.
- RoIFusion: 3D Object Detection from LiDAR and Vision (arXiv, 2020-09-09)
We propose a novel fusion algorithm that projects a set of 3D Regions of Interest (RoIs) from the point clouds onto the 2D RoIs of the corresponding images.
Our approach achieves state-of-the-art performance on the challenging KITTI 3D object detection benchmark.
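For illustration, a minimal sketch of projecting one 3D RoI into the image with a calibrated camera; the function name and the corner-based box reduction are assumptions, not the paper's exact pooling scheme.

```python
import numpy as np

def project_roi_3d_to_2d(corners_3d, P):
    """Project the 8 corners of a 3D RoI into the image to get its 2D RoI.

    corners_3d: (8, 3) box corners in the camera frame (z > 0 assumed).
    P:          (3, 4) camera projection matrix (e.g. P2 in KITTI calibration).
    Returns (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    homo = np.hstack([corners_3d, np.ones((8, 1))])   # (8, 4) homogeneous coords
    uvw = homo @ P.T                                  # (8, 3) projected corners
    uv = uvw[:, :2] / uvw[:, 2:3]                     # perspective divide
    return uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()
```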
- End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection (arXiv, 2020-04-07)
Pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras.
PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs.
We introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end.
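The representation change at the heart of PL is the classic pinhole back-projection; a minimal, self-contained sketch (interface names are illustrative):

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project a dense depth map (H, W) into a pseudo-LiDAR point cloud.

    fx, fy, cx, cy: camera intrinsics (focal lengths and principal point,
    in pixels). Returns an (H*W, 3) array of points in the camera frame:
    x right, y down, z forward.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx  # invert the pinhole model: u = fx * x / z + cx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

The CoR modules make this step differentiable, so gradients from the 3D detector can flow back into the depth estimator.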
- Boundary-Aware Dense Feature Indicator for Single-Stage 3D Object Detection from Point Clouds (arXiv, 2020-04-01)
We propose DENFI, a universal module that helps 3D detectors focus on the densest region of the point clouds in a boundary-aware manner.
Experiments on the KITTI dataset show that DENFI considerably improves the performance of a baseline single-stage detector.
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features (arXiv, 2020-03-06)
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection (arXiv, 2020-03-01)
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module: adaptive zooming.