Multi-Task Multi-Sensor Fusion for 3D Object Detection
- URL: http://arxiv.org/abs/2012.12397v1
- Date: Tue, 22 Dec 2020 22:49:15 GMT
- Title: Multi-Task Multi-Sensor Fusion for 3D Object Detection
- Authors: Ming Liang, Bin Yang, Yun Chen, Rui Hu, Raquel Urtasun
- Abstract summary: We present an end-to-end learnable architecture that reasons about 2D and 3D object detection as well as ground estimation and depth completion.
Our experiments show that all these tasks are complementary and help the network learn better representations by fusing information at various levels.
- Score: 93.68864606959251
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper we propose to exploit multiple related tasks for accurate
multi-sensor 3D object detection. Towards this goal we present an end-to-end
learnable architecture that reasons about 2D and 3D object detection as well as
ground estimation and depth completion. Our experiments show that all these
tasks are complementary and help the network learn better representations by
fusing information at various levels. Importantly, our approach leads the KITTI
benchmark on 2D, 3D and BEV object detection, while being real time.
Related papers
- OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection [102.0744303467713]
We propose a new multi-view 3D object detector named OPEN.
Our main idea is to effectively inject object-wise depth information into the network through our proposed object-wise position embedding.
OPEN achieves a new state-of-the-art performance with 64.4% NDS and 56.7% mAP on the nuScenes test benchmark.
arXiv Detail & Related papers (2024-07-15T14:29:15Z)
- Perspective-aware Convolution for Monocular 3D Object Detection [2.33877878310217]
We propose a novel perspective-aware convolutional layer that captures long-range dependencies in images.
By enforcing convolutional kernels to extract features along the depth axis of every image pixel, we incorporate perspective information into the network architecture.
We demonstrate improved performance on the KITTI3D dataset, achieving a 23.9% average precision in the easy benchmark.
arXiv Detail & Related papers (2023-08-24T17:25:36Z)
- AOP-Net: All-in-One Perception Network for Joint LiDAR-based 3D Object Detection and Panoptic Segmentation [9.513467995188634]
AOP-Net is a LiDAR-based multi-task framework that combines 3D object detection and panoptic segmentation.
AOP-Net achieves state-of-the-art performance among published works on the nuScenes benchmark for both 3D object detection and panoptic segmentation.
arXiv Detail & Related papers (2023-02-02T05:31:53Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z)
- Deep Continuous Fusion for Multi-Sensor 3D Object Detection [103.5060007382646]
We propose a novel 3D object detector that exploits both LIDAR and cameras to perform very accurate localization.
We design an end-to-end learnable architecture that exploits continuous convolutions to fuse image and LIDAR feature maps at different levels of resolution.
arXiv Detail & Related papers (2020-12-20T18:43:41Z)
- Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.