PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection
- URL: http://arxiv.org/abs/2211.08055v1
- Date: Tue, 15 Nov 2022 11:15:25 GMT
- Title: PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection
- Authors: Hao Liu, Zhuoran Xu, Dan Wang, Baofeng Zhang, Guan Wang, Bo Dong, Xin
Wen, and Xinyu Xu
- Abstract summary: Painting Adaptive Instance-prior for 3D object detection (PAI3D) is a sequential instance-level fusion framework.
It first extracts instance-level semantic information from images.
Extracted information, including objects categorical label, point-to-object membership and object position, are then used to augment each LiDAR point in the subsequent 3D detection network.
- Score: 22.41785292720421
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D object detection is a critical task in autonomous driving. Recently
multi-modal fusion-based 3D object detection methods, which combine the
complementary advantages of LiDAR and camera, have shown great performance
improvements over mono-modal methods. However, so far, no methods have
attempted to utilize the instance-level contextual image semantics to guide the
3D object detection. In this paper, we propose a simple and effective Painting
Adaptive Instance-prior for 3D object detection (PAI3D) to fuse instance-level
image semantics flexibly with point cloud features. PAI3D is a multi-modal
sequential instance-level fusion framework. It first extracts instance-level
semantic information from images, the extracted information, including objects
categorical label, point-to-object membership and object position, are then
used to augment each LiDAR point in the subsequent 3D detection network to
guide and improve detection performance. PAI3D outperforms the state-of-the-art
with a large margin on the nuScenes dataset, achieving 71.4 in mAP and 74.2 in
NDS on the test split. Our comprehensive experiments show that instance-level
image semantics contribute the most to the performance gain, and PAI3D works
well with any good-quality instance segmentation models and any modern point
cloud 3D encoders, making it a strong candidate for deployment on autonomous
vehicles.
Related papers
- VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics.
In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
arXiv Detail & Related papers (2024-04-15T03:12:12Z) - ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance [76.7746870349809]
We present ComboVerse, a 3D generation framework that produces high-quality 3D assets with complex compositions by learning to combine multiple models.
Our proposed framework emphasizes spatial alignment of objects, compared with standard score distillation sampling.
arXiv Detail & Related papers (2024-03-19T03:39:43Z) - 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z) - SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhance the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP)
This indicates that the combination of 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby aiding build more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z) - Multi-Sem Fusion: Multimodal Semantic Fusion for 3D Object Detection [11.575945934519442]
LiDAR and camera fusion techniques are promising for achieving 3D object detection in autonomous driving.
Most multi-modal 3D object detection frameworks integrate semantic knowledge from 2D images into 3D LiDAR point clouds.
We propose a general multi-modal fusion framework Multi-Sem Fusion (MSF) to fuse the semantic information from both the 2D image and 3D points scene parsing results.
arXiv Detail & Related papers (2022-12-10T10:54:41Z) - Paint and Distill: Boosting 3D Object Detection with Semantic Passing
Network [70.53093934205057]
3D object detection task from lidar or camera sensors is essential for autonomous driving.
We propose a novel semantic passing framework, named SPNet, to boost the performance of existing lidar-based 3D detection models.
arXiv Detail & Related papers (2022-07-12T12:35:34Z) - Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data [80.14669385741202]
We propose a self-supervised pre-training method for 3D perception models tailored to autonomous driving data.
We leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups.
Our method does not require any point cloud nor image annotations.
arXiv Detail & Related papers (2022-03-30T12:40:30Z) - FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object
Detection [15.641616738865276]
We propose a general multimodal fusion framework FusionPainting to fuse the 2D RGB image and 3D point clouds at a semantic level for boosting the 3D object detection task.
Especially, the FusionPainting framework consists of three main modules: a multi-modal semantic segmentation module, an adaptive attention-based semantic fusion module, and a 3D object detector.
The effectiveness of the proposed framework has been verified on the large-scale nuScenes detection benchmark.
arXiv Detail & Related papers (2021-06-23T14:53:22Z) - IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a
Single Image [37.83574424518901]
3D object detection from a single image is an important task in Autonomous Driving.
We propose an instance-aware approach to aggregate useful information for improving the accuracy of 3D object detection.
arXiv Detail & Related papers (2021-03-05T05:47:52Z) - PerMO: Perceiving More at Once from a Single Image for Autonomous
Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.